# Design and Performance Analysis of Low Power Multipliers

T. Loganayaki

Department of ECE, Coimbatore Institute of Technology, Coimbatore, India

## ABSTRACT

Hardware implementation of image processing algorithms is becoming the need of the day owing to advances in handheld devices, medical imaging systems, and similar applications. Power management is a major concern in these applications, and the multiplier is one of the most effective places to address it. This paper presents a new inexact 4-2 compressor for use in a multiplier. The design improves multiplier features such as power and transistor count. Two different multipliers utilizing the inexact 4-2 compressor are proposed and analyzed for an unsigned Dadda multiplier. Extensive simulation results are evaluated, with image processing as the application of the inexact compressor. The results show that, compared with the exact 4-2 compressor, the proposed design achieves good accuracy together with a major reduction in power and number of gates. The proposed multiplier yields an excellent PSNR value for image blending.

### **Keywords**

Unsigned Dadda multiplier, inexact compressor.

## **1. INTRODUCTION**

The semiconductor industry has witnessed explosive growth of multimedia-based applications in mobile electronics. It is therefore essential to investigate well-engineered technologies that meet the challenging criteria of low power, small area and high speed in VLSI design. Most multimedia applications contain DSP blocks, for which exact models and algorithms improve performance but are not efficient at reducing cost and complexity. Multimedia and image processing can tolerate errors and ambiguity in computation and still generate significant and useful results. This relaxation provides some freedom to perform inexact computation. Multiplication is an important prerequisite for DSP, digital filters and similar blocks, which makes it a crucial consideration in low-power design. This can be achieved by increasing the speed of the multiplication operation.

## **2. RELATED WORK**

Multiplication is one of the most extensively used operations in computer arithmetic. It is obtained by repeated summation of partial products. For addition, different full-adder cells have been extensively analyzed for inexact computing [2]. The design of inexact multipliers, on the other hand, has received less attention. Using inexact adders to build an inexact multiplier is not viable, because it is very inefficient in terms of performance metrics. Compressors have been widely used [1] to speed up the partial product reduction phase and to decrease power consumption. Optimized designs of exact 4-2 compressors have been proposed in [1, 5-8]. Few designs address compressors for inexact computation; inexact designs have been proposed in [9], yet these do not target multiplication. This paper proposes the design and analysis of an inexact 4-2 compressor.

S. Umamaheswari, PhD

Department of ECE, Coimbatore Institute of Technology, Coimbatore, India

First, inexact 4-2 compressors are proposed and analyzed. It is shown that these compressors have better delay and power consumption than the exact 4-2 compressor found in the technical literature [8]. These approximate compressors are then used in the Dadda multiplier, and two different multipliers are proposed for inexact multiplication. Extensive simulation results are provided for the image processing application. The results for two images are reported; they show that the inexact multipliers have lower power consumption and a reduced number of gates. They also achieve excellent Peak Signal-to-Noise Ratio (PSNR) values and close resemblance to the image generated by an exact multiplier. The performance analysis and simulation results show that the proposed inexact designs are feasible for inexact computing.

## **3. EXACT 4-2 COMPRESSOR**

The main goal of a compressor is to speed up parallel multiplication. Compressor logic is based on the counter principle: it reduces n numbers to two. For this reason, n-2 compressors are widely used in computer arithmetic.



Figure 1. Implementation of the 4-2 compressor

The 4-2 compressor is a widely used building block for high-accuracy, high-speed multipliers. It has four inputs ($X_1$, $X_2$, $X_3$ and $X_4$) and two outputs (Sum and Carry), along with a carry-in bit (Cin) and a carry-out bit (Cout). The structure therefore compresses five partial-product bits into three. The carry bit received from the right side is denoted Cin, while the carry bit passed to the left side is denoted Cout. The output Cout is independent of the input Cin, which speeds up the carry-save summation of the partial products.

$$Sum = x_1 \oplus x_2 \oplus x_3 \oplus x_4 \oplus C_{in} \tag{1}$$

$$C_{out} = (x_1 \oplus x_2)x_3 + \overline{(x_1 \oplus x_2)}x_1 \tag{2}$$

$$Carry = (x_1 \oplus x_2 \oplus x_3 \oplus x_4)C_{in} + \overline{(x_1 \oplus x_2 \oplus x_3 \oplus x_4)}x_4 \tag{3}$$

In the structure of Figure 1, four of the inputs come from the same bit position of weight k, while one bit is fed from the adjacent position k-1 (the carry-in). The output of the 4-2 compressor consists of one bit at position k and two bits at position k+1.



Figure 2. Gate-level implementation of the exact compressor

This arrangement is called a compressor because it compresses four partial-product bits into two. A 4-2 compressor is realized by connecting two 3-2 compressors (full adders) in series.

TABLE I TRUTH TABLE OF 4-2 COMPRESSOR

| Cin | X4 | X3 | X2 | X1 | Cout | Carry | Sum |
|-----|----|----|----|----|------|-------|-----|
| 0       | 0                             | 0  | 0  | 0  | 0    | 0     | 0   |
| 0       | 0                             | 0  | 0  | 1  | 0    | 0     | 1   |
| 0       | 0                             | 0  | 1  | 0  | 0    | 0     | 1   |
| 0       | 0                             | 0  | 1  | 1  | 1    | 0     | 0   |
| 0       | 0                             | 1  | 0  | 0  | 0    | 0     | 1   |
| 0       | 0                             | 1  | 0  | 1  | 1    | 0     | 0   |
| 0       | 0                             | 1  | 1  | 0  | 1    | 0     | 0   |
| 0       | 0                             | 1  | 1  | 1  | 1    | 0     | 1   |
| 0       | 1                             | 0  | 0  | 0  | 0    | 0     | 1   |
| 0       | 1                             | 0  | 0  | 1  | 0    | 1     | 0   |
| 0       | 1                             | 0  | 1  | 0  | 0    | 1     | 0   |
| 0       | 1                             | 0  | 1  | 1  | 1    | 0     | 1   |
| 0       | 1                             | 1  | 0  | 0  | 0    | 1     | 0   |
| 0       | 1                             | 1  | 0  | 1  | 1    | 0     | 1   |
| 0       | 1                             | 1  | 1  | 0  | 1    | 0     | 1   |
| 0       | 1                             | 1  | 1  | 1  | 1    | 1     | 0   |
| 1       | 0                             | 0  | 0  | 0  | 0    | 0     | 1   |
| 1       | 0                             | 0  | 0  | 1  | 0    | 1     | 0   |
| 1       | 0                             | 0  | 1  | 0  | 0    | 1     | 0   |
| 1       | 0                             | 0  | 1  | 1  | 1    | 0     | 1   |
| 1       | 0                             | 1  | 0  | 0  | 0    | 1     | 0   |
| 1       | 0                             | 1  | 0  | 1  | 1    | 0     | 1   |
| 1       | 0                             | 1  | 1  | 0  | 1    | 0     | 1   |
| 1       | 0                             | 1  | 1  | 1  | 1    | 1     | 0   |
| 1       | 1                             | 0  | 0  | 0  | 0    | 1     | 0   |
| 1       | 1                             | 0  | 0  | 1  | 0    | 1     | 1   |
| 1       | 1                             | 0  | 1  | 0  | 0    | 1     | 1   |
| 1       | 1                             | 0  | 1  | 1  | 1    | 1     | 0   |
| 1       | 1                             | 1  | 0  | 0  | 0    | 1     | 1   |
| 1       | 1                             | 1  | 0  | 1  | 1    | 1     | 0   |
| 1       | 1                             | 1  | 1  | 0  | 1    | 1     | 0   |
| 1       | 1                             | 1  | 1  | 1  | 1    | 1     | 1   |

Equations (1)-(3) and Table I define the behaviour of the exact 4-2 compressor. Figure 2 shows the design of the 4-2 compressor based on XOR-XNOR gates [1].
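The exact compressor can be checked with a short behavioural model. The following is a minimal sketch in Python, not the gate-level design of Figure 2; the function names `exact_compressor` and `check_all_rows` are illustrative and do not appear in the paper. It evaluates Equations (1)-(3) and confirms that each of the 32 input rows preserves the arithmetic sum of the five input bits, with Sum at weight 1 and both Carry and Cout at weight 2.

```python
from itertools import product

def exact_compressor(x1, x2, x3, x4, cin):
    """Behavioural model of the exact 4-2 compressor; all arguments are 0 or 1."""
    x12 = x1 ^ x2
    x1234 = x12 ^ x3 ^ x4
    s = x1234 ^ cin                              # Eq. (1)
    cout = (x12 & x3) | ((1 - x12) & x1)         # Eq. (2)
    carry = (x1234 & cin) | ((1 - x1234) & x4)   # Eq. (3)
    return cout, carry, s

def check_all_rows():
    """Return True if every input row satisfies inputs == Sum + 2*(Carry + Cout)."""
    for cin, x4, x3, x2, x1 in product((0, 1), repeat=5):
        cout, carry, s = exact_compressor(x1, x2, x3, x4, cin)
        if x1 + x2 + x3 + x4 + cin != s + 2 * (carry + cout):
            return False
    return True
```

Calling `exact_compressor(1, 0, 0, 1, 0)` corresponds to the Table I row with Cin = 0, X4 = 1, X3 = 0, X2 = 0, X1 = 1.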

## **4. DADDA MULTIPLIER**

The 8×8 unsigned **Dadda multiplier** is a fast multiplier. The scheme applies to operands of any size and needs comparatively few gates. A Dadda multiplier works in three steps:

- Generate the partial products by multiplying each bit of one operand by each bit of the other, which yields $n^2$ partial-product bits.
- Reduce the partial products to two rows using full-adder and half-adder cells.
- Use a conventional adder to produce the final binary output.

Dadda multipliers need few reduction phases, but the two final numbers may be a few bits longer, so a larger final adder is needed. The reduction follows somewhat more complex rules than simple column compression. A new layer is added as long as some weight is carried by three or more wires. In each layer, any three bits of equal weight are fed into a full adder, which produces one output bit of the same weight and one of the next higher weight. If two bits of equal weight remain and the number of output bits of that weight calls for it, they are fed into a half adder. Otherwise, the bits are passed on to the next stage.
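The three steps above can be illustrated with a small behavioural sketch. It is a generic carry-save column compression that uses full adders only, written for clarity; it does not follow the exact Dadda stage schedule or the half-adder placement, and the function names are hypothetical. It shows that AND-gate partial products, once compressed to at most two bits per column, still add up to the product.

```python
def partial_product_columns(a, b, n=8):
    """Step 1: group the n*n AND-gate partial-product bits by column weight."""
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols

def reduce_to_two_rows(cols):
    """Step 2 (simplified): apply full adders until no column holds more than 2 bits."""
    cols = [list(c) for c in cols]
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, bits in enumerate(cols):
            while len(bits) >= 3:
                # Full adder: three bits of weight w -> sum (w) and carry (w + 1).
                x, y, z = bits.pop(), bits.pop(), bits.pop()
                nxt[w].append(x ^ y ^ z)
                nxt[w + 1].append((x & y) | (y & z) | (x & z))
            nxt[w].extend(bits)  # zero, one or two leftover bits pass through
        cols = nxt
    return cols

def column_value(cols):
    """Step 3 stand-in: add all remaining bits according to their weights."""
    return sum(bit << w for w, bits in enumerate(cols) for bit in bits)
```

For example, `column_value(reduce_to_two_rows(partial_product_columns(200, 123)))` evaluates the product of 200 and 123 through the compressed columns.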

## 5. PROPOSED INEXACT 4-2 COMPRESSOR

In this section, the method for designing the inexact compressor is discussed. One way to design an inexact 4-2 compressor is to replace the exact full-adder cells with inexact full-adder cells. This is not efficient, however, as it produces at least 17 erroneous results out of 32 possible outputs, i.e. the error rate of such a compressor is more than 53% (where the error rate is the ratio of the number of incorrect outputs to the total number of outputs). Since the exact 4-2 compressor is one of the most extensively used, it serves as the basis for the proposed inexact compressor. An inexact compressor design is proposed next to reduce the transistor count, the power consumption and the error distance.

## 5.1. Inexact Compressor Design

In this section, we discuss how different inexact designs can be derived with a smaller number of transistors. Since series-connected transistors add larger delay, eliminating some of them allows faster charging and discharging of the lumped capacitances. The elimination of transistors also results in lower power dissipation, and reduced area is an additional advantage. The truth table of the conventional (exact) compressor, which serves as the starting point, is given in Table I.

The Carry output of the exact compressor has the same value as the input Cin in 24 out of 32 states. An inexact design can exploit this characteristic. In Design 1, the carry is simplified to Cin by altering the value of the other 8 outputs.

$$Carry' = C_{in} \tag{4}$$

Since the Carry output has the higher weight, an erroneous Carry value changes the output value by two. For instance, for the input pattern "01001" (row 10 of Table I, with the bits ordered as Cin, X4, X3, X2, X1), the correct output (Cout, Carry, Sum) is "010", i.e. a value of 2.

By setting the Carry output to Cin, the inexact 4-2 compressor would produce "000" at the output (i.e. a value of 0). This difference may not be acceptable; however, it can be compensated by modifying the Cout and Sum signals. In particular, simplifying Sum to 0 whenever Cin is 1 (second half of Table II) reduces both the difference between the inexact and exact outputs and the complexity of the design. The presence of some erroneous values in Sum also results in a reduction of the overall delay.

$$Sum' = \overline{C_{in}} \left( \overline{(x_1 \oplus x_2)} + \overline{(x_3 \oplus x_4)} \right) \tag{5}$$
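No closed-form expression for the inexact carry-out Cout' appears in the text. As a worked check, assuming only the Cout' column of Table II, one expression consistent with all 32 rows is the following (it is inferred from the table, not taken from the original design description):

$$C_{out}' = (x_1 + x_2)(x_3 + x_4)$$

For example, the input X4X3X2X1 = 0011 gives (1)(0) = 0 and the input 0101 gives (1)(1) = 1, matching the corresponding rows of Table II.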



Figure 3. Gate-level implementation of the inexact compressor

The gate-level structure of the first proposed design (Figure 3) shows that the critical path delay remains the same. Even though the modifications of Carry and Sum described above increase the error rate of the proposed inexact compressor, the design complexity, and hence the power consumption and area, are significantly reduced. Table II shows the truth table of the first proposed design of the inexact 4-2 compressor.

TABLE II TRUTH TABLE OF INEXACT 4-2 COMPRESSOR

| Cin | X4 | X3 | X2 | X1 | Cout' | Carry' | Sum' | Difference |
|-----|----|----|----|----|-------|--------|------|------------|
| 0   | 0  | 0  | 0        | 0  | 0     | 0     | 1      | 1              |
| 0   | 0  | 0  | 0        | 1  | 0     | 0     | 1      | 0              |
| 0   | 0  | 0  | 1        | 0  | 0     | 0     | 1      | 0              |
| 0   | 0  | 0  | 1        | 1  | 0     | 0     | 1      | -1             |
| 0   | 0  | 1  | 0        | 0  | 0     | 0     | 1      | 0              |
| 0   | 0  | 1  | 0        | 1  | 1     | 0     | 0      | 0              |
| 0   | 0  | 1  | 1        | 0  | 1     | 0     | 0      | 0              |
| 0   | 0  | 1  | 1        | 1  | 1     | 0     | 1      | 0              |
| 0   | 1  | 0  | 0        | 0  | 0     | 0     | 1      | 0              |
| 0   | 1  | 0  | 0        | 1  | 1     | 0     | 0      | 0              |
| 0   | 1  | 0  | 1        | 0  | 1     | 0     | 0      | 0              |
| 0   | 1  | 0  | 1        | 1  | 1     | 0     | 1      | 0              |
| 0   | 1  | 1  | 0        | 0  | 0     | 0     | 1      | -1             |
| 0   | 1  | 1  | 0        | 1  | 1     | 0     | 1      | 0              |
| 0   | 1  | 1  | 1        | 0  | 1     | 0     | 1      | 0              |
| 0   | 1  | 1  | 1        | 1  | 1     | 0     | 1      | -1             |
| 1   | 0  | 0  | 0        | 0  | 0     | 1     | 0      | 1              |
| 1   | 0  | 0  | 0        | 1  | 0     | 1     | 0      | 0              |
| 1   | 0  | 0  | 1        | 0  | 0     | 1     | 0      | 0              |
| 1   | 0  | 0  | 1        | 1  | 0     | 1     | 0      | -1             |
| 1   | 0  | 1  | 0        | 0  | 0     | 1     | 0      | 0              |
| 1   | 0  | 1  | 0        | 1  | 1     | 1     | 0      | 1              |
| 1   | 0  | 1  | 1        | 0  | 1     | 1     | 0      | 1              |
| 1   | 0  | 1  | 1        | 1  | 1     | 1     | 0      | 0              |
| 1   | 1  | 0  | 0        | 0  | 0     | 1     | 0      | 0              |
| 1   | 1  | 0  | 0        | 1  | 1     | 1     | 0      | 1              |
| 1   | 1  | 0  | 1        | 0  | 1     | 1     | 0      | 1              |
| 1   | 1  | 0  | 1        | 1  | 1     | 1     | 0      | 0              |
| 1   | 1  | 1  | 0        | 0  | 0     | 1     | 0      | -1             |
| 1   | 1  | 1  | 0        | 1  | 1     | 1     | 0      | 0              |
| 1   | 1  | 1  | 1        | 0  | 1     | 1     | 0      | 0              |
| 1   | 1  | 1  | 1        | 1  | 1     | 1     | 0      | -1             |

As shown in Table II, the proposed inexact design has 12 incorrect outputs out of 32, an error rate of 37.5%. This is lower than the error rate obtained using the best inexact full-adder cell. The proposed design also reduces the total number of gates (transistor count) and the power consumption compared with the exact 4-2 compressor design [1].
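The Difference column and the count of 12 incorrect rows can be reproduced with a behavioural model. This is a sketch under stated assumptions: the carry follows Equation (4), the sum is written so that it reproduces the Sum' column of Table II, and the expression used for Cout' is inferred from the Cout' column of Table II rather than quoted from the text. The function names are illustrative.

```python
from itertools import product

def exact_value(x1, x2, x3, x4, cin):
    """An exact 4-2 compressor preserves the arithmetic sum of its five input bits."""
    return x1 + x2 + x3 + x4 + cin

def inexact_compressor(x1, x2, x3, x4, cin):
    """Behavioural model of the inexact compressor of Table II."""
    carry = cin                                            # Eq. (4)
    s = (1 - cin) & ((1 - (x1 ^ x2)) | (1 - (x3 ^ x4)))    # Sum' column of Table II
    cout = (x1 | x2) & (x3 | x4)                           # assumption: inferred from Table II
    return cout, carry, s

def difference_column():
    """Inexact minus exact output value, in the row order of Table II."""
    diffs = []
    for cin, x4, x3, x2, x1 in product((0, 1), repeat=5):
        cout, carry, s = inexact_compressor(x1, x2, x3, x4, cin)
        diffs.append(s + 2 * (carry + cout) - exact_value(x1, x2, x3, x4, cin))
    return diffs
```

Counting the non-zero entries of `difference_column()` gives the number of incorrect rows, and dividing by 32 gives the error rate of the compressor.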

## 5.2 New Metrics

In this section, new metrics are introduced for evaluating the reliability of an inexact multiplier. Consider as an example the case in which the exact output Sum of an adder is "101101" and other values can result as inexact outputs. Two inexact values can have very different implications when compared with the correct value: an output that differs only in the least significant bit is at a distance of 1 from the correct value, while "100101" is at a distance of 8. An output can therefore take erroneous values that differ considerably from the correct result, and an error in a lower bit has less impact on the output of an adder than an error in a higher bit.

- **Error distance (ED)** [9]: the arithmetic distance between the exact output and the inexact output. If A is the exact value and B is the inexact value, the error distance is given by

$$ED = |A - B| \tag{7}$$

- **Normalized error distance (NED)** [9]: the mean error distance divided by the maximum possible error distance.

$$NED = \frac{Mean(ED)}{Max(ED)} \tag{8}$$

- **Pass Rate:** the ratio of the number of correct outputs to the total number of outputs.
- **Error Rate:** the ratio of the number of incorrect outputs to the total number of outputs.
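The four metrics above can be computed together from a list of exact results and the corresponding inexact results. The sketch below is illustrative only: the function name `error_metrics` and the parameter `max_ed` (the maximum possible error distance, which the caller must supply for the circuit under test) are assumptions and do not come from the paper.

```python
def error_metrics(exact, approx, max_ed):
    """Compute NED, pass rate and error rate for paired exact/inexact outputs.

    exact, approx: equal-length sequences of integers.
    max_ed: maximum possible error distance (assumed to be known by the caller).
    """
    eds = [abs(a - b) for a, b in zip(exact, approx)]   # Eq. (7)
    mean_ed = sum(eds) / len(eds)
    passes = sum(1 for ed in eds if ed == 0)
    return {
        "ned": mean_ed / max_ed,                        # Eq. (8)
        "pass_rate": passes / len(eds),
        "error_rate": 1 - passes / len(eds),
    }
```

The pass rate and the error rate returned here are fractions; multiplying by 100 gives percentages.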

## 6. IMAGE MULTIPLICATION

In this section, the proposed 4-2 compressor is used for the multiplication of two images. A multiplication is performed in three steps [4]:

- Generation of the partial products.
- Reduction of the partial products to two rows using a Carry Save Adder (CSA) tree.
- A final addition with a Carry Propagation Adder (CPA), which produces the single-row final value.

The CSA tree generally plays a significant role in power consumption. 4-2 compressors have therefore been extensively used [3, 4] to speed up the CSA tree and decrease its power consumption. Using inexact 4-2 compressors in the CSA tree of a multiplier results in an inexact multiplier.

An inexact multiplier based on an $8\times 8$ unsigned Dadda multiplier is proposed for image processing applications. The proposed inexact multiplier works as follows:

Step 1: AND gates are used to generate all partial products.

Step 2: 4-2 compressors, full adders and half adders are used in the CSA tree to reduce the partial products to four rows. The same adders and 4-2 compressors are then used again to reduce the four rows to two rows.

Step 3: A CPA is used to calculate the final binary result.



Figure 4. Flow chart for image multiplication



Figure 5. Reduction circuitry of an 8×8 Dadda multiplier 1



Figure 6. Reduction circuitry of an 8×8 Dadda multiplier 2

Figure 5 shows the reduction circuitry of the multiplier design for n = 8. In this figure, each partial-product bit is represented by a dot, and half adders, full adders and 4-2 compressors are used. The reduction phase of an 8×8 Dadda multiplier requires 3 half adders, 3 full adders and 18 4-2 compressors. In this paper, two cases are considered for the inexact multiplier.

Case 1: The inexact design is used for all 4-2 compressors (Multiplier 1, Figure 5).

Case 2: The inexact design is used for the 4-2 compressors in the n-1 least significant columns, and the exact design is used in the n most significant columns (Multiplier 2, Figure 6).

The aim of Multiplier 1 is to reduce the number of gates and the power consumption compared with an exact multiplier design; however, it results in a high error distance (quantization noise).

Multiplier 2 is proposed to decrease the error distance. Using the inexact compressor design in the n-1 least significant columns and the exact compressor design in the n most significant columns gives a lower error distance and therefore better image quality, as measured by the Peak Signal-to-Noise Ratio (PSNR).
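The column split used by Multiplier 2 can be written as a one-line rule. The sketch below is a hypothetical helper, not part of the paper; it assumes that the product columns are numbered from 0 (least significant) to 2n-2 (most significant), so that the n-1 least significant columns and the n most significant columns together cover all 2n-1 partial-product columns.

```python
def compressor_kind(column, n=8):
    """Return the 4-2 compressor type Multiplier 2 would use in a product column.

    Assumed numbering: column 0 is the least significant, column 2n-2 the most.
    """
    if not 0 <= column <= 2 * n - 2:
        raise ValueError("column out of range")
    # The n-1 least significant columns use the inexact design, the rest the exact one.
    return "inexact" if column < n - 1 else "exact"
```

For n = 8 this assigns the inexact design to columns 0 to 6 and the exact design to columns 7 to 14.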

The multipliers are used to multiply two images pixel by pixel, thus blending the two images into a single image. The quality of the processed image is measured by the peak signal-to-noise ratio (PSNR), which relates the maximum possible power of a signal to the power of the noise introduced by a lossy process such as compression or inexact computation. The PSNR is commonly used to measure the quality of a reconstruction process involving information loss and is defined via the mean square error (MSE). Given an accurate image I and an image K generated by an inexact process, the MSE is defined as:

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [I(i,j) - K(i,j)]^2 \tag{9}$$

Here m and n are the image dimensions, and I(i,j) and K(i,j) are the exact and obtained values of each pixel, respectively.

The PSNR is defined as:

$$PSNR = 10\log_{10}\left(\frac{MAX_I^2}{MSE}\right) \tag{10}$$

In (10), $MAX_I$ is the maximum possible value of a pixel; for example, when a pixel is represented by 8 bits, its maximum value is 255.
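Equations (9) and (10) translate directly into code. The following is a minimal sketch that assumes each image is given as a two-dimensional list of pixel values of equal size; the function names `mse` and `psnr` and the default `max_value=255` (8-bit pixels) are illustrative choices, not taken from the paper.

```python
import math

def mse(exact_img, approx_img):
    """Mean square error between two equally sized 2-D lists of pixels, Eq. (9)."""
    m, n = len(exact_img), len(exact_img[0])
    total = sum((exact_img[i][j] - approx_img[i][j]) ** 2
                for i in range(m) for j in range(n))
    return total / (m * n)

def psnr(exact_img, approx_img, max_value=255):
    """PSNR in dB, Eq. (10); returns infinity when the two images are identical."""
    err = mse(exact_img, approx_img)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)
```

A higher PSNR means the inexact image is closer to the exact one; the identical-image case is handled separately because the MSE in the denominator would be zero.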

## 7. SIMULATION RESULTS

In this section, the designs proposed in Section 5 are evaluated. The two inexact multiplier designs are simulated and compared with the exact design in terms of accuracy metrics (error rate and NED) and design metrics (power and number of gates).



Figure 7. Image multiplication (a) example 1, (b) example 2 (both using an exact multiplier)

Figure 7 shows the two images taken as examples for multiplication. Both images are multiplied using the inexact multipliers, and the results are compared with the exact multiplier outputs.

TABLE III Accuracy comparison for first example

| Design           | Avg. NED | Pass Rate (%) | Error Rate (%) |
|------------------|----------|---------------|----------------|
| Exact compressor | 0        | 100           | 0              |
| Multiplier 1     | 0.0053   | 30.35         | 69.65          |
| Multiplier 2     | 0.0022   | 65.35         | 34.65          |

TABLE IV Accuracy comparison for second example

| Design           | Avg. NED | Pass Rate (%) | Error Rate (%) |
|------------------|----------|---------------|----------------|
| Exact compressor | 0        | 100           | 0              |
| Multiplier 1     | 0.0011   | 51.80         | 48.20          |
| Multiplier 2     | 0.0015   | 74.75         | 24.25          |



Figure 8. Image multiplication results for example 1: (a) Multiplier 1, (b) Multiplier 2.

TABLE V Power consumption, number of gates and PSNR comparison for first example

| Design               | Power Consumption (mW) | No. of Gates | PSNR (dB) |
|----------------------|------------------------|--------------|-----------|
| Exact 4-2 compressor | 59                     | 3,146        | 48.13     |
| Multiplier 1         | 50                     | 1,919        | 15.22     |
| Multiplier 2         | 57.83                  | 2,737        | 43.03     |



Figure 9. Image multiplication results for example 2: (a) Multiplier 1, (b) Multiplier 2.

TABLE VI Power consumption, number of gates and PSNR comparison for second example

| Design               | Power Consumption (mW) | No. of Gates | PSNR (dB) |
|----------------------|------------------------|--------------|-----------|
| Exact 4-2 compressor | 59                     | 3,146        | 48.13     |
| Multiplier 1         | 50                     | 1,919        | 5.275     |
| Multiplier 2         | 57.83                  | 2,737        | 45.03     |

Tables III and IV compare the accuracy of the output images generated by the multipliers. Multiplier 2 is very good in terms of accuracy. Tables V and VI, however, show that the power consumption and gate count of Multiplier 1 are much lower than those of the exact multiplier and Multiplier 2. Correspondingly, Multiplier 1 has the worst PSNR among the proposed designs. The proposed inexact multipliers have a higher error distance for very large and very small input values. As a result, pixels with high RGB values (such as white) or low RGB values (such as black) show larger imprecision than other pixels, owing to the approximate nature of the compressors. However, the error distance of Multiplier 2 is improved.

## 8. CONCLUSION

Two different inexact multipliers have been proposed in this paper to examine the performance of the inexact compressor with respect to the abovementioned metrics for inexact multiplication. The inexact compressors are used in the reduction module of a Dadda multiplier. The following conclusions can be drawn from the simulation results presented in this manuscript.

- The first proposed multiplier shows a significant improvement in power consumption and transistor count compared with an exact multiplier.
- The second proposed multiplier achieves a better PSNR value than Multiplier 1 and is thus feasible for most applications.

The proposed designs may therefore be implemented in other arithmetic circuits for applications in which inexact computing can be used.

## 9. REFERENCES

- [1] C. Chang, J. Gu, M. Zhang, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits," IEEE Transactions on Circuits & Systems, Vol. 51, No. 10, pp. 1985-1997, Oct. 2004.
- [2] V. Gupta, D. Mohapatra, S. P. Park, A. Raghunathan, K. Roy, "IMPACT: IMPrecise adders for low-power approximate computing," Low Power Electronics and Design (ISLPED) 2011 International Symposium on, 1-3 Aug. 2011.
- [3] D. Radhakrishnan and A. P. Preethy, "Low-power CMOS pass logic 4-2 compressor for high-speed multiplication," in Proc. 43rd IEEE Midwest Symp. Circuits Syst., vol. 3, 2000, pp. 1296–1298.
- [4] Z. Wang, G. A. Jullien, and W. C. Miller, "A new design technique for column compression multipliers," IEEE Trans. Comput., vol. 44, pp. 962–970, Aug. 1995.
- [5] J. Gu, C. H. Chang, "Ultra Low-voltage, low-power 4-2 compressor for high speed multiplications," in Proc. 36th IEEE Int. Symp. Circuits Systems, Bangkok, Thailand, May 2003.
- [6] M. Margala and N. G. Durdle, "Low-power low-voltage 4-2 compressors for VLSI applications," in Proc. IEEE Alessandro Volta Memorial Workshop Low-Power Design, 1999, pp. 84–90.
- [7] B. Parhami, "Computer Arithmetic: Algorithms and Hardware Designs," 2nd edition, Oxford University Press, New York, 2010.
- [8] K. Prasad and K. K. Parhi, "Low-power 4-2 and 5-2 compressors," in Proc. of the 35th Asilomar Conf. on Signals, Systems and Computers, vol. 1, 2001, pp. 129–133.
- [9] J. Liang, J. Han, F. Lombardi, "New metrics for the reliability of approximate and probabilistic adders," IEEE Trans. on Computers, vol. 63, no. 9, pp. 1760-1771, 2013.
- [10] P. Kulkarni, P. Gupta, M. Ercegovac, "Trading accuracy for power with an Underdesigned Multiplier architecture," 24th International Conference on VLSI Design, 2011.
- [11] H.R. Mahdiani, A. Ahmadi, S.M. Fakhraie, C. Lucas, "Bio-Inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications," IEEE Transactions on Circuits and Systems, vol. 57, no. 4, 2010.
- [12] C.-H. Lin, I.-C. Lin, "High accuracy approximate multiplier with error correction," IEEE 31st International Conference on Computer Design (ICCD), 2013.
- [13] K. Bhardwaj, P.S. Mane, J. Henkel, "Power- and area-efficient Approximate Wallace Tree Multiplier for error-resilient systems," 15th International Symposium on Quality Electronic Design (ISQED), 2014.
- [14] C. Liu, J. Han and F. Lombardi, "A Low-Power, High-Performance Approximate Multiplier with Configurable Partial Error Recovery," DATE 2014, Dresden, Germany, 2014.