# Performance Evaluation and Synthesis of Vedic Multiplier

Umesh Akare, Asst. Prof., PIGCE Nagpur; T. V. More, P.G. Scholar, PIGCE Nagpur; R. S. Lonkar, P.G. Scholar, PIGCE Nagpur

# ABSTRACT

Digital multipliers are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers. Higher throughput arithmetic operations are important to achieve the desired performance in many real-time signal and image processing applications. Minimizing power consumption for digital systems involves optimization at all levels of the design. This optimization includes the implementation technology, the circuit style and topology, the architecture and, at the highest level, the algorithms that are being implemented. The multiplier is not only a high delay block but also a major source of power dissipation. This work presents a systematic design methodology for a fast and area-efficient digital multiplier based on the Vertical and Crosswise algorithm of ancient Indian Vedic Mathematics. The performance of this Vedic multiplier is compared with the conventional and fast multipliers used in practice.

## **General Terms**

Multiplier, architecture, Vedic algorithms, Booth, array and bypassing algorithms

## **Keywords**

Digital Multiplier; Optimization; Urdhva Tiryakbhyam; Vertical and crosswise algorithm; bypassing algorithms

## **1. INTRODUCTION**

The demand for high speed processing has been increasing as a result of expanding computer and signal processing applications. Higher throughput arithmetic operations are important to achieve the desired performance in many real-time signal and image processing applications [1]. One of the key arithmetic operations in such applications is multiplication, and the development of fast multiplier circuits has been a subject of interest for decades. Reducing time delay and power consumption is an essential requirement for many applications.

Multiplication is a fundamental arithmetic operation. Digital multipliers are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers. They are among the most commonly used components in digital circuit design. They are fast, reliable and efficient components that are used to implement many operations. Depending upon the arrangement of the components, different types of multipliers are available [2]. In the past, multiplication was generally implemented with a sequence of addition, subtraction and shift operations. The two most common multiplication algorithms in digital hardware are the array multiplication algorithm and the Booth multiplication algorithm. Due to the importance of digital multipliers in DSP, this has always been an active area of research. Multiplication-based operations such as Multiply and Accumulate (MAC) and inner product are among the frequently used Computation-Intensive Arithmetic Functions (CIAF) currently implemented in many Digital Signal Processing (DSP) applications such as convolution, Fast Fourier Transform (FFT) and filtering, and in the arithmetic and logic unit of microprocessors. Since multiplication dominates the execution time of most DSP algorithms, there is a need for high speed multipliers. Currently, multiplication time is still the dominant factor in determining the instruction cycle time of a DSP chip [3]. The speed of the multiplication operation is of great importance in DSPs as well as in general-purpose processors. In many DSP algorithms, the multiplier lies in the critical delay path and ultimately determines the performance of the algorithm.

The Urdhva Tiryakbhyam Sutra from Vedic mathematics is a general multiplication formula applicable to all cases of multiplication. It literally means "Vertically and crosswise". It is based on a novel concept through which the generation of all partial products can be done with the concurrent addition of these partial products. The parallelism in generation of partial products and their summation is obtained using Urdhva Tiryakbhyam, as explained in fig 1.

Here we present a multiplier based on the Urdhva Tiryakbhyam (Vertical & Crosswise) algorithm of ancient Indian Vedic Mathematics.

# 2. VERTICAL AND CROSSWISE

## 2.1 Vedic Mathematics

His Holiness Jagadguru Shankaracharya Bharati Krishna Teerthaji Maharaja (1884-1960) compiled all this work together and gave its mathematical explanation while discussing it for various applications. Swamiji constructed 16 sutras (formulae) and 16 Upa sutras (sub formulae) after extensive research in the Atharva Veda. These formulae are not to be found in the present text of the Atharva Veda because they were constructed by Swamiji himself. Vedic mathematics is not only a mathematical wonder but is also logical. It deals with various branches of mathematics such as arithmetic, algebra and geometry [4]. These methods and ideas can be directly applied to trigonometry, plane and spherical geometry, conics, calculus (both differential and integral), and applied mathematics of various kinds. This is a very interesting field and presents some effective algorithms which can be applied to various branches of engineering such as computing and digital signal processing.

## 2.2 Urdhva Tiryakbhyam

The Urdhva Tiryakbhyam (Vertical & Crosswise) algorithm can be generalized to n x n bit numbers. This multiplier has the advantage that as the number of bits increases, gate delay and area increase very slowly compared to other multipliers. Therefore it is time, space and power efficient. It has been demonstrated that this architecture is quite efficient in terms of silicon area/speed [5]. Since the partial products and their sums are calculated in parallel, the multiplier requires the same amount of time to calculate the product and is therefore independent of the clock frequency of the processor. The net advantage is that it reduces the need for microprocessors to operate at increasingly high clock frequencies. While a higher clock frequency generally results in increased processing power, its disadvantage is that it also increases power dissipation, which results in higher device operating temperatures. By adopting the Vedic multiplier, microprocessor designers can easily circumvent these problems to avoid catastrophic device failures. The processing power of the multiplier can easily be increased by increasing the input and output data bus widths, since it has quite a regular structure. Due to its regular structure, it can also be laid out easily in microprocessors.

## 2.3 Implementing Algorithm

To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (325 \* 738). The line diagram for the multiplication is shown in Fig. 1. The digits on both sides of each line are multiplied and added to the carry from the previous step. This generates one digit of the result and a carry. This carry is added in the next step, and the process goes on. If there is more than one line in a step, all the results are added to the previous carry. In each step, the least significant digit acts as the result digit and all other digits act as the carry for the next step. Initially the carry is taken to be zero. To make the methodology clearer, an alternate illustration is given with the help of line diagrams in figure 1, where the dots represent bit "0" or "1".

Various methods of multiplication are proposed in Vedic mathematics, and various tricks and shortcuts are suggested to optimize the process. These methods are based on the concepts of multiplication using deficits and excess, and changing the base to simplify the operation. A 4x4 bit Vedic multiplier using the vertical and crosswise algorithm is described stepwise in the following section.
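The carry-propagating procedure described for 325 \* 738 can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation; the function name `urdhva_decimal` and its digit-list representation are assumptions made for this example.

```python
def urdhva_decimal(a: int, b: int) -> int:
    """Multiply two non-negative integers digit by digit, vertically and crosswise.

    Hypothetical helper for illustration; the name is not from the paper.
    """
    # Digits stored least significant first, padded to a common length.
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]
    n = max(len(xs), len(ys))
    xs += [0] * (n - len(xs))
    ys += [0] * (n - len(ys))

    carry = 0          # initially the carry is taken to be zero
    digits = []        # result digits, least significant first
    for k in range(2 * n - 1):
        # All "lines" of step k: digit pairs whose positions add up to k.
        column = sum(xs[i] * ys[k - i] for i in range(n) if 0 <= k - i < n)
        total = column + carry
        digits.append(total % 10)   # least significant digit is the result digit
        carry = total // 10         # remaining digits carry into the next step

    # Whatever carry is left forms the leading digits of the product.
    result = carry
    for d in reversed(digits):
        result = result * 10 + d
    return result


if __name__ == "__main__":
    print(urdhva_decimal(325, 738))  # prints 239850
```

For 325 \* 738 the column sums are 40, 31, 65, 23 and 21, which with carries give the digits 0, 5, 8, 9, 3 and a final carry of 2, that is 239850.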

# 3. ALGORITHM FOR 4X4 BIT VEDIC MULTIPLIER

Vertical and crosswise algorithm can be understood clearly from the fig. 2. The systematic procedure can also be elaborated further stepwise as follows.

CP = Cross Product (Vertically and Crosswise)

| Bits | Role |
|-------------------------|--------------|
| X3 X2 X1 X0 | Multiplicand |
| Y3 Y2 Y1 Y0 | Multiplier |
| H G F E D C B A | |
| P7 P6 P5 P4 P3 P2 P1 P0 | Product |

PARALLEL COMPUTATION METHODOLOGY

| Step | CP operands | Cross product | Result |
|--------|---------------------------|-------------------------------------------|--------|
| STEP 1 | X0 ; Y0 | X0 \* Y0 | A |
| STEP 2 | X1 X0 ; Y1 Y0 | X1 \* Y0 + X0 \* Y1 | B |
| STEP 3 | X2 X1 X0 ; Y2 Y1 Y0 | X2 \* Y0 + X0 \* Y2 + X1 \* Y1 | C |
| STEP 4 | X3 X2 X1 X0 ; Y3 Y2 Y1 Y0 | X3 \* Y0 + X0 \* Y3 + X2 \* Y1 + X1 \* Y2 | D |
| STEP 5 | X3 X2 X1 ; Y3 Y2 Y1 | X3 \* Y1 + X1 \* Y3 + X2 \* Y2 | E |
| STEP 6 | X3 X2 ; Y3 Y2 | X3 \* Y2 + X2 \* Y3 | F |
| STEP 7 | X3 ; Y3 | X3 \* Y3 | G |

Fig 1: Multiplication of two decimal numbers by Urdhva Tiryakbhyam, 325 X 738 = 239850 [5].
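The seven cross-product steps for the 4x4 bit case can be checked with a short Python sketch. It is a software model under stated assumptions, not the hardware design: the function names are hypothetical, and the bit H (P7) is assumed here to be the carry left over after step 7, which the text does not state explicitly.

```python
def cross_products(x: list[int], y: list[int]) -> list[int]:
    """Return the column sums A..G of steps 1-7.

    x and y are bit lists with index 0 holding X0 / Y0.
    Each column depends only on the inputs, so all of them
    could be formed in parallel.
    """
    n = len(x)
    return [
        sum(x[i] * y[k - i] for i in range(n) if 0 <= k - i < n)
        for k in range(2 * n - 1)
    ]


def vedic_multiply_4x4(x_val: int, y_val: int) -> int:
    """Multiply two 4-bit values using the vertical and crosswise steps."""
    x = [(x_val >> i) & 1 for i in range(4)]
    y = [(y_val >> i) & 1 for i in range(4)]

    carry = 0
    product_bits = []                      # P0 first
    for column in cross_products(x, y):
        total = column + carry
        product_bits.append(total & 1)     # least significant bit is the result bit
        carry = total >> 1                 # the rest is carried to the next step
    product_bits.append(carry)             # assumption: H / P7 is the final carry

    return sum(bit << i for i, bit in enumerate(product_bits))


if __name__ == "__main__":
    print(cross_products([1, 1, 1, 1], [1, 1, 1, 1]))  # [1, 2, 3, 4, 3, 2, 1]
    print(vedic_multiply_4x4(0b1010, 0b1010))          # 100
```

The column sums grow to at most 4 in the middle step, so each step needs only a small adder, and the final carry always fits in one bit because a 4x4 bit product is at most 225.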


Fig 2: Algorithm of 4x4 bit Vedic multiplier

Table 1. Comparison of Multipliers

| COMPARISON | ARRAY MULTIPLIER | BOOTH MULTIPLIER | VEDIC MULTIPLIER |
|-------------------------------|------------------|------------------|------------------|
| TOTAL TIME FOR EXECUTION (ns) | 32.001 | 16.276 | 6.216 |
| NUMBER OF SLICES | 123 | 58 | 27 |
| NUMBER OF 4-INPUT LUTS | 99 | 28 | 14 |
| NUMBER OF IO BUFFERS | 32 | 16 | 16 |
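To read Table 1 as relative figures, the ratios of each conventional multiplier to the Vedic multiplier can be computed directly from the tabulated values. The sketch below only restates the table; the dictionary layout and the function name `ratio_to_vedic` are assumptions made for illustration.

```python
# Values copied from Table 1 (delay in ns, resource counts as reported).
TABLE_1 = {
    "array": {"delay_ns": 32.001, "slices": 123, "luts": 99},
    "booth": {"delay_ns": 16.276, "slices": 58, "luts": 28},
    "vedic": {"delay_ns": 6.216, "slices": 27, "luts": 14},
}


def ratio_to_vedic(metric: str) -> dict[str, float]:
    """Return how many times larger each conventional multiplier is than the Vedic one."""
    vedic = TABLE_1["vedic"][metric]
    return {
        name: row[metric] / vedic
        for name, row in TABLE_1.items()
        if name != "vedic"
    }


if __name__ == "__main__":
    for metric in ("delay_ns", "slices", "luts"):
        for name, ratio in ratio_to_vedic(metric).items():
            print(f"{metric:9s} {name:5s} / vedic = {ratio:.2f}")
```

With the tabulated delays, the array multiplier takes roughly 5.1 times and the Booth multiplier roughly 2.6 times as long as the Vedic multiplier.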

# 4. IMPLEMENTATION

The block diagram of the 4x4 Vedic multiplier is given in fig. 3. Let us divide A and B into two parts, say A3A2 and A1A0 for A, and B3B2 and B1B0 for B. Using the fundamentals of Vedic multiplication, taking two bits at a time and using 2-bit multiplier blocks, the multiplication is arranged as:

A3A2 A1A0

X B3B2 B1B0

= Q7Q6Q5Q4Q3Q2Q1Q0

Each block shown in fig. 3 is a 2x2 bit multiplier. The first 2x2 multiplier has inputs A1A0 and B1B0. The last block is a 2x2 multiplier with inputs A3A2 and B3B2. The middle one shows two 2x2 bit multipliers with inputs A3A2 & B1B0 and A1A0 & B3B2. The final result of the multiplication is the 8-bit value Q7Q6Q5Q4Q3Q2Q1Q0 [6]. The RTL view is shown in fig. 4. We have simulated the 4x4 bit Vedic multiplier. In behavioral simulation we give "1010" (decimal 10) and "1010" (decimal 10) as inputs and obtain the output "01100100" (decimal 100), as shown in fig. 5.
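The decomposition into four 2x2 blocks can be modelled in Python as a sanity check of the arithmetic. This is a behavioural sketch, not the synthesized design: the function names are hypothetical, the 2x2 block is assumed to be built from AND gates and two half adders, and ordinary integer addition stands in for the adder stage that combines the four partial products, since the text does not specify that stage.

```python
def mul2x2(a: int, b: int) -> int:
    """2x2 bit vertical and crosswise multiplier (AND gates plus two half adders)."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    q0 = a0 & b0                  # vertical product of the LSBs
    x, y = a1 & b0, a0 & b1       # the two crosswise products
    q1, c1 = x ^ y, x & y         # half adder: sum and carry
    z = a1 & b1                   # vertical product of the MSBs
    q2, q3 = z ^ c1, z & c1       # half adder: sum and carry
    return q0 | (q1 << 1) | (q2 << 2) | (q3 << 3)


def mul4x4(a: int, b: int) -> int:
    """4x4 bit multiplier assembled from four 2x2 blocks."""
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11    # A1A0 and A3A2
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11    # B1B0 and B3B2

    low = mul2x2(a_lo, b_lo)                          # first block
    mid = mul2x2(a_hi, b_lo) + mul2x2(a_lo, b_hi)     # the two middle blocks
    high = mul2x2(a_hi, b_hi)                         # last block

    # Assumption: plain addition models the adders that merge the partial products.
    return low + (mid << 2) + (high << 4)


if __name__ == "__main__":
    print(format(mul4x4(0b1010, 0b1010), "08b"))  # prints 01100100
```

Running it with the simulation inputs "1010" and "1010" reproduces the reported output "01100100".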

# 5. CONCLUSION

Higher throughput arithmetic operations are important to achieve the desired performance in many real-time signal and image processing applications. The multiplier is not only a high delay block but also a major source of power dissipation. This work presents a systematic design methodology for a fast and area-efficient digital multiplier based on the Vertical and Crosswise algorithm, known as the Urdhva Tiryakbhyam sutra of ancient Indian Vedic Mathematics. It is based on a novel concept through which the generation of all partial products can be done with the concurrent addition of these partial products. The parallelism in generation of partial products and their summation is obtained using Urdhva Tiryakbhyam. The 4x4 bit Booth multiplier, the 4x4 bit array multiplier and the 4x4 bit Vedic multiplier based on the vertical and crosswise algorithm have all been implemented on a Spartan XC3S200-5-pq208 device. The computation delay was 16.276 ns for the 4x4 bit Booth multiplier and 32.001 ns for the 4x4 bit array multiplier, while the computation delay for the 4x4 bit Vedic multiplier was obtained as 6.276 ns. It is therefore seen that the Vedic multiplier is much faster than the conventional multipliers. Our Vedic multiplier is found to be much more efficient than the array and Booth multipliers in terms of execution time, logic timing, logic percentage, number of LUTs used, routing time and number of slices used. The comparative experimental results are displayed in figs. 6-9.



Fig 3: Line diagram for 4\*4 bit binary multiplication [5].



Fig 4: RTL view of 4x4 bit Vedic multiplier by ModelSim



Fig 5: Simulation result of vedic 4\*4 bit multiplier



Fig 6: Comparison of logic percentage



Fig 7: Comparison on time of execution



Fig 8: Comparison on number of LUTs



Fig 9: Comparison on number of slices

# 6. REFERENCES

- [1] Himanshu Thapliyal and Hamid R. Arabnia, "A Time-Area-Power Efficient Multiplier and Square Architecture Based On Ancient Indian Vedic Mathematics", Department of Computer Science, The University of Georgia, 415 Graduate Studies Research Center, Athens, Georgia 30602-7404, U.S.A.
- [2] E. Abu-Shama, M. B. Maaz, M. A. Bayoumi, "A Fast and Low Power Multiplier Architecture", The Center for Advanced Computer Studies, The University of Southwestern Louisiana, Lafayette, LA 70504.

- [3] Purushottam D. Chidgupkar and Mangesh T. Karad, "The Implementation of Vedic Algorithms in Digital Signal Processing", Global J. of Engng. Educ., Vol.8, No.2 © 2004 UICEE Published in Australia.
- [4] S. Hong, S. Kim, M.C. Papaefthymiou, and W.E. Stark, "Low power parallel multiplier design for DSP applications through coefficient optimization", *in Proc. of Twelfth Annual IEEE Int. ASIC/SOC Conf.*, Sep. 1999, pp. 286-290.
- [5] Ming-Chen Wen, Sying-Jyan Wang, and Yen-Nan Lin, "Low Power Parallel Multiplier with Column Bypassing", *Electronics Letters*, 12 May 2005, Volume 41, Issue 10, pp. 581-583.
- [6] Jagadguru Swami Sri Bharati Krishna Tirthji Maharaja, "Vedic Mathematics", Motilal Banarsidas, Varanasi, India, 1986.
- [7] Himanshu Thapliyal and M.B Srinivas, "VLSI Implementation of RSA Encryption System Using Ancient Indian Vedic Mathematics", Center for VLSI and Embedded System Technologies, International Institute of Information Technology Hyderabad-500019, India
- [8] Shripad Kulkarni, "Discrete Fourier Transform (DFT) by using Vedic Mathematics", report, vedicmathsindia.blogspot.com, 2007.