### A Novel Approach to Design High Speed Arithmetic Logic Unit Based On Ancient Vedic Multiplication Technique

Mr. Abhishek Gupta<sup>1</sup>, Mr. Utsav Malviya<sup>2</sup>, Prof. Vinod Kapse<sup>3</sup>

VLSI Technology & Embedded Systems, Department of Electronics & Communication M.tech-IV<sup>th</sup> semester<sup>[1]</sup>, Assistant Professor<sup>[2]</sup>, Head of the Department<sup>[3]</sup> Gyan Ganga Institute of Technology and Sciences Rajeev Gandhi Technical University, Jabalpur, Madhya Pradesh, India

Abstract: This paper is devoted to the design of a high speed arithmetic logic unit (ALU). An ALU is a module that performs arithmetic and logic operations. This topic was chosen because the ALU is the key element of digital processors such as microprocessors, microcontrollers and central processing units. Every digital technology depends, partially or wholly, on the operations performed by the ALU. A high speed ALU is therefore highly desirable, since it improves the efficiency of every module that relies on those operations. The speed of an ALU depends greatly on the speed of its multiplier. Many multiplication algorithms exist today at the algorithmic and structural levels. Our work shows that the Vedic multiplication technique is the fastest of the algorithms we compared. We have further observed that conventional Vedic multiplier hardware has some limitations. To overcome them, a novel approach is proposed that designs the Vedic multiplier around a unique addition tree structure, which is used to add the partial products. The two bit Vedic multiplier uses the conventional Vedic multiplier hardware. The four and eight bit Vedic multipliers use a divide and conquer approach. The proposed Vedic multiplier has then been integrated into an eight bit arithmetic logic unit along with a conventional adder, subtractor and basic logic gates. The proposed ALU performs three arithmetic and eight logical operations at high speed. All of these operational sub-modules (adder, subtractor, multiplier and logic gates) have been designed as combinational circuits. To synchronize them, the multiplexers that integrate the sub-modules into a single unit are triggered on the positive clock edge. The proposed ALU has been designed in the Verilog hardware description language (HDL). The operational sub-modules use the data flow modeling style and the integration uses the behavioral modeling style. The target FPGA for this design belongs to Virtex-2P (family), XC2VP2 (device), FG256 (package) with a speed grade of -7. The Xilinx Synthesis Tool (XST) of Xilinx ISE 9.2i has been used for synthesis and the ISE simulator for behavioral simulation.

The maximum combinational path delay of the proposed multiplier is 11.886 ns, and the designed ALU can operate at a maximum frequency of 741.455 MHz.

**Keywords:** - Vedic Urdhva Tiryakbhyam multiplication algorithm, Arithmetic Unit, Arithmetic Logic Unit, Addition tree structure.

#### I. Introduction

The computation unit is the main unit of any digital technology. It performs arithmetic operations such as addition, subtraction and multiplication, and in some places it also performs logical operations such as AND, OR, invert and XOR, which are a dominant feature of digital applications. The ALU is the execution unit that performs not only arithmetic but also logical operations, which is why it is called the heart of microprocessors, microcontrollers and CPUs. Every digital technology relies, fully or partially, on the operations performed by the ALU. The block diagram of the ALU, as implemented on an FPGA, is given below.



#### Figure 1.1 Block Diagram of ALU

Here the input interface to the ALU module is the set of input switches on the FPGA board, and after the data has been processed the result can be read from the LCD output of the FPGA board. For multiplication, the Vedic Urdhva Tiryakbhyam multiplication scheme has been used. The Urdhva Tiryakbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means "Vertically and Crosswise". To illustrate this scheme, let us consider the multiplication of two decimal numbers ($32 \times 44$). The conventional methods already known to us will require 16 multiplications; the multiplication using the Urdhva Tiryakbhyam Sutra is shown in the following figure. The Vedic multiplication algorithm for 2 digit decimal numbers is shown below:-



#### **Figure 1.2 Vedic Multiplication Technique**
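The vertically-and-crosswise steps for the $32 \times 44$ example can be traced in a few lines of Python. This is only an illustrative sketch of the decimal procedure; the function name `urdhva_2digit` is hypothetical and does not come from the paper.

```python
def urdhva_2digit(x: int, y: int) -> int:
    """Multiply two 2-digit decimal numbers 'vertically and crosswise'."""
    if not (0 <= x <= 99 and 0 <= y <= 99):
        raise ValueError("operands must be in the range 0..99")
    x1, x0 = divmod(x, 10)  # tens and units digits of x
    y1, y0 = divmod(y, 10)  # tens and units digits of y

    # Step 1 (vertical): units digit times units digit.
    carry, d0 = divmod(x0 * y0, 10)
    # Step 2 (crosswise): sum of the two cross products plus the carry.
    carry, d1 = divmod(x1 * y0 + x0 * y1 + carry, 10)
    # Step 3 (vertical): tens digit times tens digit plus the carry.
    high = x1 * y1 + carry

    return high * 100 + d1 * 10 + d0


print(urdhva_2digit(32, 44))  # 1408
```

For 32 and 44 the three steps give 8, then 3·4 + 2·4 = 20 (digit 0, carry 2), then 3·4 + 2 = 14, which reads off as 1408.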

On this basis the conventional Vedic multiplier hardware has been designed; it is shown below for 4x4 bits. Using the same approach, an N-bit multiplier can be built [10]:-



#### Figure 1.3 Conventional Four Bit Vedic Multiplier

However, it has a long carry propagation path, whose delay limits its speed. Many methodologies have been introduced to overcome this problem; the latest and most popular replaces the conventional addition structure with a carry save addition (CSA) structure. We have seen that its speed is also limited, because of the intermediate steps the CSA needs to convert a three operand addition into a two operand addition. The diagram of this structure for 4-bit multiplication is given below [4]:



#### Figure 1.4 Four Bit Vedic Multiplier with CSA
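The three-operand to two-operand conversion performed by a carry save adder can be sketched as follows. This is the textbook 3:2 compression step, written as a word-level Python model; it is not the specific circuit of [4], and the function name `carry_save_add` is hypothetical.

```python
from typing import Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """Reduce three operands to a sum word and a carry word (3:2 compression)."""
    sum_word = x ^ y ^ z                            # per-bit sum, no carry propagation
    carry_word = ((x & y) | (y & z) | (x & z)) << 1  # per-bit majority, shifted one place
    return sum_word, carry_word


s, c = carry_save_add(5, 6, 7)
# A conventional two-operand adder is still needed to produce the final value.
print(s + c)  # 18
```

The final `s + c` is the extra two-operand addition that follows every carry save stage.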

#### II. Proposed ALU Module

Our proposed 8x8 bit Arithmetic Unit is shown below:-





Here a and b are the two 8 bit inputs of our Arithmetic Unit. The other sections of the design are self-explanatory.

For 2-bit multiplication the conventional Vedic multiplication hardware has been used, since at the 2-bit level we do not have to worry about the carry propagation path.
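As a rough illustration, a 2-bit vertically-and-crosswise multiplier can be modelled at gate level with four AND gates and two half adders. The paper's figure is not reproduced here, so this gate arrangement is an assumption based on the commonly described 2-bit block, and the name `vedic_2bit` is hypothetical.

```python
def vedic_2bit(a: int, b: int) -> int:
    """Gate-level model of a 2x2 bit vertically-and-crosswise multiplier."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    q0 = a0 & b0                   # vertical: a0 * b0

    x, y = a1 & b0, a0 & b1        # crosswise products
    q1 = x ^ y                     # half adder 1: sum
    c1 = x & y                     # half adder 1: carry

    z = a1 & b1                    # vertical: a1 * b1
    q2 = z ^ c1                    # half adder 2: sum
    q3 = z & c1                    # half adder 2: carry

    return (q3 << 3) | (q2 << 2) | (q1 << 1) | q0


print(vedic_2bit(3, 3))  # 9
```

The carry travels through at most two half adders, which is why carry propagation is not a concern at this level.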



The unique addition tree structure for partial product addition at the 4-bit level is given in the following. The bits in positions 7 to 2 pass to the result after addition; the 2 bits in positions 1 and 0 pass to the result as they are:-

| Bit position | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|
| P0 |  |  |  |  | P0[3] | P0[2] | P0[1] | P0[0] |
| P1 |  |  | P1[3] | P1[2] | P1[1] | P1[0] |  |  |
| P2 |  |  | P2[3] | P2[2] | P2[1] | P2[0] |  |  |
| P3 | P3[3] | P3[2] | P3[1] | P3[0] | 1'b0 | 1'b0 |  |  |
| Result | Q[7] | Q[6] | Q[5] | Q[4] | Q[3] | Q[2] | Q[1] | Q[0] |

#### Figure 2.3 Proposed Addition Tree Structure of 4-Bit Multiplier

The unique addition tree structure for partial product addition at the 8-bit level is given in the following. The bits in positions 15 to 4 pass to the result after addition; the 4 bits in positions 3 to 0 pass to the result as they are:-

| Bit position | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P0 |  |  |  |  |  |  |  |  | P0[7] | P0[6] | P0[5] | P0[4] | P0[3] | P0[2] | P0[1] | P0[0] |
| P1 |  |  |  |  | P1[7] | P1[6] | P1[5] | P1[4] | P1[3] | P1[2] | P1[1] | P1[0] |  |  |  |  |
| P2 |  |  |  |  | P2[7] | P2[6] | P2[5] | P2[4] | P2[3] | P2[2] | P2[1] | P2[0] |  |  |  |  |
| P3 | P3[7] | P3[6] | P3[5] | P3[4] | P3[3] | P3[2] | P3[1] | P3[0] | 1'b0 | 1'b0 | 1'b0 | 1'b0 |  |  |  |  |
| Result | Q[15] | Q[14] | Q[13] | Q[12] | Q[11] | Q[10] | Q[9] | Q[8] | Q[7] | Q[6] | Q[5] | Q[4] | Q[3] | Q[2] | Q[1] | Q[0] |

#### Figure 2.4 Proposed Addition Tree Structure of 8-Bit Multiplier
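The bit alignment of the addition tree can be checked with a small word-level model. This is a sketch under the assumption that each partial product is N bits wide and that the columns are summed as drawn; it says nothing about how the adders are arranged internally, and the name `addition_tree` is hypothetical.

```python
def addition_tree(p0: int, p1: int, p2: int, p3: int, n: int) -> int:
    """Combine four n-bit partial products into a 2n-bit result.

    n is the operand width of the multiplier (4 or 8 in the paper).
    """
    half = n // 2
    limit = 1 << n
    if any(not (0 <= p < limit) for p in (p0, p1, p2, p3)):
        raise ValueError("each partial product must fit in n bits")

    # The low n/2 bits of P0 pass to the result as they are.
    low = p0 & ((1 << half) - 1)

    # Remaining columns: the upper half of P0, P1 and P2 in full,
    # and P3 padded on the right with n/2 zero bits.
    upper = (p0 >> half) + p1 + p2 + (p3 << half)

    return (upper << half) | low


# 13 x 11 at the 4-bit level: halves are (3, 1) and (2, 3).
print(addition_tree(1 * 3, 3 * 3, 1 * 2, 3 * 2, 4))  # 143
```

Because P1 and P2 occupy the same columns, the model does not depend on which cross product is labelled P1 and which P2.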

Here the partial products P0, P1, P2 and P3 are assigned from right to left at the outputs of the N/2-bit Vedic multipliers, where N is the number of bits in one input of the multiplier. The addition tree structure we have designed is very simple to understand, design and implement, and it is used here to perform the addition. The block diagram for 4-bit multiplication is shown below.



#### Figure 2.5 Block Diagram of Proposed 4-Bit Vedic Multiplier

Block diagram of 8-Bit Vedic Multiplier is shown below:-

[Figure: four Vedic 4 Bit Multiplier blocks with the input pairs (b[7:4], a[7:4]), (b[3:0], a[7:4]), (b[7:4], a[3:0]) and (b[3:0], a[3:0]), followed by adders forming the Addition Tree structure; Q[3:0] is labelled at the output.]

#### Figure 2.6 Block Diagram of Proposed 8-Bit Vedic Multiplier
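The divide and conquer construction can be sketched recursively: an N-bit multiplier is built from four N/2-bit multipliers, down to the 2-bit block. The Python model below is a behavioural sketch only, with hypothetical names (`vedic_2bit`, `vedic_mul`); the assignment of the two cross products to P1 and P2 is an assumption, and the real design is combinational Verilog rather than a recursive function.

```python
def vedic_2bit(a: int, b: int) -> int:
    """2x2 bit base block: four AND gates and two half adders."""
    a0, a1, b0, b1 = a & 1, a >> 1, b & 1, b >> 1
    x, y, z = a1 & b0, a0 & b1, a1 & b1
    c1 = x & y
    return ((z & c1) << 3) | ((z ^ c1) << 2) | ((x ^ y) << 1) | (a0 & b0)


def vedic_mul(a: int, b: int, n: int) -> int:
    """Multiply two n-bit operands (n = 2, 4, 8, ...) by divide and conquer."""
    if n == 2:
        return vedic_2bit(a, b)
    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    p0 = vedic_mul(a_lo, b_lo, half)  # low x low
    p1 = vedic_mul(a_hi, b_lo, half)  # cross product (assumed to be P1)
    p2 = vedic_mul(a_lo, b_hi, half)  # cross product (assumed to be P2)
    p3 = vedic_mul(a_hi, b_hi, half)  # high x high

    # Addition tree: low half of P0 passes through, the rest is summed.
    upper = (p0 >> half) + p1 + p2 + (p3 << half)
    return (upper << half) | (p0 & mask)


print(vedic_mul(200, 123, 8))  # 24600
```

Calling `vedic_mul(a, b, 8)` uses four 4-bit multipliers, each of which uses four 2-bit blocks, matching the hierarchy of Figures 2.5 and 2.6.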

After the Arithmetic Unit was designed, it was incorporated into the ALU module. The block diagram of the proposed ALU, which is self-explanatory, is given below:-



#### Figure 2.7 Proposed ALU

The logical unit has been designed with simple conventional logic gates, and a multiplexer has been used for integration. It can easily be built by referring to any standard book on digital electronics, so it is not discussed here. The control word for the proposed ALU is:-

CONTROL WORD OF ALU

| S[5] | S[4] | S[3] | S[2] | S[1] | S[0] | OPERATIONS PERFORMED |
|------|------|------|------|------|------|----------------------|
| 0    | x    | x    | x    | 0    | 0    | ADDITION (a,b)       |
| 0    | x    | x    | x    | 0    | 1    | SUBTRACTION (a,b)    |
| 0    | x    | x    | x    | 1    | 0    | MULTIPLICATION (a,b) |
| 0    | x    | x    | x    | 1    | 1    | NOP = 0              |
| 1    | 0    | 0    | 0    | x    | x    | AND (a,b)            |
| 1    | 0    | 0    | 1    | x    | x    | OR (a,b)             |
| 1    | 0    | 1    | 0    | x    | x    | NOR (a,b)            |
| 1    | 0    | 1    | 1    | x    | x    | DATA BUFFER (a)      |
| 1    | 1    | 0    | 0    | x    | x    | NAND (a,b)           |
| 1    | 1    | 0    | 1    | x    | x    | X-OR (a,b)           |
| 1    | 1    | 1    | 0    | x    | x    | X-NOR (a,b)          |
| 1    | 1    | 1    | 1    | x    | x    | NOT (a)              |

#### Figure 2.8 Control Word
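The control word decoding can be sketched as a behavioural Python model. Several details are assumptions not stated in the paper: the result is treated as 16 bits wide, subtraction wraps in two's complement, logic results are 8 bits wide, and the built-in `*` stands in for the Vedic multiplier. The clocked multiplexers are not modelled, and the function name `alu` is hypothetical.

```python
def alu(s: int, a: int, b: int) -> int:
    """Behavioural model of the 6-bit control word S[5:0] for 8-bit inputs."""
    a &= 0xFF
    b &= 0xFF

    if (s >> 5) & 1 == 0:
        # Arithmetic group: S[1:0] selects, S[4:2] are don't-care.
        op = s & 0b11
        if op == 0b00:
            return a + b                  # ADDITION
        if op == 0b01:
            return (a - b) & 0xFFFF       # SUBTRACTION (wrap-around is assumed)
        if op == 0b10:
            return a * b                  # MULTIPLICATION (stand-in for the Vedic block)
        return 0                          # NOP = 0

    # Logic group: S[4:2] selects, S[1:0] are don't-care.
    sel = (s >> 2) & 0b111
    results = {
        0b000: a & b,       # AND
        0b001: a | b,       # OR
        0b010: ~(a | b),    # NOR
        0b011: a,           # DATA BUFFER
        0b100: ~(a & b),    # NAND
        0b101: a ^ b,       # X-OR
        0b110: ~(a ^ b),    # X-NOR
        0b111: ~a,          # NOT
    }
    return results[sel] & 0xFF


print(alu(0b000010, 12, 10))        # 120
print(bin(alu(0b110100, 12, 10)))   # 0b110
```

Writing the two groups as separate branches mirrors the table: S[5] picks the arithmetic or logic unit, and only the relevant select bits are examined in each.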

#### III. QUANTITATIVE RESULTS

The following table shows the area and timing results of the proposed Vedic multiplier at different bit levels.

| N-bit multiplier | Number of LUTs used as logic | Number of occupied Slices | Total eq. gate count for design | Additional JTAG gate count for design | Maximum combinational path delay (ns) |
|------------------|------------------------------|---------------------------|---------------------------------|---------------------------------------|---------------------------------------|
| 2-Bit            | 4                            | 2                         | 24                              | 384                                   | 4.626                                 |
| 4-Bit            | 31                           | 16                        | 230                             | 768                                   | 8.387                                 |
| 8-Bit            | 139                          | 70                        | 1073                            | 1536                                  | 11.886                                |


#### IV. COMPARATIVE RESULTS

To show the efficiency of the proposed Vedic multiplier at the eight bit level, it has been compared with other popular eight bit multiplier structures based on different multiplication algorithms. Standard papers have been used for the comparison. For a true and reliable comparison, the proposed multiplier has been implemented on the same target FPGA platform as the reference papers. The comparative tables are shown below:-

(a) In the following table the target FPGA used belongs to Virtex 2P (family), XC2VP2 (device), FG 256 (package), -7 (speed grade).

Maximum Combinational Path Delay for Different Multipliers at Eight Bit Level in Nanoseconds

| Karatsuba [10] | Vedic Karatsuba [10] | Modified Booth Wallace [4] | Vedic with Partitioning [4] | Conventional Vedic [10] | Vedic with CSA [4] | Proposed |
|----------------|----------------------|----------------------------|-----------------------------|-------------------------|--------------------|----------|
| 31.039         | 18.695               | 15.815                     | 15.685                      | 15.418                  | 13.07              | 11.886   |

#### Table 4.1 Comparative Table 1 for Different Multipliers at 8-Bit Level

(b) In the following table the target FPGA used belongs to Spartan 3 (family), XC3S50 (device), PQ 208 (package), -4 (speed grade).

Maximum Combinational Path Delay for Different Multipliers at Eight Bit Level in Nanoseconds

| Array [2] | Booth [2] | Conventional Vedic [2] | Proposed |
|-----------|-----------|------------------------|----------|
| 32.01     | 29.549    | 21.679                 | 19.467   |

#### Table 4.2 Comparative Table 2 for Different Multipliers at 8-Bit Level
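The two comparative tables can be summarised as relative delay reductions with a few lines of Python. The delays below are the values read from Tables 4.1 and 4.2; the percentage metric and the helper name `delay_reduction_percent` are illustrative choices that do not appear in the paper.

```python
def delay_reduction_percent(reference_ns: float, proposed_ns: float) -> float:
    """Reduction in combinational path delay relative to a reference design."""
    return 100.0 * (reference_ns - proposed_ns) / reference_ns


# Delays in ns, as read from Table 4.1 (Virtex 2P) and Table 4.2 (Spartan 3).
virtex_2p = {
    "Karatsuba [10]": 31.039,
    "Vedic Karatsuba [10]": 18.695,
    "Modified Booth Wallace [4]": 15.815,
    "Vedic with Partitioning [4]": 15.685,
    "Conventional Vedic [10]": 15.418,
    "Vedic with CSA [4]": 13.07,
}
spartan_3 = {
    "Array [2]": 32.01,
    "Booth [2]": 29.549,
    "Conventional Vedic [2]": 21.679,
}

for label, table, proposed in (("Virtex 2P", virtex_2p, 11.886),
                               ("Spartan 3", spartan_3, 19.467)):
    print(label)
    for name, delay in table.items():
        print(f"  {name}: {delay_reduction_percent(delay, proposed):.1f}% lower delay")
```

On these figures the proposed multiplier is roughly 9% faster than the closest competitor on Virtex 2P (Vedic with CSA) and roughly 10% faster than the conventional Vedic multiplier on Spartan 3.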

Designing the proposed Vedic multiplier for the same reconfigurable hardware as used in [4] and [2] makes the comparison independent of the platform (hardware), so that it compares algorithms, techniques and approaches. From the comparison with different multipliers on the same platform it can be concluded that the algorithm and approach proposed in this thesis work to design the Vedic multiplier are better than the other popular algorithms and approaches shown in [4] and [2]. In [10], M. Ramalatha et al. have not stated which target FPGA they used to design their modules, so we have compared their results with our proposed multiplier on our own target FPGA, the one used for the overall ALU design. From this it can be concluded that our proposed algorithm, approach and platform are better than those of [10].





3. Device utilization summary of proposed ALU:-

Device Utilization Summary

| Logic Utilization                              | Used  | Available | Utilization | Note(s) |
|------------------------------------------------|-------|-----------|-------------|---------|
| Number of Slice Flip Flops                     | 25    | 2,816     | 1%          |         |
| Number of 4 input LUTs                         | 216   | 2,816     | 7%          |         |
| Logic Distribution                             |       |           |             |         |
| Number of occupied Slices                      | 116   | 1,408     | 8%          |         |
| Number of Slices containing only related logic | 116   | 116       | 100%        |         |
| Number of Slices containing unrelated logic    | 0     | 116       | 0%          |         |
| Total Number of 4 input LUTs                   | 227   | 2,816     | 8%          |         |
| Number used as logic                           | 216   |           |             |         |
| Number used as a route-thru                    | 11    |           |             |         |
| Number of bonded IOBs                          | 39    | 140       | 27%         |         |
| IOB Flip Flops                                 | 16    |           |             |         |
| Number of PPC405s                              | 0     | 0         | 0%          |         |
| Number of GCLKs                                | 1     | 16        | 6%          |         |
| Number of GTs                                  | 0     | 4         | 0%          |         |
| Number of GT10s                                | 0     | 0         | 0%          |         |
| Total equivalent gate count for design         | 2,133 |           |             |         |
| Additional JTAG gate count for IOBs            | 1,872 |           |             |         |

4. Simulation results of proposed ALU as per control word of ALU:-





#### VI. Conclusion

We have proposed a new technique for designing a Vedic multiplier using a unique addition tree structure, which is faster than the conventional Vedic multiplier hardware, the Vedic multiplier with partitioning, the Vedic multiplier with carry save adder, and the Modified Booth Wallace, Karatsuba, Vedic Karatsuba, Array, Booth and Wallace multipliers. This multiplier module has then been placed in the ALU along with conventional modules.

#### References

- [1]. S G Dani, "Ancient Indian Mathematics A Conspectus\*", GENERAL\_ ARTICLE, RESONANCE, March 2012, Springer Link.
- [2]. Pushpalata Verma, K.K. Metha, "Implementation of an Efficient Multiplier based on Vedic Mathematics Using EDA Tool", International journal of engineering and advance technology, ISSN: 2249-8958, Volume-1, Issue-5, June-2012.
- [3]. R.Bhaskar, Ganapathi Hegde, P.R.Vaya, "An efficient hardware model for RSA Encryption system using Vedic mathematics", Procedia Engineering Volume 30, 2012, Pages 124–128, SciVerse Science Direct, ELSEVIER.
- [4]. Devika Jaina, Kabiraj Sethi, Rutuparna Panda, "Vedic Mathematics Based Multiply Accumulate Unit", 978-0-7695-4587-5/11 \$26.00 © 2011 IEEE.
- [5]. Balpande,S., Akare, U., Lande, S.," Performance Evaluation and Synthesis of Multiplier Used in FFT Operation Using Conventional and Vedic Algorithms ". 19-21 Nov.2010, IEEE.
- [6]. Syed Azman bin Syed Ismail\*, Pumadevi a/p Siva subramniam, "Multiplication with the Vedic Method", Procedia Social and Behavioral Sciences 8 (2010) 129– 133, SciVerse Science Direct, ELSEVIER.
- [7]. Ashish Raman, Anvesh Kumar and R.K. Sarin, "Small area reconfigurable FFT design by Vedic Mathematics", 26-28 Feb. 2010, IEEE.
- [8]. Anvesh kumar, Ashish raman, "Low Power ALU Design by Ancient Mathematics", 978-1-4244-5586-7/10, 2010 IEEE.
- [9]. Parth Mehta, Dhanashri Gawali, "Conventional versus Vedic Mathematical Method for Hardware Implementation of a Multiplier", December 2009, pp. 640-642 IEEE.
- [10]. M. Ramalatha, K. Deena Dayalan, P. Dharani, S. Deborah Priya, "High Speed Energy Efficient ALU Design using Vedic Multiplication Techniques", 978-1-4244-3834-1/09, 2009 IEEE.
- [11]. Harpreet Singh Dhillon and AbhijitMitra, "A Reduced- Bit Multiplication Algorithm for Digital Arithmetics", International Journal of Computational and Mathematical Sciences 2:2,2008 © www.waset.org Spring 2008.
- [12]. Honey Durga Tiwari, Ganzorig Gankhuyag, Chan Mo Kim, Yong Beom Cho, "Multiplier design based on Ancient Vedic Mathematics", 978-1-4244-2599-008/\$25.00 © 2008, IEEE.
- [13]. Shamim Akhter, "VHDL Implementation of Fast NxN Multiplier Based on Vedic Mathematics", Jaypee Institute of Information Technology University, Noida, 201307 UP, INDIA, 2007 IEEE.
- [14]. Himanshu Thapliyal, Saurabh Kotiyal and M. B Srinivas, "Design and Analysis of A Novel Parallel Square and Cube Architecture Based On Ancient Indian Vedic Mathematics", Centre for VLSI and Embedded System Technologies, International Institute of Information Technology, Hyderabad, 500019, India, 2005 IEEE.
- [15]. HimanshuThapliyal and M.B Srinivas, "A High Speed and Efficient Method of Elliptic Curve Encryption Using Ancient Indian Vedic Mathematics", Proceedings of the 8<sup>th</sup> MAPLD Conference (NASA office of Logic Design), Washington D.C, USA, Sep 2005.
- [16]. Purushottam D. Chidgupkar, Mangesh T. Karad, "The Implementation of Vedic Algorithms in Digital Signal Processing\*", Global J. of Engng. Educ., Vol.8, No.2 © 2004 UICEE Published in Australia 4th Global Congress on Engineering Education, held in Bangkok, Thailand, from 5 to 9 July 2004.
- [17]. Amartya Kumar Dutta, "Mathematics in Ancient India", SERIES ARTICLE, April 2002 SpringerLink.
- [18]. Jagadguru Swami Sri Bharati Krishna Tirthji Maharaja, "Vedic Mathematics", MotilalBanarsidas, Varanasi, India, 1986, Book.
- [19]. Sameer Palnitkar, "Verilog HDL-A Guide to Digital Design and Synthesis", Sun Soft Press, 1996, Book.
- [20]. Morris Mano, "Digital circuits and systems", Prentice-Hall, Last updated on-02/27/2011, 11:55, Book. Wayne Wolf, "Computer as components", 2<sup>nd</sup> edition, Book.
- [21]. http://www.xilinx.com/support/documentation/data\_sheets/ds312.pdf
- [22]. http://en.wikipedia.org/wiki/Verilog
- [23]. http://www.xilinx.com/prs\_rls/2007/software/0786\_ise92i.htm