A Very Fast and Low Power Carry Select Adder Circuit
Samiappa Sakthikumaran1, S. Salivahanan, V. S. Kanchana Bhaaskaran2, V. Kavinilavu,
B. Brindha and C. Vinoth
Department of Electronics and Communication Engineering
SSN College of Engineering
Kalavakkam, Rajiv Gandhi Salai, (Off) Chennai
sakthikumaran87@gmail.com1
vskanchana@hotmail.com2
Abstract- The Carry Select Adder (CSA) is known to be one of the
fastest adders among the conventional adder structures. It is
used in many data processing units for realizing faster
arithmetic operations. In this paper, we present an innovative
CSA architecture. It employs a novel incrementer circuit in the
interim stages of the CSA. The proposed design is validated
through the design and implementation of 16, 32 and 64-bit
adder circuits. Comparisons with existing conventional fast
adder architectures have been made to show its efficiency. The
performance analysis shows that the proposed architecture
achieves a three-fold advantage in terms of delay, area and
power.
Keywords- Fast Adders, Carry Select Adder, Carry Save Adder,
Incrementer
I. INTRODUCTION
The design of high-speed and low-power VLSI
architectures needs efficient arithmetic processing units
that are optimized for speed and power consumption.
Adders are key components in general purpose
microprocessors and digital signal processors. They are also
used in many other functions such as subtraction,
multiplication and division. As a result, adder performance
largely determines the speed of these units. Furthermore, in
applications such as RISC processor design, where single
cycle execution of instructions is the key measure of
performance, an efficient adder circuit is necessary to
realize efficient system performance. Area is another
essential factor to be taken into account in the design of
fast adders. High-speed, low-power and area-efficient
addition and multiplication have therefore always been a
fundamental requirement of high-performance processors and
systems. The major speed limitation of adders arises from the
large carry propagation delay encountered in conventional
adder circuits, such as the ripple carry adder and the carry
save adder.
The main advantage of the CSA is its reduced propagation
delay. This is achieved by parallel stages built from
multiple pairs of ripple carry adders. The ripple carry
adders generate the interim sum and carry for the CSA
structure by assuming the carry input to be 0 and 1,
respectively. The final sum and carry outputs are chosen by
multiplexers [1]. The carry-out bit of the preceding block of
the adder acts as the select signal to the multiplexer.
Multipliers use adders in their final stage, and the speed of
a multiplier is determined by the type of adder it employs
[2] [3]. Reference [2] prefers the CSA for adding the most
significant bits (MSBs) of the multiplier, because the
addition can proceed while awaiting the arrival of the carry
signal from the least significant bit region. The CSA thus
combines reduced delay with an area-efficient architecture
when compared to the conditional sum adder [4]. In this paper,
we employ a novel fast incrementer circuit in place of adders
to increment the interim sum when the carry-in is logic 1.
This is shown to be more advantageous than the conventional
CSA method.
The rest of this paper is organized as follows. Section II
reviews the conventional adder structures. Section III
presents the proposed carry select adder architecture.
Section IV describes the ASIC implementation and analyzes the
results of the proposed adder circuit. Section V concludes.
II. CONVENTIONAL ADDER CIRCUITS
Fig.1 shows the internal logic schematic of a carry select
adder constructed using the conventional 4-bit ripple carry
adder (RCA). The RCA uses multiple full adders to perform
addition operation. Each full adder inputs a carry-in, which is
the carry-out of the preceding adder. The CSA divides the
words to be added into blocks and forms two sums for each
block in parallel, one with assumed carry in (Cin) of 0 and
the other with Cin of 1. As shown in Fig. 1, the carry-out
from one stage of 4-bit RCA is used as the select signal for
the multiplexer, which selects the corresponding sum bits from
the next block of data. This speeds up the computation.
Thus, the carry select adder achieves a higher speed of
operation at the cost of an increased device count, which in
turn increases the area and power consumed by circuits of
this type.
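
The block-wise selection described above can be sketched as a small behavioural model in Python. This is only an illustrative sketch, not the authors' Verilog: the function names `ripple_carry_add` and `conventional_csa` are hypothetical, and the model assumes that every 4-bit block, including the least significant one, holds a duplicated RCA pair whose outputs are multiplexed by the incoming carry.

```python
def ripple_carry_add(x, y, cin, width=4):
    """Add two width-bit values with a chain of full adders.

    Returns (sum, carry_out). Each full adder takes the carry-out
    of the preceding one as its carry-in.
    """
    total, carry = 0, cin
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        total |= (a ^ b ^ carry) << i          # sum bit of this full adder
        carry = (a & b) | (carry & (a ^ b))    # carry-out of this full adder
    return total, carry


def conventional_csa(a, b, cin=0, width=16, block=4):
    """Behavioural model of a conventional carry select adder.

    Assumption: every block holds two RCAs (Cin = 0 and Cin = 1) and a
    multiplexer driven by the carry-out of the preceding block.
    """
    mask = (1 << block) - 1
    result, carry = 0, cin
    for lo in range(0, width, block):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        s0, c0 = ripple_carry_add(x, y, 0, block)  # assumes carry-in 0
        s1, c1 = ripple_carry_add(x, y, 1, block)  # assumes carry-in 1
        s, carry = (s1, c1) if carry else (s0, c0)  # multiplexer
        result |= s << lo
    return result, carry
```

In hardware both RCAs of a block operate concurrently; the sequential loop here only models the selected values, not the timing.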
978-1-4244-8679-3/11/$26.00 ©2011 IEEE
Figure 1. Conventional 16-Bit Carry Select Adder
A carry save adder is used to compute the sum of three or
more binary numbers. It is widely used in the final stages
of fast multipliers for summing the partial products to
give the final value [5] [6]. The advantage of the carry
save adder is that the sum is computed faster than with the
conventional RCA. The carry save adder is better than the
conventional carry select adder in terms of area and power
consumption, but it is slower. In this paper, we compare the
conventional carry save adder with our proposed carry select
adder. The results show that the proposed adder outperforms
the conventional adders in speed, area and power
consumption. This makes it a good choice to replace the carry
save adder structure in the final stages of fast multipliers,
which improves the speed of operation of high-performance
VLSI circuits.
III. PROPOSED CARRY SELECT ADDER
Figure 1 shows the conventional 16-bit carry select adder
with two RCA blocks for every 4-bit group. The first 4-bit
RCA block adds the 4 bits assuming a carry-in of 0, while
the second RCA adds the same 4 bits with a carry-in of
logic 1. The final sum is selected based on the carry bit
from the previous stage. Hence, the design has a replicated
RCA block for every 4-bit group.
The proposed adder circuit is obtained by replacing the
second RCA block of Fig. 1 with the basic unit. The 4-bit
basic unit consists of the incrementer shown in Fig. 2. The
4 bits A0 to A3 are applied to the basic unit as input. The
incremented outputs B1 to B4, along with the carry-out C0,
are generated through the logic function shown in Fig. 2.
Figure 2. 4-Bit Basic Block
The Boolean expressions depicting the 4-bit basic block
are listed below:
B1= ~A0;
B2=A1^A0;
C1=A1*A0;
C2=A2*A3;
C0=C1*C2;
C3=C1*A2;
B3=A2^C1;
B4=A3^C3;
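
The expressions above can be checked exhaustively with a short Python sketch. It assumes that `*` denotes logical AND, `^` XOR and `~` inversion, and that A0 and B1 are the least significant input and output bits; the function name `incrementer4` is hypothetical.

```python
def incrementer4(a0, a1, a2, a3):
    """Evaluate the basic-block expressions for single-bit inputs A0..A3.

    Returns (B1, B2, B3, B4, C0), with B1 the least significant output.
    """
    b1 = a0 ^ 1        # B1 = ~A0
    b2 = a1 ^ a0       # B2 = A1 ^ A0
    c1 = a1 & a0       # C1 = A1 * A0
    c2 = a2 & a3       # C2 = A2 * A3
    c0 = c1 & c2       # C0 = C1 * C2 (block carry-out)
    c3 = c1 & a2       # C3 = C1 * A2
    b3 = a2 ^ c1       # B3 = A2 ^ C1
    b4 = a3 ^ c3       # B4 = A3 ^ C3
    return b1, b2, b3, b4, c0


def increment_value(value):
    """Apply the basic block to a 4-bit integer and pack the 5-bit result."""
    bits = [(value >> i) & 1 for i in range(4)]
    b1, b2, b3, b4, c0 = incrementer4(*bits)
    return b1 | (b2 << 1) | (b3 << 2) | (b4 << 3) | (c0 << 4)


# Exhaustive check: the block adds 1 to every 4-bit input.
for value in range(16):
    assert increment_value(value) == value + 1
```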
It can be seen that the carry-out (C0) of the block is
computed alongside B3 through a parallel arrangement of AND
gates, whereas an RCA structure propagates its carry
serially. This reduces the delay of incrementing in the CSA
when compared with the conventional RCA. Figure 3 shows the
proposed 16-bit carry select adder, which divides the word
size of the adder equally into blocks of 4 bits each.
Figure 3. Proposed 16-Bit Carry Select Adder
The least significant 4 bits are added using a conventional
RCA, while the other blocks are added in parallel, each
together with its incrementer.
Once all the interim sums and carries are calculated, the
final sums are selected by multiplexers with minimal delay.
Each multiplexer block receives two sets of 5-bit inputs
(four sum bits and one carry bit each) and selects the final
sum based on the select input from the previous stage.
Use of the basic unit with the 10-to-5 multiplexer thus
achieves fast incrementing action with reduced device count.
Thus, the proposed CSA outperforms the conventional CSA
circuit in terms of speed by reducing the carry propagation
latency.
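
A behavioural sketch of this arrangement in Python is given below. It is a model under stated assumptions rather than the authors' implementation: the names `rca4`, `incrementer4` and `proposed_csa` are hypothetical, and the carry for the "carry-in = 1" path is assumed to be the OR of the RCA carry-out and the incrementer carry-out, a detail the text does not spell out.

```python
def rca4(x, y, cin):
    """4-bit ripple carry adder built from full adders: (sum, carry_out)."""
    total, carry = 0, cin
    for i in range(4):
        a, b = (x >> i) & 1, (y >> i) & 1
        total |= (a ^ b ^ carry) << i
        carry = (a & b) | (carry & (a ^ b))
    return total, carry


def incrementer4(value):
    """4-bit basic block: returns (value + 1 truncated to 4 bits, carry C0)."""
    a0, a1, a2, a3 = [(value >> i) & 1 for i in range(4)]
    c1 = a1 & a0
    c2 = a2 & a3
    c3 = c1 & a2
    b1, b2, b3, b4 = a0 ^ 1, a1 ^ a0, a2 ^ c1, a3 ^ c3
    c0 = c1 & c2
    return b1 | (b2 << 1) | (b3 << 2) | (b4 << 3), c0


def proposed_csa(a, b, cin=0, width=16):
    """Carry select adder with one RCA plus one incrementer per upper block."""
    result, carry = 0, cin
    for lo in range(0, width, 4):
        x, y = (a >> lo) & 0xF, (b >> lo) & 0xF
        if lo == 0:
            # Least significant block: plain RCA using the real carry-in.
            s, c = rca4(x, y, carry)
        else:
            s0, c0 = rca4(x, y, 0)            # interim sum for carry-in 0
            s1, inc_c = incrementer4(s0)      # interim sum + 1
            c1 = c0 | inc_c                   # assumption: carry for carry-in 1
            s, c = (s1, c1) if carry else (s0, c0)  # 10-to-5 multiplexer
        result |= s << lo
        carry = c
    return result, carry
```

The OR is safe in this model because the two carries can never both be 1: the incrementer overflows only when the interim sum is 1111, and a 4-bit RCA with carry-in 0 cannot produce 1111 together with a carry-out.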
IV. ASIC IMPLEMENTATION AND RESULTS
The circuits in this paper have been developed using
Verilog-HDL and synthesized in the Synopsys front-end Design
Vision tool using the SAED 90nm generic library. Table I
presents the post-layout simulation results of both the
conventional and proposed adder structures in terms of delay,
area and power. The area indicates the total cell area of the
design; the total power is the sum of dynamic power, internal
power, net power and leakage power. The delay is the critical
path delay of the adder circuits.
The results depicted in Fig. 4 show that the proposed
CSA is faster than the conventional CSA and the carry save
adder. The speed improvement grows with the word size of the
adders. This shows that the design can be readily
incorporated into complex VLSI designs and DSP applications
to increase the operating speed of the circuits.
Figure 5 compares the area of the adder circuits. It shows
that the conventional carry select adder occupies more area
than the carry save adder, whereas the proposed circuit
occupies less area than both conventional adders. The area
gain achieved by the proposed circuit increases with the
word size of the adders.
Table I: Comparison of Adders for Delay, Area and Power
Word-Size  Adder         Delay (ns)  Area (µm²)  Power (mW)
16-bit     Carry Save    1.19        1113.013    0.563
16-bit     Carry Select  0.95        1231.995    0.589
16-bit     Proposed CSA  0.85        877.911     0.49
32-bit     Carry Save    2.29        2302.19     1.16
32-bit     Carry Select  2.13        2578.5      1.22
32-bit     Proposed CSA  1.62        1798.96     1.02
64-bit     Carry Save    4.5         4680.543    2.36
64-bit     Carry Select  4.24        5130.659    2.41
64-bit     Proposed CSA  3.14        3641.067    2.07
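
To make the trends in Table I easier to read, the relative reductions can be computed with a few lines of Python. The numbers are copied from Table I; the percentage-reduction metric and the names `TABLE_I`, `reduction_percent` and `improvements` are illustrative choices and are not reported in the paper itself.

```python
# Values copied from Table I: (delay in ns, cell area, power in mW).
TABLE_I = {
    16: {"Carry Save": (1.19, 1113.013, 0.563),
         "Carry Select": (0.95, 1231.995, 0.589),
         "Proposed CSA": (0.85, 877.911, 0.49)},
    32: {"Carry Save": (2.29, 2302.19, 1.16),
         "Carry Select": (2.13, 2578.5, 1.22),
         "Proposed CSA": (1.62, 1798.96, 1.02)},
    64: {"Carry Save": (4.5, 4680.543, 2.36),
         "Carry Select": (4.24, 5130.659, 2.41),
         "Proposed CSA": (3.14, 3641.067, 2.07)},
}
METRICS = ("delay", "area", "power")


def reduction_percent(baseline, proposed):
    """Percentage by which the proposed value is lower than the baseline."""
    return 100.0 * (baseline - proposed) / baseline


def improvements(word_size, baseline_name):
    """Per-metric reduction of the proposed CSA relative to one baseline."""
    base = TABLE_I[word_size][baseline_name]
    prop = TABLE_I[word_size]["Proposed CSA"]
    return {m: reduction_percent(b, p) for m, b, p in zip(METRICS, base, prop)}


if __name__ == "__main__":
    for word_size in sorted(TABLE_I):
        for baseline in ("Carry Save", "Carry Select"):
            gains = improvements(word_size, baseline)
            summary = ", ".join(f"{m} {v:.1f}%" for m, v in gains.items())
            print(f"{word_size}-bit vs {baseline}: {summary}")
```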
V. CONCLUSION
The proposed structure proves to be a simple solution for
improving the speed of the carry select adder. The
conventional CSA suffers from the disadvantage of occupying
more chip area, which has been overcome using the proposed
4-bit incrementer unit. The proposed unit is also found to
consume less power. The proposed carry select adder can be
used to speed up the final addition in parallel multiplier
circuits and other architectures that use adder circuits.
The structure has been synthesized with the Synopsys
front-end bundle using SAED 90nm technology.
Figure 4. Comparison of Adders for Delay
ACKNOWLEDGMENT
We are thankful to the Department of Electronics and
Communication Engineering, SSN College of Engineering,
Rajiv Gandhi Salai, Chennai for the support rendered in
carrying out this work.
Figure 5. Comparison of Adders for Area
Figure 6. Comparison of Adders for Power
In addition to the higher speed and smaller area discussed
above, Fig. 6 shows that the proposed architecture also
consumes less power than the conventional CSA and carry
save adders. The comparison shows that the gain of the
proposed circuit increases with the word size of the adder.
Thus, the proposed CSA outperforms both the conventional
carry select adder and the carry save adder in terms of
area, delay and power, which makes it a better alternative
for use in high-speed, low-power arithmetic architectures.
REFERENCES
[1] [Link], Carry-Select Adder, IRE Transactions on
Electronic Computers, pp. 340-344, 1962.
[2] A.K.W. Yeung and R.K. Yu, A self-timed multiplier with
optimized final adder, Univ. California Berkeley, Final Rep.,
CS 2921, Fall 1989.
[3] C.S. Wallace, A suggestion for a fast multiplier, IEEE
Trans. on Computers, Vol. 13, pp. 14-17, 1964.
[4] J. Sklansky, Conditional-Sum Addition Logic, IRE Trans.
on Electronic Computers, EC-9, pp. 226-231, 1960.
[5] V.G. Oklobdzija, High-Speed VLSI Arithmetic Units: Adders
and Multipliers, in Design of High-Performance
Microprocessor Circuits, book edited by [Link],
IEEE Press, 2000.
[6] [Link], Harish M Kittur and [Link] Kannan, ASIC
Implementation of Modified Faster Carry Save Adder,
European Journal of Scientific Research, Vol. 42, No. 1,
pp. 53-58, 2010.