Full-Custom vs. Standard-Cell Design Flow – A Quantitative Adder Comparison

Full-custom vs. standard-cell design flow - an adder case study

2003

Full-custom design is considered superior to standard-cell design when a high-performance circuit is requested. The structured routing of critical wires is considered to be the most important contributor to this performance gap. However, this is only true for bitsliced designs, such as ripple-carry adders, but not for designs with inter-bitslice interconnections spanning several bitslices, such as tree adders and reduction-tree multipliers. It is found that standard-cell design techniques scale better with the data width than full-custom bitsliced layouts for designs dominated by inter-bitslice interconnections.
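The distinction between bitsliced and tree designs can be made concrete with a small behavioural sketch. It models a ripple-carry adder, where each bitslice only hands a carry to its neighbour, and compares the longest inter-bitslice wire (measured in bitslices) against a Kogge-Stone tree adder. The function names and the "span in bitslices" measure are illustrative assumptions, not taken from the paper.

```python
def ripple_carry_add(a, b, width):
    """Bitsliced addition: slice i only talks to slice i + 1 via the carry."""
    carry, result = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i           # sum bit of slice i
        carry = (x & y) | (carry & (x ^ y))      # carry handed to slice i + 1
    return result, carry


def longest_span_ripple(width):
    """Longest inter-bitslice wire: always the neighbouring slice."""
    return 1 if width > 1 else 0


def longest_span_kogge_stone(width):
    """Longest inter-bitslice wire: the last prefix level reaches back by
    the largest power of two that is still smaller than the width."""
    span, d = 0, 1
    while d < width:
        span, d = d, d * 2
    return span


for width in (8, 16, 32, 64):
    print(width, longest_span_ripple(width), longest_span_kogge_stone(width))
```

In this model the ripple-carry span stays at one bitslice for every width, while the tree adder's longest wire grows with the data width, which is the kind of interconnect the abstract says a bitsliced layout handles poorly.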

Design and Analysis of an Efficient Full Adder Using Systematic Cell Design Methodology

ICTACT Journal on Microelectronics, 2017

This paper presents a high-performance, low-power full adder designed using the Systematic Cell Design Methodology (SCDM). The design is first implemented for 1 bit and then extended to 4 bits. The circuit was implemented using Mentor Graphics tools in a 180 nm technology. Performance parameters such as average propagation delay, average power, and power-delay product (PDP) are compared with those of existing hybrid adders such as the SRCPL adder and the DPL adder. The proposed adder has fewer transistors in the critical path, which leads to a lower propagation delay. The use of transmission gates throughout the design ensures high driving capability and full voltage swing at the output. The proposed adder is observed to perform efficiently compared with the other adders in terms of average power, average propagation delay, and PDP. The schematic-driven layout of the proposed adder is obtained using Mentor Graphics IC Station, and physical verification is done using the Calibre tool.
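As a reminder of how the PDP metric combines the other two, here is a minimal sketch. The two candidate adders and their power and delay values are made-up placeholders chosen for illustration; they are not results from the paper.

```python
def power_delay_product(avg_power_w, avg_delay_s):
    """PDP in joules: average power (W) times average propagation delay (s)."""
    return avg_power_w * avg_delay_s


# Hypothetical (average power, average delay) pairs, NOT measured data.
candidates = {
    "adder_a": (12e-6, 150e-12),   # higher power, shorter delay
    "adder_b": (9e-6, 220e-12),    # lower power, longer delay
}

pdp = {name: power_delay_product(p, d) for name, (p, d) in candidates.items()}
best = min(pdp, key=pdp.get)
print(best, pdp[best])
```

With these placeholder numbers the lower-power design does not have the lower PDP, which is why PDP is reported alongside power and delay rather than instead of them.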

Layout Designing for Different Full Adder Topologies at 0.18μm Technology Node

Abstract: This paper discusses VLSI layout design and optimization techniques for different full adder topologies: the complementary MOSFET (CMOS) full adder, the transmission gate full adder (TGA), the complementary pass-transistor logic (CPL) full adder, and the domino full adder. Power consumption and propagation delay have been the major issues in designing low-voltage circuit applications in recent years. Full adders are very important circuit elements for computing the four basic arithmetic operations (addition, subtraction, multiplication, and division) in integrated circuits. VLSI systems such as microprocessors and application-specific DSP architectures use different full adders for these operations. In this paper, one-bit full adder topologies are used for the analysis. The layouts of the basic logic gates and the different full adders are designed with the L-Edit v14.11 Tanner EDA tool at the 0.18μm technology node. The results show that, for a particular aspect ratio, the transmission gate full adder consumes less power and has less propagation delay. Keywords: CMOS full adder, CPL full adder, layout designing, domino logic full adder, L-Edit v14.11 Tanner EDA tool, transmission gate full adder.

A Comparative Analysis on Parameters of Different Adder Topologies

IRJET, 2022

Due to their widespread use in the effective implementation of fundamental binary arithmetic, adders are an essential component in digital integrated design. An adder topology should offer high operating speed, tolerable power consumption, and a small chip area. This paper provides an extensive comparative examination of several modern adder topologies. The in-depth analysis aims to make it easier to choose an adder topology for any digital design while balancing the trade-offs between area, propagation delay, and power dissipation. Four distinct adder topologies (the Ripple Carry Adder, Carry Save Adder, Carry Skip Adder, and Carry Select Adder) are compared thoroughly on the basis of a number of design metrics and performance factors.
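To illustrate the area-for-speed trade-off behind one of these topologies, the following behavioural sketch models a carry-select adder: each block computes its sum twice, once for each possible carry-in, and the real carry then only selects a result. The helper names and the block size are illustrative assumptions, and integer addition stands in for the ripple-carry block that hardware would use.

```python
def ripple_block(x, y, width, carry_in):
    """Behavioural stand-in for a small ripple-carry block."""
    total = x + y + carry_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, width, block):
    """Add two width-bit numbers block by block, carry-select style."""
    result, carry = 0, 0
    for lo in range(0, width, block):
        w = min(block, width - lo)               # last block may be narrower
        mask = (1 << w) - 1
        x, y = (a >> lo) & mask, (b >> lo) & mask
        sum0, c0 = ripple_block(x, y, w, 0)      # speculative result, carry-in 0
        sum1, c1 = ripple_block(x, y, w, 1)      # speculative result, carry-in 1
        s, carry = (sum1, c1) if carry else (sum0, c0)   # the multiplexer
        result |= s << lo
    return result, carry


print(carry_select_add(200, 100, 8, 3))
```

The duplicated block is the extra area; the benefit is that the incoming carry only has to pass through a multiplexer per block instead of rippling through every bit.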

A Design Method for Heterogeneous Adders

Lecture Notes in Computer Science, 2007

The performance of existing adders varies widely in speed and area requirements, which sometimes makes designers pay a high cost in area, especially when the delay requirement exceeds the fastest speed of a specific adder, no matter how small the difference is. To expand the design space and manage delay/area trade-offs, we propose a new adder architecture and a design methodology. The proposed adder architecture, named heterogeneous adder, decomposes an adder into blocks (sub-adders) consisting of carry-propagate adders of different types and precision. The flexibility in selecting the characteristics of sub-adders is the basis for achieving adder designs with desirable characteristics. We consider area optimization under delay constraints and delay optimization under area constraints by determining the bit-widths of sub-adders using Integer Linear Programming. We demonstrate the effectiveness of the proposed architecture and the design method on 128-bit operands.
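The bit-width allocation idea can be sketched on a toy instance. The example below splits an adder into one ripple-carry block and one parallel-prefix block and picks the split with the smallest area that still meets a delay limit. Everything here is an assumption made for illustration: the cost models are crude unit-cost formulas, block delays are simply summed, and exhaustive search replaces the Integer Linear Programming formulation the paper uses.

```python
def block_cost(kind, bits):
    """Toy (area, delay) model in arbitrary units; not from the paper."""
    if bits == 0:
        return 0, 0
    if kind == "ripple":
        return bits, bits                        # linear area, linear delay
    levels = max(1, (bits - 1).bit_length())     # ceil(log2(bits)), at least 1
    return bits * levels, levels                 # prefix: more area, log delay


def best_split(total_bits, delay_limit):
    """Return (area, delay, ripple_bits) minimising area under the limit."""
    best = None
    for ripple_bits in range(total_bits + 1):
        area_r, delay_r = block_cost("ripple", ripple_bits)
        area_p, delay_p = block_cost("prefix", total_bits - ripple_bits)
        area, delay = area_r + area_p, delay_r + delay_p   # blocks in series
        if delay <= delay_limit and (best is None or area < best[0]):
            best = (area, delay, ripple_bits)
    return best


print(best_split(16, 8))    # mixed design
print(best_split(16, 4))    # only the all-prefix design is fast enough
```

Under this toy model, relaxing the delay limit from 4 to 8 units lets four bits move to the cheaper ripple block, which is the kind of intermediate design point a fixed choice between two homogeneous adders cannot reach.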

Towards optimal performance-area trade-off in adders by synthesis of parallel prefix structures

Proceedings of the 50th Annual Design Automation Conference, 2013

This paper proposes an efficient algorithm to synthesize prefix graph structures that yield adders with the best performance-area trade-off. For designing a parallel prefix adder of a given bit-width, our approach generates prefix graph structures to optimize an objective function such as size of prefix graph subject to constraints like bit-wise output logic level. Besides having the best performance-area trade-off, our approach, unlike existing techniques, can (i) handle more complex constraints such as maximum node fanout or wirelength that impact the performance/area of a design and (ii) generate several feasible solutions that minimize the objective function. Generating several optimal solutions provides the option to choose adder designs that mitigate constraints such as wire congestion or power consumption that are difficult to model as constraints during logic synthesis. Experimental results demonstrate that our approach improves performance by 3% and area by 9% over even a 64-bit full custom designed adder implemented in an industrial high-performance design.

Design Space Exploration of Split-Path Data Driven Dynamic Full Adder

This paper presents the design, the analysis and the complete characterization of a novel split-path Data Driven Dynamic (sp-D3L) full adder cell in IBM's 65 nm CMOS process. The split-path D3L design style, derived from standard D3L, allows the design of high-speed dynamic circuits without the power overhead of the clock tree while providing significantly higher performance than D3L due to reduced capacitance at the pre-charge node. To demonstrate the performance benefits of the new split-path dynamic approach, we present a comparison of the proposed adder with conventional static and dynamic adder cells. All the adder circuits were characterized for speed, power, area, noise margins, supply voltage scaling as well as fan-out capabilities. To evaluate the combined impact of the load driven by the adder and the load presented by the adder to the driving circuit, a combined fan-in/fan-out analysis with varying loads was also performed. Monte Carlo simulations were performed to evaluate the reliability of the adder design against random process, voltage and temperature variations. To compare with the state of the art, we also compared our proposed adder with several low-power as well as high-performance adders proposed recently in the literature. Furthermore, to simulate the behavior of the adder in data path elements, we built ripple carry adders of varying lengths using the proposed adder. The new design was found to achieve a 16% to 27% performance advantage over its static and dynamic counterparts at nominal supply voltage. With the supply voltage scaled from 1 V to 0.8 V, the adder shows 12%, 34% and 39% PDP advantage over domino, static and conventional D3L designs respectively. Fan-out analysis showed the adder to perform with 11% to 41% better PDP than the others at worst-case FO32 loading.

Polynomial Time Algorithm for Area and Power Efficient Adder Synthesis in High-Performance Designs

Adders are the most fundamental arithmetic units, and are often on the timing-critical paths of microprocessors. Among various adder configurations, parallel prefix adders provide the best performance vs. power/area trade-off, especially for higher bit-widths. With aggressive technology scaling, the performance of a parallel prefix adder, in addition to the dependence on the logic level, is determined by wire-length and congestion, which can be mitigated by adjusting fan-out. This paper proposes a polynomial-time algorithm to synthesize n-bit parallel prefix adders targeting the minimization of the size of the prefix graph with log2 n logic levels and any arbitrary fan-out restriction. A structure-aware prefix node cloning is then applied to the resultant prefix adder solutions to further optimize the size of the prefix graphs. The design space exploration by our approach provides a set of Pareto-optimal solutions for the delay vs. power trade-off, and these Pareto-optimal solutions can be used in high-performance designs instead of picking from a fixed library (Kogge–Stone, Sklansky, etc.). Experimental results demonstrate that our approach: 1) outperforms the highly competitive, industry-standard Synopsys Design Compiler adder, regular adders such as the Sklansky adder and the Kogge–Stone adder, and a highly run-time/memory-intensive recent algorithm in the 32 nm technology node and 2) improves performance/area over even 64-bit custom designed adders targeting a 22 nm technology library and implemented in an industrial high-performance design.
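The trade-off between prefix graph size, logic level and fan-out can be seen on the two regular structures named above. The sketch below builds textbook Kogge-Stone and Sklansky prefix graphs, reports their size, depth and a simplified fan-out measure, and checks that both graphs add correctly. It does not implement the paper's synthesis algorithm; the node encoding `(level, i, j)` and the "lateral fan-out" measure (how many nodes in one level share the same lower-order source) are assumptions made for this illustration, and the bit-width is assumed to be at least 2.

```python
from collections import Counter


def kogge_stone(n):
    """Nodes (level, i, j): bit i combines with the prefix ending at bit j."""
    nodes, level, d = [], 0, 1
    while d < n:
        nodes += [(level, i, i - d) for i in range(d, n)]
        level, d = level + 1, d * 2
    return nodes


def sklansky(n):
    nodes, level = [], 0
    while (1 << level) < n:
        for i in range(n):
            if (i >> level) & 1:                       # upper half of a block
                j = ((i >> level) << level) - 1        # top bit of lower half
                nodes.append((level, i, j))
        level += 1
    return nodes


def metrics(nodes):
    """Return (size, depth, max lateral fan-out) of a prefix graph."""
    size = len(nodes)
    depth = 1 + max(level for level, _, _ in nodes)
    fanout = max(Counter((level, j) for level, _, j in nodes).values())
    return size, depth, fanout


def add_with_prefix_graph(a, b, n, nodes):
    """Evaluate an n-bit addition using the given prefix graph."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # bit propagate
    G, P = g[:], p[:]
    for lv in range(1 + max(level for level, _, _ in nodes)):
        new_G, new_P = G[:], P[:]
        for level, i, j in nodes:
            if level == lv:                            # prefix operator
                new_G[i] = G[i] | (P[i] & G[j])
                new_P[i] = P[i] & P[j]
        G, P = new_G, new_P
    carries = [0] + G[:-1]                             # carry into bit i
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, G[n - 1]


for name, build in (("Kogge-Stone", kogge_stone), ("Sklansky", sklansky)):
    print(name, metrics(build(64)))
```

Both structures reach the minimum depth of log2 n levels, but Kogge-Stone pays for its low fan-out with many more nodes and wires, while Sklansky is smaller at the cost of a fan-out that grows with the bit-width. A synthesis approach like the one described searches the space between such fixed points.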

Physical vs. Physically-Aware Estimation Flow: Case Study of Design Space Exploration of Adders

2014

Selecting an appropriate estimation method for a given technology and design is of crucial interest as the estimations guide future project and design decisions. The accuracy of the estimations of area, timing, and power (metrics of interest) depends on the phase of the design flow and the fidelity of the models. In this research, we use design space exploration of low-power adders as a case study for comparative analysis of two estimation flows: Physical layout Aware Synthesis (PAS) and Place and Route (PnR). We study and compare post-PAS and post-PnR estimations of the metrics of interest and the impact of various design parameters and input switching activity factor (αI). Adders are particularly interesting for this study because they are fundamental microprocessor units, and their design involves many parameters that create a vast design space. We show cases when the post-PAS and post-PnR estimations could lead to different design decisions, especially from a low-power designer's point of view. Our experiments reveal that post-PAS results underestimate the side-effects of clock-gating, pipelining, and extensive timing optimizations compared to post-PnR results. We also observe that the PnR estimation flow sometimes reports counterintuitive results.

Design of high-speed low-power parallel-prefix adder trees in nanometer technologies

This paper presents a novel approach to design high-speed low-power parallel-prefix adder trees. Sub-circuits typically used in the design of parallel-prefix trees are deeply analyzed and separately optimized. The modules used for computing the group propagate and generate signals have been designed to improve their energy-delay behavior in an original way. When the ST 45 nm 1 V CMOS technology is used, in comparison with conventional implementations, the proposed approach exhibits computational delay with mean value and standard deviation up to 40% and 48% lower and achieves energy consumption with mean value and standard deviation up to 57% and 40% lower.