Power aware setup timing optimization in physical design of ASICs
Related papers
An Optimized Power Performance and Area in ASIC Physical Design
International Journal of Electronics, Electrical and Computational System, 2017
For Moore’s law to remain pragmatically valid, new process technologies must provide more than the projected increases in density, chip capacity, chip-level performance, or performance-versus-power improvement, which have become increasingly difficult to achieve. This paper studies and implements practices for achieving better power, performance, and area (PPA) in ASIC physical design; the practices apply to all digital circuits, both combinational and sequential. Various place-and-route techniques are applied using Cadence’s SOCE. The general place-and-route flow involves floorplanning, power planning, placement, CTS, routing, parasitic extraction, and timing and power analysis. Apart from these stages, there are intermediate stages that allow for timing optimization. A reference block is chosen and multiple experiments are performed with flow variations at each place-and-route stage, targeting PPA and a frequency of 1.4 GHz in 14 nm technology. The generalized flow is tweaked to achieve better PPA. All experimental data are captured and conclusions are drawn.
A Novel Net Weighting Algorithm for Power and Timing-Driven Placement
Many new low-power ASIC applications have emerged in recent years. This market trend has made the designer's task of meeting timing and routability requirements within the power budget more challenging. One of the major sources of power consumption in modern integrated circuits (ICs) is the interconnect. In this paper, we present a novel Power and Timing-Driven global Placement (PTDP) algorithm. Its principle is to wrap a commercial timing-driven placer with a net weighting mechanism that calculates net weights based on timing and power consumption. The newly calculated weights drive the placement engine to place cells connected by power-critical or timing-critical nets close to each other, which reduces the parasitic capacitance of the interconnects and consequently improves the timing and power consumption of the design. This approach not only improves design power consumption but also facilitates routability, with only a minor impact on the timing closure of a few designs. Experiments carried out on 40 industrial designs of different nodes, sizes, and complexities demonstrate that the proposed algorithm achieves significant improvements in Quality of Results (QoR) compared with a commercial timing-driven placement flow. We reduce the interconnect power by an average of 11.5%, which leads to a total power improvement of 5.4%, improvements of 9.4% in Worst Negative Slack (WNS) and 13.7% in Total Negative Slack (TNS), and a 3.2% reduction in total wirelength.
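The abstract above does not give the weighting formula, so the following is only a minimal sketch of how a net weight might blend timing and power criticality. The `Net` fields, the `alpha` blend factor, and the linear form are all assumptions made for illustration, not the PTDP algorithm itself.

```python
from dataclasses import dataclass


@dataclass
class Net:
    """Hypothetical per-net data; field names are illustrative only."""
    name: str
    slack_ns: float            # worst slack on the net (negative = violating)
    switching_power_mw: float  # estimated interconnect switching power


def net_weight(net, clock_period_ns, max_power_mw, alpha=0.5, base=1.0):
    """Blend timing and power criticality into one placement weight.

    Assumed form: base + alpha * timing_crit + (1 - alpha) * power_crit,
    where both criticalities are clamped to the range [0, 1].
    """
    # Timing criticality: 0 when slack >= one clock period, 1 when slack <= 0.
    timing_crit = min(1.0, max(0.0, 1.0 - net.slack_ns / clock_period_ns))
    # Power criticality: this net's power relative to the hungriest net.
    if max_power_mw > 0:
        power_crit = min(1.0, max(0.0, net.switching_power_mw / max_power_mw))
    else:
        power_crit = 0.0
    return base + alpha * timing_crit + (1.0 - alpha) * power_crit


if __name__ == "__main__":
    nets = [
        Net("n_crit", slack_ns=-0.2, switching_power_mw=4.0),
        Net("n_idle", slack_ns=1.0, switching_power_mw=0.0),
    ]
    peak = max(n.switching_power_mw for n in nets)
    for n in nets:
        print(n.name, net_weight(n, clock_period_ns=1.0, max_power_mw=peak))
```

A placer that minimizes weighted wirelength would then pull the cells on heavily weighted nets closer together, which is the effect the abstract describes.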
Power-aware hold optimization for ASIC physical synthesis
Hold timing closure is an important milestone in the physical design phase of every Application Specific Integrated Circuit (ASIC). Many approaches have been proposed by researchers and commercial Electronic Design Automation (EDA) providers to fix hold timing violations, but there has been no effort to study the impact of each technique on power consumption. The rising demand for low-power applications keeps pushing for new power reduction techniques. In this paper, we present a novel approach that reduces power consumption by limiting the power increase incurred during hold timing optimization. A sample of 100 industrial post-CTS designs from different applications and fabrication process technologies (from 180 nm to 28 nm) was used to measure the ratios Δpower/Δhold_timing and Δarea/Δhold_timing for each technique. The ratios were calculated after legalization and global routing so that they include not only the power and area added directly by the hold optimization, but also the increases induced indirectly by the additional timing fixes needed after placement legalization and routing repair. By considering the power and area impact of each technique while optimizing the design, we substantially reduce the power increase and area overhead caused by hold fixing. Experimental results show a power reduction of 7% and an area reduction of 1% on average, with a beneficial impact on hold timing and a neutral impact on setup timing.
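The ratios described above can be sketched as a simple ranking step. The technique names, the measurement values, and the optional `area_weight` term below are hypothetical; the abstract does not list the techniques or say how the two ratios are combined.

```python
def cost_ratios(delta_power, delta_area, delta_hold):
    """Return (Δpower/Δhold_timing, Δarea/Δhold_timing) for one technique.

    delta_hold is the hold-slack improvement and must be positive.
    """
    if delta_hold <= 0:
        raise ValueError("hold improvement must be positive")
    return delta_power / delta_hold, delta_area / delta_hold


def rank_techniques(measurements, area_weight=0.0):
    """Sort technique names from cheapest to most expensive.

    measurements maps a name to (delta_power, delta_area, delta_hold).
    The cost is the power ratio plus area_weight times the area ratio;
    this linear combination is an assumption made for illustration.
    """
    def cost(name):
        power_ratio, area_ratio = cost_ratios(*measurements[name])
        return power_ratio + area_weight * area_ratio

    return sorted(measurements, key=cost)


if __name__ == "__main__":
    # Hypothetical technique names and made-up numbers, for illustration only.
    measured = {
        "technique_a": (4.0, 8.0, 2.0),
        "technique_b": (1.0, 16.0, 2.0),
    }
    print(rank_techniques(measured))
    print(rank_techniques(measured, area_weight=1.0))
```

An optimizer built on such a table would try the cheapest applicable technique first for each violating path, which matches the abstract's idea of weighing power and area cost per unit of hold improvement.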
An Efficient Timing and Clock Tree Aware Placement Flow with Multibit Flip-Flops for Power Reduction
Communications in computer and information science, 2017
Power consumption has become a bottleneck for modern system-on-chip (SoC) designs. With the advance towards deep sub-micron technology, SoC designs contain components that lead to a higher power density. In VLSI designs, the performance of an integrated circuit (IC) is governed by the frequency of the clock at which it operates, so clocking is the major source of power dissipation in a design. Designing the clock network is a critical task for high-performance circuits because it directly impacts clock skew, jitter, chip power, and area of the SoC under process variations. Multi-bit flip-flops (MBFFs) have emerged as a low-power solution for nanometer technology. Applying MBFFs reduces the number of clock sinks during clock tree synthesis (CTS). As a result, the design shows improved core utilization and routing, and reduced power consumption and timing violations. The clock insertion delay (CID) is another key metric of the clock network: decreasing the CID results in a shorter clock network, less crosstalk impact, less impact from process variation, and a reduction in hold penalties. This work introduces a novel placement strategy, integrated with the electronic design automation (EDA) tool, for MBFF generation with prior knowledge of the clock tree architecture. Unlike the traditional placement flow, the strategy generates MBFFs by iteratively replacing single-bit FFs during placement. An FF merging and MBFF generation algorithm is proposed. The approach is made timing-aware with useful skew optimization. Experimental results show improvement in chip power by 44%, core density by 11.3% and clock power by 10.4%. In addition, another algorithm for minimizing the CID of the design is proposed. This algorithm splits the clock tree sinks with maximum CID into a separate pool after a deep analysis of the clock tree structure. 
It also takes into account the floorplan of the chip, placement pin and the macro placement changes on the sinks. The results show that the average CID reduces by 9.2%. Certificate: This is to certify that the thesis titled "An Efficient Timing and Clock Tree Aware Placement flow with Multibit Flip-Flop Generation for Power Reduction" submitted by Jasmine Kaur Gulati to Indraprastha Institute of Information Technology, Delhi for the award of the Master of Technology in Electronics and Communication & Engineering is an original research work carried out by her under my guidance and supervision. The results enclosed in the thesis have not been submitted in any other university or institute for the award of any other degree.
International Journal For Research In Applied Science & Engineering Technology, 2020
To meet consumer requirements, portable electronic devices are embedded with advanced integrated System on Chip (SoC) circuits. Complex SoCs are power hungry and need power optimization at various levels of the chip design. Observed power consumption has made optimization a real issue, and it may also become the limiting factor for future growth. This paper details the different types of power dissipation and their major causes. It then focuses on the different aspects in which power can be optimized. A beginner gets an idea of what causes power consumption during the design flow and which level of abstraction to concentrate on to reduce power. The paper also presents the advantages and disadvantages associated with power optimization. The summary describes how much power saving and what error percentage each abstraction level yields.
Total Power Optimization Combining Placement, Sizing and
2015
Power dissipation is quickly becoming one of the most important limiters in nanometer IC design because leakage increases exponentially as technology scales down. However, power and timing are often conflicting objectives during optimization. In this paper, we propose a novel total power optimization flow under a performance constraint. Instead of using placement, gate sizing, and multiple-Vt assignment techniques independently, we combine them through the concept of slack distribution management to maximize the potential for power reduction. We propose to use linear programming (LP) based placement and geometric programming (GP) based gate sizing formulations to improve the slack distribution, which helps to maximize the total power reduction during the Vt-assignment stage. Our formulations include important practical design constraints, such as slew, noise and short circuit power, which were often ignored previously. We tested our algorithm on a set of industria...
Enhancing sensitivity-based power reduction for an industry IC design context
Integration, 2019
For many years, discrete gate sizing has been widely used for timing and power optimization in VLSI designs. The importance of gate sizing optimization has been emphasized by academia for many years, especially since the 2012/2013 ISPD gate sizing contests [1, 2]. These contests have provided practical impetus to academic sizers through the use of realistic constraints and benchmark formats. At the same time, due to simplified delay/power Liberty models and timing constraints, the contests fail to address real-world criteria for gate sizing that are highly challenging in practice. We observe that lack of consideration of practical issues such as electrical and multi-corner constraints, along with limited sets of benchmarks, can misguide the development of contest-focused academic sizers. Thus, we study implications of the "gap" between academic sizers and product design use cases. In this paper, we note important constraints of modern industrial designs that are generally not comprehended by academic sizers. We also point out that various optimization techniques used in academic sizers can fail to offer benefits in product design contexts due to differences in the underlying optimization formulation and constraints. To address this gap, we develop a new robust academic sizer, Sizer, from a fresh implementation of Trident [3]. Experimental results show that Sizer is able to achieve up to 10% leakage power and 4% total power reductions compared to leading commercial tools on designs implemented with foundry technologies, and 7% leakage power reduction on a modern industrial design in the multi-corner multi-mode (MCMM) context.
Timing-Aware Power-Noise Reduction in Placement
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2000
We describe a placement-level decoupling capacitance (decap) insertion technique whose objective is to reduce power noise, taking into account circuit timing. Our approach consists of prediction and correction steps. Before placement, we estimate the power noise of each cell considering switching frequency of cells that, after placement, will most likely be in the neighborhood. If a frequently switching cell has neighbors that switch infrequently, it is unlikely that this cell will suffer from a power-noise problem. Based on the cell power-noise estimation, we add decap padding to each cell. Then, we invoke a standard cell placement tool and perform power grid analysis. We eliminate the power grid noise by gate sizing. Our technique can allocate decaps to improve power noise, power consumption, and timing. We propose two gate-sizing algorithms. The first one uses a sequence of linear programs (SLP) formulation, and the second one uses a budgeting-based heuristic algorithm. The SLP algorithm can produce better power-noise results than the heuristic, at the expense of runtime. Experimental results show that our techniques can effectively reduce power noise and still meet timing constraints.
Proceedings of the 42nd annual conference on Design automation - DAC '05, 2005
Lowering power is one of the greatest challenges facing the IC industry today. We present a power-aware placement method that simultaneously performs (1) activity-based register clustering that reduces clock power by placing registers in the same leaf cluster of the clock trees in a smaller area and (2) activity-based net weighting that reduces net switching power by assigning a combination of activity and timing weights to the nets with higher switching rates or more critical timing. The method applies to designs with multiple clocks and gated clocks. We implemented the method and obtained experimental results on 8 real-world designs after placement, routing, extraction and analysis. The power-aware placement method achieved on average 25.3% and 11.4% reduction in net switching power and total power respectively, with 2.0% timing, 1.2% cell area and 11.5% runtime impact. This method has been incorporated into a commercial physical design tool.
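The motivation for activity-based net weighting follows from the standard expression for dynamic switching power on a net. The first line below is the textbook relation; the second is only an illustrative weight of the kind the abstract describes, where the symbols $\lambda_a$, $\lambda_t$, and the criticality term $c_i$ are assumptions and not the paper's formulation.

```latex
% Standard dynamic switching power of net i:
%   \alpha_i = switching activity, C_i = net capacitance,
%   V_{DD} = supply voltage, f = clock frequency.
P_{\mathrm{sw},i} = \tfrac{1}{2}\,\alpha_i\, C_i\, V_{DD}^{2}\, f

% Illustrative combined weight (assumed form):
%   c_i \in [0,1] is a timing criticality,
%   \lambda_a, \lambda_t \ge 0 are tuning factors.
w_i = 1 + \lambda_a \frac{\alpha_i}{\max_j \alpha_j} + \lambda_t\, c_i
```

Because $C_i$ grows with wirelength, shortening nets with large $\alpha_i$ lowers $P_{\mathrm{sw},i}$ the most per unit of wire saved, which is why high-activity nets receive larger weights.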
Simultaneous delay and power optimization in global placement
2004 IEEE International Symposium on Circuits and Systems (IEEE Cat. No.04CH37512)
Delay and power minimization are two important objectives in current circuit designs. Retiming is a very effective way to optimize delay in sequential circuits. In this paper we propose a framework for multi-level global placement with retiming, targeting simultaneous delay and power optimization. We propose the GEO-P algorithm for power optimization and the GEO-PD algorithm for simultaneous delay and power optimization, and provide a smooth wirelength, power and delay tradeoff. In GEO-PD, we use retiming-based timing analysis and visible power analysis to identify timing-critical and power-critical nets and assign proper weights to them to guide the multi-level optimization process. We show an effective way to translate the timing and power analysis results from the original netlist to a coarsened subnetlist for effective multi-level delay and power optimization. Our GEO-P achieves 27% average power improvement and our GEO-PD provides gains in both delay and power improvement. To the best of our knowledge, this is the first paper addressing simultaneous delay and power optimization in multi-level global placement.