Reducing power consumption of instruction ROMs by exploiting instruction frequency (original) (raw)

Reducing power consumption of dedicated processors through instruction set encoding

Proceedings of the 8th Great Lakes Symposium on VLSI (Cat. No.98TB100222)

With the increased clock frequency of modern, high-performance processors over 500 MHz, in some cases, limiting the power dissipation has become the most stringent design target. It is thus mandatory for processor engineers to resort to a large variety of optimization techniques to reduce the power requirements in the hot zones of the chip. In this paper, we focus on the power dissipated by the instruction fetch and decode logic, a portion of the processor architecture where a lot of capacitance switching normally takes place. We propose a methodology for determining an encoding of the instruction set that guarantees the minimization of the number of bit transitions occurring inside the registers of the pipeline stages involved in instruction fetching and decoding. The assignment of the binary patterns to the op-codes is driven by the statistics concerning instruction adjacency collected through instruction-level simulation of typical software applications; therefore, the technique is best exploited when applied t o e n c o de the instruction set of core p r ocessors and microcontrollers, since c omponents of these types are c ommonly used to execute xed p ortions of machine code within embedded systems. We illustrate the e ectiveness of the methodology through the experimental data we have obtained o n an existing microprocessor.

Instruction level power analysis and optimization of software

Journal of VLSI signal processing systems for signal, image and video technology, 1996

The increasing popularity of power constrained mobile computers and embedded computing applications drives the need for analyzing and optimizing power in all the components of a system. Software constitutes a major component of today's systems, and its role is projected to grow even further. Thus, an ever increasing portion of the functionality of today's systems is in the form of instructions, as opposed to gates. This motivates the need for analyzing power consumption from the point of view of instructions-something that traditional circuit and gate level power analysis tools are inadequate for. This paper describes an alternative, measurement based instruction level power analysis approach that provides an accurate and practical way of quantifying the power cost of software. This technique has been applied to three commercial, architecturally di erent processors. The salient results of these analyses are summarized. Instruction level analysis of a processor helps in the development of models for power consumption of software executing on that processor. The power models for the subject processors are described and interesting observations resulting from the comparison of these models are highlighted. The ability to evaluate software in terms of power consumption makes it feasible to search for low power implementations of given programs. In addition, it can guide the development of general tools and techniques for low power software. Several ideas in this regard as motivated by the power analysis of the subject processors are also described.

Power Aware Encoding for the Instruction Address Buses Using Program Constructs

2007

This paper examines the address traces produced by various program constructs. By using the correlation induced by the program constructs within these traces, we develop a scalable bus encoding algorithm to significantly reduce the switching activity on the instruction address bus. Simulation results for Spec2000 benchmarks show that for modest coding complexity, the proposed scheme reduces switching activity on the instruction address bus by as much as 88% and the overall bus power by as much

Power-efficient Instruction Encoding Optimization for Various Architecture Classes

Journal of Computers, 2008

A huge application domain, in particular, wireless and handheld devices strongly requires flexible and powerefficient hardware with high performance. This can only be achieved with Application Specific Instruction-Set Processors (ASIPs). A key problem is to determine the instruction encoding of the processors for achieving minimum power consumption in the instruction bus and in the instruction memory. In this paper, a framework for determining powerefficient instruction encoding in RISC and VLIW architectures is presented. We have integrated existing and novel techniques in this framework and propose novel heuristic approaches. The framework accepts an existing processor's instruction-set and a set of implementations of various applications. The output, which is an optimized instruction encoding under the constraint of a well-defined cost model, minimizes the power consumption of the instruction bus and the instruction memory. This results in strong reduction of the overall power consumption. Case studies with commercial embedded processors show the effectiveness of this framework.

Energy aware register file implementation through instruction predecode

Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors. ASAP 2003, 2003

The register file is a power-hungry device in modern architectures. Current research on compiler technology and computer architectures encourages the implementation of larger devices to feed multiple data paths and to store global variables. However, low power techniques are not able to appreciably reduce power consumption in this device without a time penalty. This paper introduces an efficient hardware approach to reduce the register file energy consumption by turning unused registers into a low power state. Bypassing the register fields of the fetch instruction to the decode stage allows the identification of registers required by the current instruction (instruction predecode) and allows the control logic to turn them back on. They are put into the low-power state after the instruction use. This technique achieves an 85% energy reduction with no performance penalty. The simplicity of the approach makes it an effective low-power technique for embedded processors.

Switching activity minimization by efficient instruction set architecture design

The 2002 45th Midwest Symposium on Circuits and Systems, 2002. MWSCAS-2002., 2002

Power consumption can be greatly minimized by reducing the bus signal transition activity (also called switching activity) in the control and data path circuit. Switching activity occurs due to the switching between two instructions (of the embedded software) on successive clock cycles. Our belief is that the binary encoding of instructions (machine code) plays a significant role in determining the amount of switching in a circuit. Thus, our aim is to realise a machine encoding of instructions of an ASIP such that for a given data path, it will minimize the average switching activity in the control path circuit of the ASIP and hence the total switching activity in the ASIP. Given the application-domain of the ASIP, we have used information theoretic techniques to arrive at an encoding of the op-code that minimizes redundancy and also the switching activity. We have compared our encoding of instruction op-codes with those obtained by other encoding techniques using a switching activity estimator designed by us.

Automatic selection of instruction op-codes of low-power core processors

IEE Proceedings - Computers and Digital Techniques, 1999

A methodology is presented for automatically detennining an assignment of instruction op-codes that guarantees the minimisation of bit transitions occurring inside the registers of the pipeline stages involved in instruction fetching and decoding. The assignment of the binary patterns to the op-codes is driven by the statistics concerning instruction adjacency collected through instruction-level simulation of typical software applications. Therefore the technique is best exploited when applied to encode the instruction set of core processors and microcontrollers, since components of these types are commonly used to execute fixed portions of machine code within embedded systems. The effectiveness of the methodology is illustrated through experimental data obtained on a realistic case study, namely, the MIPS R4000 RISC microprocessor.

Reducing and smoothing power consumption of ROM-based controller implementations

Proceedings of the 23rd symposium on Integrated circuits and system design - SBCCI '10, 2010

Interest in automated methodologies increased last decades due to the ever-growing processing complexity and time-to-market constraints. CAD tools prove their efficiency in power consumption management, which is nowadays a major constraint for embedded systems. Efficient low power techniques for Finite State Machine (FSM) design have been proposed for logic-based controllers. In this paper, we explore the circuit power consumption reduction when the FSM is mapped in ROM blocks. The described methodology achieves power reduction of ROMbased controllers through the transformation of don't care values in the decoder part of the design. This methodology allows a reduction of the number of resource commutations and smoothes them over the processing execution, limiting current spikes. Experiments show that the number of commutation can be decreased from 64% compared to an area-optimized ROM implementation.

Power optimization of core-based systems by address bus encoding

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 1998

This paper presents a solution to the problem of reducing the power dissipated by a digital system containing an intellectual proprietary core processor which repeatedly executes a special-purpose program. The proposed method relies on a novel, application-dependent low-power address bus encoding scheme. The analysis of the execution traces of a given program allows an accurate computation of the correlations that may exist between blocks of bits in consecutive patterns; this information can be successfully exploited to determine an encoding which sensibly reduces the bus transition activity. Experimental results, obtained on a set of special-purpose applications, are very satisfactory; reductions of the bus activity up to 64.8% (41.8% on average) have been achieved over the original address streams. In addition, data concerning the quality and the performance of the automatically synthesized encoding/decoding circuits, as well as the results obtained for a realistic core-based design, indicate the practical usefulness of the proposed power optimization strategy.

Power efficient instruction caches for embedded systems

2005

Instruction caches typically consume 27% of the total power in modern high-end embedded systems. We propose a compiler-managed instruction store architecture (K-store) that places the computation intensive loops in a scratchpad like SRAM memory and allocates the remaining instructions to a regular instruction cache. At runtime, execution is switched dynamically between the instructions in the traditional instruction cache and the ones in the K-store, by inserting jump instructions. The necessary jump instructions add 0.038% on an average to the total dynamic instruction count. We compare the performance and energy consumption of our K-store with that of a conventional instruction cache of equal size. When used in lieu of a 8KB, 4-way associative instruction cache, K-store provides 32% reduction in energy and 7% reduction in execution time. Unlike loop caches, K-store maps the frequent code in a reserved address space and hence, it can switch between the kernel memory and the instruction cache without any noticeable performance penalty.