Rajaraman Ramanarayanan - Academia.edu
Papers by Rajaraman Ramanarayanan
2012 IEEE Hot Chips 24 Symposium (HCS), 2012
Proceedings of the 20th International Conference on VLSI Design Held Jointly with 6th International Conference on Embedded Systems, 2007
Increasing use of on-chip networks as communication infrastructure in both high-performance and low-end computing makes it important to consider their power consumption. Several previously proposed approaches to power management in the context of NoCs (networks-on-chip) are either purely hardware-based or focus exclusively on a single-application execution scenario. This paper makes two major contributions. First, it proposes a software-based proactive on-chip network power management scheme that operates under a given process scheduler. Second, it presents a power-aware process scheduling strategy, with the goal of maximizing power savings when multiple applications are in the system. The paper also evaluates the proposed schemes under different execution scenarios in the context of NoCs based on a two-dimensional mesh topology and compares them to each other as well as to a previously proposed hardware-based network power management scheme. Our experimental evaluation using six data-intensive applications shows that the proposed software-based approach is competitive with the hardware-based scheme. We also found that power-aware scheduling brings significant energy savings.
IEEE International [Systems-on-Chip] SOC Conference, 2003. Proceedings., 2000
Abstract: Soft errors are gaining importance as technology scales. Flip-flops, an important component of pipelined architectures, are becoming more susceptible to soft errors. This work analyzes soft error rates on a variety of flip-flops. The analysis was performed by implementing and simulating the various designs in 70 nm, 1 V CMOS technology. First, we evaluate the critical charge for the susceptible nodes in each design. Further, we implement two hardening techniques and present the results. One attempts to increase the critical charge by increasing the gate capacitance, while the other improves the overall robustness of the circuit by replicating the master stage of the master-slave flip-flops, which leads to reduced power and area overhead. 0-7803-8182-3/03/$17.00 ©2003 IEEE
SCS 2003. International Symposium on Signals, Circuits and Systems. Proceedings (Cat. No.03EX720), 2000
Due to technology scaling, smaller devices and lower operating voltages, next-generation circuits are highly susceptible to soft errors. Another important problem confronting silicon scaling is static power consumption. In this paper, we analyze the effect of increasing threshold voltage (widely used for reducing static power consumption) on the soft error rate (SER). We find that increasing threshold voltage improves SER of transmission-gate-based flip-flops, but can adversely affect the robustness of combinational logic due to the effect of higher threshold voltages on the attenuation of transient pulses. We also show that clever use of high Vt can improve the robustness of 6T-SRAMs.
15th Annual IEEE International ASIC/SOC Conference, 2002
... Thus, when flip-flops are not used, it is necessary not only to gate their clock for reducing dynamic power but also to set their inputs to a low-leakage state. ... The setup time again is 0 while the hold time is 0.1 ns. But the worst-case propagation delay is around 0.98 ns. ...
20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems (VLSID'07), 2007
Increasing use of on-chip networks as communication infrastructure in both high-performance and low-end computing makes it important to consider their power consumption. Several previously proposed approaches to power management in the context of NoCs (networks-on-chip) are either purely hardware-based or focus exclusively on a single-application execution scenario. This paper makes two major contributions. First, it proposes a software-based proactive on-chip network power management scheme that operates under a given process scheduler. Second, it presents a power-aware process scheduling strategy, with the goal of maximizing power savings when multiple applications are in the system. The paper also evaluates the proposed schemes under different execution scenarios in the context of NoCs based on a two-dimensional mesh topology and compares them to each other as well as to a previously proposed hardware-based network power management scheme. Our experimental evaluation using six data-intensive applications shows that the proposed software-based approach is competitive with the hardware-based scheme. We also found that power-aware scheduling brings significant energy savings.
IFIP International Federation for Information Processing, 2006
Due to technology scaling, devices are getting smaller and faster and are operating at lower voltages. The reduced nodal capacitances and supply voltages, coupled with denser and larger chips, are increasing soft errors and making them an important design constraint. As designers aggressively address excessive power consumption, which is considered a major design limiter, they need to be aware of the impact of power optimizations on the soft error rate (SER). In this chapter, we analyze the effect on the soft error rate of increasing the threshold voltage and reducing the operating voltage, two techniques widely used for reducing power consumption. While reducing the operating voltage increases the susceptibility to soft errors, increasing the threshold voltage offers mixed results. We find that increasing threshold voltage (Vt) improves SER of transmission-gate-based flip-flops, but can adversely affect the robustness of combinational logic due to the effect of higher threshold voltages on the attenuation of transient pulses. We also show that, in certain circuits, clever use of high Vt can improve the robustness to soft errors.
21st International Conference on VLSI Design (VLSID 2008), 2008
This paper describes a unified PopCount/BitScanForward/BitScanReverse datapath circuit designed for 2.1GHz operation with total power consumption of 6.5mW, targeted for 65nm 64-bit microprocessor execution cores. The unified datapath uses a hybrid 3:2 compressor-based Wallace tree to count the number of '1's in the 64-bit input, along with a novel encoding scheme that enables reuse of the same tree to identify the bit-location of the first set bit when scanning the input in the forward and reverse directions. This circuit thus combines the functions of 3 separate units, enabling 26% reduction in total energy and 20% lower area, while achieving single-cycle latency and throughput.
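As a purely functional illustration of the three operations named in this abstract, the following Python sketch shows one well-known software identity by which a single population count can also yield both bit-scan results. It is not the paper's encoding scheme or circuit; the function names, the `WIDTH` constant, the choice to return `None` for a zero input, and the x86-style convention that "forward" means scanning from the least-significant bit are all assumptions made for this example.

```python
WIDTH = 64                      # assumed operand width, matching the 64-bit input
MASK = (1 << WIDTH) - 1


def popcount(x: int) -> int:
    """Count the '1' bits in the low WIDTH bits of x."""
    return bin(x & MASK).count("1")


def bit_scan_forward(x: int):
    """Index of the lowest set bit, computed with popcount (None if x == 0)."""
    x &= MASK
    if x == 0:
        return None
    lowest = x & -x             # isolate the lowest set bit
    return popcount(lowest - 1) # the ones below it number exactly its index


def bit_scan_reverse(x: int):
    """Index of the highest set bit, computed with popcount (None if x == 0)."""
    x &= MASK
    if x == 0:
        return None
    shift = 1
    while shift < WIDTH:        # smear the highest set bit into all lower bits
        x |= x >> shift
        shift <<= 1
    return popcount(x) - 1      # the smeared value has (index + 1) ones
```

For example, `bit_scan_forward(0b101000)` and `bit_scan_reverse(0b101000)` both reduce to a single `popcount` call on a transformed input, which is the software analogue of sharing one counting structure across the three functions.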
2010 Symposium on VLSI Circuits, 2010
An all-digital True Random Number Generator is fabricated in 45nm CMOS with 2.4Gbps random bit throughput and total power consumption of 7mW. Two-step coarse/fine-grained tuning with a self-calibrating feedback loop enables robust operation in the presence of ...
2012 IEEE International Solid-State Circuits Conference, 2012
Near-threshold computing brings the promise of an order of magnitude improvement in energy efficiency over the current generation of microprocessors [1]. However, frequency degradation due to aggressive voltage scaling may not be acceptable across all single-threaded or performance-constrained applications. Enabling the processor to operate over a wide voltage range helps to achieve best possible energy efficiency while satisfying varying performance
2010 Proceedings of ESSCIRC, 2010
A multi-mode Secure Hashing Algorithm (SHA) accelerator is fabricated in 45nm CMOS and occupies 0.0625mm² with 18Gbps throughput and total power consumption of 50mW. The reconfigurable hardware accelerator computes SHA-1/224/256/384/512 message digests using unified SHA bit-slices and configurable compression circuits, resulting in 40% area reduction and <3% performance overhead for reconfiguration, with 23Gbps peak throughput in SHA-224/256 modes. SHA frequency ranges from 21MHz to 1.8GHz across a 320mV to 1.35V supply voltage range.
2013 IEEE 21st Symposium on Computer Arithmetic, 2013
IEEE Transactions on Dependable and Secure Computing, 2000
Radiation-induced soft errors in combinational logic are expected to become as important as directly induced errors on state elements. Consequently, it has become important to develop techniques to quickly and accurately predict soft-error rates (SERs) in combinational circuits. In this work, we present methodologies to model soft errors in both the device and logic levels. At the device level, a
Chip multiprocessors are becoming increasingly popular in the embedded domain since they have important advantages over their single-core counterparts from the parallelism, power efficiency, validation, and verification perspectives. However, extracting maximum performance from these multiprocessors requires compiler support in the form of effective code parallelization. The goal of this paper is to present and experimentally evaluate a locality-aware dynamic loop scheduling strategy that implements both locality-aware loop iteration distribution across parallel processors and dynamic load balancing at runtime. This hybrid scheme has been implemented and tested along with four other previously proposed loop scheduling schemes, including a locality-aware one. Our experimental analysis reveals that the proposed approach generates better results than all other scheduling schemes (static or dynamic) tested. Our results also show that the improvements brought by the proposed scheduling scheme are consistent across experiments with different values of our major simulation parameters, such as the number of processors and cache size per processor.
2010 IEEE International Solid-State Circuits Conference - (ISSCC), 2010
... CMOS Amit Agarwal, Sanu K Mathew, Steven K Hsu, Mark A Anders, Himanshu Kaul, Farhana Sheikh, Rajaraman Ramanarayanan, Suresh Srinivasan, Ram Krishnamurthy, Shekhar Borkar Intel, Hillsboro, OR Computationally ...