EVC-Based Power Gating Approach to Achieve Low-Power and High Performance NoC
Related papers
A Dynamic Bypass Approach to Realize Power Efficient Network-on-Chip
2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019
High power consumption becomes the major bottleneck that prevents applying Network-on-Chips (NoCs) on future many-core systems. Power gating is an effective way to reduce the power consumption of a NoC. However, conventional power gating approaches cause significant packet latency increase as well as additional power consumption overhead due to the power gating mechanism. One comprehensive way to reduce these negative impacts is to bypass the powered-off routers in a NoC to transfer packets. Therefore, in this paper, we propose a dynamic bypass (D-bypass) approach, which is based on a reservation mechanism to allow different upstream routers to forward packets through the same powered-off router at different times. With this feature, our D-bypass power gating approach overcomes the drawbacks in related power gating approaches. Compared with a conventional NoC without power gating, our D-bypass approach causes only 2.55% performance penalty, which is less than 28.67%, 19.26%, 7.24%, and 6.69% penalties in related approaches. With small hardware overhead, our approach just consumes on average 22.23% of total power consumption in a NoC, which is slightly better compared to 27.06%, 23.89%, 26.45%, and 24.70% total power consumption in related approaches.
Power Punch: Towards Non-blocking Power-gating of NoC Routers
As chip designs penetrate further into the dark silicon era, innovative techniques are much needed to power off idle or under-utilized system components while having minimal impact on performance. On-chip network routers are potentially good targets for power-gating, but packets in the network can be significantly delayed as their paths may be blocked by powered-off routers. In this paper, we propose Power Punch, a novel performance-aware, power reduction scheme that aims to achieve non-blocking power-gating of on-chip network routers. Two mechanisms are proposed that not only allow power control signals to utilize existing slack at source nodes to wake up powered-off routers along the first few hops before packets are injected, but also allow these signals to utilize hop count slack by staying ahead of packets to "punch through" any blocked routers along the imminent path of packets, preventing packets from having to suffer router wakeup latency or packet detour latency. Full system evaluation on PARSEC benchmarks shows Power Punch saves more than 83% of router static energy while having an execution time penalty of less than 0.4%, effectively achieving near non-blocking power-gating of on-chip network routers.
SMART: A scalable mapping and routing technique for power-gating in NoC routers
2017 Eleventh IEEE/ACM International Symposium on Networks-on-Chip (NOCS), 2017
Shrinking the technology node drastically increases leakage power in Network-on-Chip (NoC) routers. Power-gating, particularly in NoC routers, is one of the most efficient approaches for reducing leakage power. Although power-gating techniques lower NoC power consumption thanks to the high proportion of idle time in NoC routers, the timing behavior of packets is irregular, so even at low injection rates the performance overhead in power-gated routers is significant. In this paper, we present SMART, a Scalable Mapping And Routing Technique, with virtually no area overhead on the network. It makes the timing behavior of packets more regular in order to mitigate leakage power and lighten the imposed performance overhead. SMART employs a special deterministic routing algorithm, which reduces the number of packets that encounter power-gated routers. It establishes a dedicated path between each source-destination pair to maximize the use of powered-on routers, which roughly...
A survey on energy-efficient methodologies and architectures of network-on-chip
Computers & Electrical Engineering, 2014
Integration of large number of electronic components on a single chip has resulted in complete and complex systems on a single chip. The energy efficiency in the System-on-Chip (SoC) and its communication subset, the Network-on-Chip (NoC), is a key challenge, due to the fact that these systems are typically battery-powered. We present a survey that provides a broad picture of the state-of-the-art energy-efficient NoC architectures and techniques, such as the routing algorithms, buffered and bufferless router architectures, fault tolerance, switching techniques, voltage islands, and voltage-frequency scaling. The objective of the survey is to educate the readers with the latest design-improvements that are carried out in reducing the power consumption in the NoCs.
A survey of low power techniques for efficient Network-on-Chip design
2016
Power consumption continues to be a challenge for designers as the complexity of NoC increases. The scaling down of technology towards the deep nanometer era will only cause an increase in the amount of power NoC components will consume. Therefore, low power design solution is one of the essential requirements of future NoC-based System-on-Chip (SoC) applications. Several techniques have been proposed over the years to improve the performance of the NoCs, trading-off power efficiency; particularly power hungry elements in NoC routers. Power dissipation can be reduced by optimizing the router elements, applying architecture saving techniques and communication links. This paper presents recent contributions and efficient saving techniques at the router, NoC architecture and Communication link level.
An Improvised Bottleneck Routing Algorithm for virtual Buffer based energy efficient NoC's
2017
Network-on-Chip (NoC) technology will be a major factor in future communication systems based on intra-core systems. However, practical NoC systems consume considerable power, and the crossbar router architecture grows directly with the number of intra-core systems. In large systems, the average power consumption of the crossbar switches is relatively high. Examining the router components, we found that the buffers at the input terminals account for a major share of the power consumption; removing the buffers from NoC routers, however, reduces overall performance because bottlenecks increase. To make NoC crossbars efficient without degrading performance, we propose a systematic approach for the crossbar switches that incorporates a virtual memory system as a substitute for conventional register-based memory, which leads to energy efficiency together with an increase in overall performance. We are going to...
A Power and Energy Exploration of Network-on-Chip Architectures
In this study, we analyse the move towards Networks-on-Chips from an energy perspective by accurately modelling a Circuit-Switched router, a Wormhole router and a speculative Virtual-Channel router in a 90nm CMOS process. All the routers are shown to dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data-path. This leads to the key result that, if this trend continues, the energy cost of more elaborate control will not be vast, making it easier to justify. Given effective clock gating, this additional energy is also shown to be more or less independent of network congestion. Accurate speed and area metrics are also reported for the networks, which will allow a more complete comparison to be made across the NoC architectural space considered.
A high level power model for Network-on-Chip (NoC) router
This paper presents a high level power estimation methodology for a Network-on-Chip (NoC) router that is capable of providing a cycle accurate power profile to enable power exploration at system level. Our power macro model is based on the number of flits passing through a router as the unit of abstraction. Experimental results show that our power macro model incurs less than 5% average absolute cycle error compared to gate level analysis. The high level power macro model allows network power to be readily incorporated into simulation infrastructures, providing a fast and cycle accurate power profile, to enable power optimization such as power-aware compiler, core mapping, and scheduling techniques for CMP. As a case study, we demonstrate the use of our model for evaluating the effect of different core mappings using the SPLASH-2 benchmark, showing the utility of our power macro model.
Boomerang: Reducing Power Consumption of Response Packets in NoCs with Minimal Performance Impact
IEEE Computer Architecture Letters, 2010
Most power reduction mechanisms for NoC channel buffers rely on on-demand wakeup to transition from a low-power state to the active state. Two drawbacks of on-demand wakeup limit its effectiveness: 1) performance impact caused by wakeup delays, and 2) energy and area cost of sleep circuitry itself. What makes the problem harder to solve is that solutions to either problem tend to exacerbate the other. For example, faster wakeup from a power-gated state requires greater charge/discharge current for the sleep transistors while using nimbler sleep transistors implies long wakeup delays. As a result, powerdowns have to be conservatively prescribed, missing many power-saving opportunities.
… (HOTI), 2011 IEEE 19th …, 2011
Manycore systems require energy-efficient on-chip networks that provide high throughput and low latency. The performance of these on-chip networks affects cache access latency and, consequently, system performance. This paper proposes solutions to address the performance limitations related to the use of snoop-based cache coherence protocol on switched network-on-chip (NoC). We propose a new network flow control technique, Express Virtual Channel with Taps (EVC-T), for transmitting both broadcast packets and data packets efficiently. In addition, we propose a low-latency broadcast packet notification tree network that maintains the order of broadcast packets on an unordered NoC. We evaluate our technique using both synthetic traffic and parallel benchmark suites through detailed system simulation. EVC-T reduces the average network latency by 24% with a negligible change in power for synthetic benchmarks. For NAS parallel applications, EVC-T increases the instructions per cycle (IPC) by 9% on average with minimal increase in power. Our technique reduces the energy-delay product (EDP) by 13% on average across all benchmarks.
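The energy-delay product (EDP) quoted in the abstract above is conventionally the product of the energy consumed and the time taken, so a lower value is better. The following is a minimal sketch of how a relative EDP reduction could be computed. The function names and the input numbers are hypothetical, chosen only for illustration, and are not taken from any of the listed papers.

```python
def energy_delay_product(energy: float, delay: float) -> float:
    """Return the energy-delay product (energy multiplied by delay)."""
    if energy < 0 or delay < 0:
        raise ValueError("energy and delay must be non-negative")
    return energy * delay


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fractional reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical measurements in arbitrary units (assumed, not from the source).
baseline_edp = energy_delay_product(energy=10.0, delay=2.0)
improved_edp = energy_delay_product(energy=10.0, delay=1.74)

reduction = relative_reduction(baseline_edp, improved_edp)
print(f"EDP reduction: {reduction:.1%}")
```

Because EDP multiplies the two quantities, a technique that shortens delay at roughly constant energy lowers EDP by about the same fraction as the delay improvement.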