Energy efficiency evaluation of multi-level parallelism on low power processors (original) (raw)
Related papers
Characterizing the Performance-Energy Tradeoff of Small ARM Cores in HPC Computation
Lecture Notes in Computer Science, 2014
Deploying large numbers of small, low-power cores has been gaining traction recently as a system design strategy in high performance computing (HPC). The ARM platform that dominates the embedded and mobile computing segments is now being considered as an alternative to high-end x86 processors that largely dominate HPC because peak performance per watt may be substantially improved using off-the-shelf commodity processors. In this work we methodically characterize the performance and energy of HPC computations drawn from a number of problem domains on current ARM and x86 processors. Unsurprisingly, we find that the performance, energy and energy-delay product of applications running on these platforms varies significantly across problem types and inputs. Using static program analysis we further show that this variation can be explained largely in terms of the capabilities of two processor subsystems: single instruction multiple data (SIMD)/floating point and the cache/memory hierarchy; and that static analysis of this kind is sufficient to predict which platform is best for a particular application/input pair. In the context of these findings, we evaluate how some of the key architectural changes being made for upcoming 64-bit ARM platforms may impact HPC application performance.
Energy aware execution environments and algorithms on low power multi-core architectures
2016
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.Energy consumption is a key aspect that conditions the proper functioning of nowadays data centers and high performance computing just like the launch of new services, due to its environmental negative impact and the increasing economic costs of energy. The energy efficiency of the applications used in these data centers could be improved, especially when systems’ utilization rate is low or moderate, or when targeting memory bounded applications. In this sense, energy proportionality stands for systems which power consumption is in line with the amount of work performed in each moment. As a response to these needs, the main objective of this project is to study, design, develop and analyze experimental solutions (models, programs, tools and techniques) aware of energy proportionality for scientific and engineering applications on low-power archi...
Proceedings of the 9th conference on Computing Frontiers, 2012
This paper studies the overall system power variations of two multi-core architectures, an 8-core Intel and a 32-core AMD workstation, while using these machines to execute a wide variety of sequential and multi-threaded benchmarks using varying compiler optimization settings and runtime configurations. Our extensive experimental study provides insights for answering two questions: 1) what degrees of impact can application level optimizations have on reducing the overall system power consumption of modern CMP architectures; and 2) what strategies can compilers and application developers adopt to achieve a balanced performance and power efficiency for applications from a variety of science and embedded systems domains.
Experiences with Mobile Processors for Energy Efficient HPC
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, 2013
The performance of High Performance Computing (HPC) systems is already limited by their power consumption. The majority of top HPC systems today are built from commodity server components that were designed for maximizing the compute performance. The Mont-Blanc project aims at using low-power parts from the mobile domain for HPC. In this paper we present our first experiences with the use of mobile processors and accelerators for the HPC domain based on the research that was performed in the project. We show initial evaluation of NVIDIA Tegra 2 and Tegra 3 mobile SoCs and the NVIDIA Quadro 1000M GPU with a set of HPC microbenchmarks to evaluate their potential for energy-efficient HPC.
On understanding the energy consumption of ARM-based multicore servers
ACM SIGMETRICS Performance Evaluation Review, 2013
There is growing interest to replace traditional servers with low-power multicore systems such as ARM Cortex-A9. However, such systems are typically provisioned for mobile applications that have lower memory and I/O requirements than server application. Thus, the impact and extent of the imbalance between application and system resources in exploiting energy efficient execution of server workloads is unclear. This paper proposes a trace-driven analytical model for understanding the energy performance of server workloads on ARM Cortex-A9 multicore systems. Key to our approach is the modeling of the degrees of CPU core, memory and I/O resource overlap, and in estimating the number of cores and clock frequency that optimizes energy performance without compromising execution time. Since energy usage is the product of utilized power and execution time, the model first estimates the execution time of a program. CPU time, which accounts for both cores and memory response time, is modeled a...
2017
The High Performance Computing (HPC) community recognizes energy consumption as a major problem. Extensive research is underway to identify means to increase energy efficiency of HPC systems including consideration of alternative building blocks for future systems. This thesis considers one such system, the Texas Instruments Keystone II, a heterogeneous Low-Power System-on-Chip (LPSoC) processor that combines a quad core ARM CPU with an octa-core Digital Signal Processor (DSP). It was first released in 2012. Four issues are considered: i) maximizing the Keystone II ARM CPU performance; ii) implementation and extension of the OpenMP programming model for the Keystone II; iii) simultaneous use of ARM and DSP cores across multiple Keystone SoCs; and iv) an energy model for applications running on LPSoCs like the Keystone II and heterogeneous systems in general. Maximizing the performance of the ARM CPU on the Keystone II system is fundamental to adoption of this system by the HPC commu...
Evaluation of Low-Power Computing when Operating on Subsets of Multicore Processors
Journal of Signal Processing Systems, 2013
Given the accelerated growth in tablet devices, smartphones, and netbooks, designers are faced with serious challenges to meet the needs of mobility in terms of battery life and form factor. It is vital to investigate how to deliver the best mobile experience to users while ensuring adequate levels of performance. In this paper, we present a power management evaluation of multi-core processor systems by comparing thermal power, battery life, and performance when running different types of workloads under a limited number of cores. To show the potential gains from a system power management perspective, we have assessed a mobile platform featuring the Second Generation Intel Core i5 processor, and tested it on a wide selection of workloads and benchmarks. Experimental results show significant thermal power reduction (up to 40 %) in a variety of scenarios, while system performance was sustained in most cases but sacrificed in a few other uncommon situations.
Energy consumption model over parallel programs implemented on multicore architectures
International Journal of Advanced Computer Science and Applications, 2015
In High Performance Computing, energy consumption is becoming an important aspect to consider. Due to the high costs that represent energy production in all countries it holds an important role and it seek to find ways to save energy. It is reflected in some efforts to reduce the energy requirements of hardware components and applications. Some options have been appearing in order to scale down energy use and, consequently, scale up energy efficiency. One of these strategies is the multithread programming paradigm, whose purpose is to produce parallel programs able to use the full amount of computing resources available in a microprocessor. That energy saving strategy focuses on efficient use of multicore processors that are found in various computing devices, like mobile devices. Actually, as a growing trend, multicore processors are found as part of various specific purpose computers since 2003, from High Performance Computing servers to mobile devices. However, it is not clear how multiprogramming affects energy efficiency. This paper presents an analysis of different types of multicorebased architectures used in computing, and then a valid model is presented. Based on Amdahl's Law, a model that considers different scenarios of energy use in multicore architectures it is proposed. Some interesting results were found from experiments with the developed algorithm, that it was execute of a parallel and sequential way. A lower limit of energy consumption was found in a type of multicore architecture and this behavior was observed experimentally.
Performance Analysis of HPC Applications on Low-Power Embedded Platforms
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, 2013
This paper presents performance evaluation and analysis of well-known HPC applications and benchmarks running on low-power embedded platforms. The performance to power consumption ratios are compared to classical x86 systems. Scalability studies have been conducted on the Mont-Blanc Tibidabo cluster. We have also investigated optimization opportunities and pitfalls induced by the use of these new platforms, and proposed optimization strategies based on auto-tuning.
Improving the energy efficiency of software systems for multi-core architectures
The ICT has an huge impact on the world CO2 emissions and recent study estimates its account to 2% of these emissions. This growing account emissions makes IT energy efficiency an important challenge. State-of-the-art has proven that the processor is the main power consumer. Processor are nowadays more and more complex and they are used in many hardware systems, such as computers or smartphones. This thesis is thus focusing on the software energy efficiency for multi-core systems. In this paper, we therefore report our motivations to understand deeply their architectures for improving their energy efficiencies. Manufacturers have worked tremendously to improve performance and reduce power consumption of their processors. However a lot of things remains to do in the software side. We claim that energy-efficient softwares can play a deterministic role to reduce the IT carbon footprint. To answer this challenge, we are believing on the software-metric approach with a minimal hardware investment. For this purpose, an efficient, scalable and non-invasive tool is needed. As a result, we created PowerAPI, to provide fine-grained power estimations at process and code-level for optimizing the software enegergy efficiency automatically. This solution will help to identify clearly the energy leaks for optimizing automatically the power consumed by software.