A Bayesian network approach for compiler auto-tuning for embedded processors

Portable compiler optimisation across embedded programs and microarchitectures using machine learning

Building an optimising compiler is a difficult and time-consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases of the compiler to effectively exploit a processor's performance potential, by which time a new generation has appeared and the process starts again.

BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

arXiv (Cornell University), 2022

We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. In particular, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these parameter types and efficiently deliver high-quality code, BaCO uses Bayesian optimization algorithms specialized towards the autotuning domain. We demonstrate BaCO's effectiveness on three modern compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA for CPUs, GPUs, and FPGAs respectively. For these domains, BaCO outperforms current state-of-the-art autotuners by delivering on average 1.36×-1.56× faster code with a tiny search budget, and BaCO is able to reach expert-level performance 2.9×-3.9× faster.
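The mixed parameter space described in this abstract can be pictured with a small sketch. This is not BaCO's API and it does not perform Bayesian optimization: it only shows, under stated assumptions, how ordered, continuous, and permutation parameters with a known constraint might be represented and sampled. Every name here (`Ordered`, `Continuous`, `Permutation`, `SPACE`, `known_constraint`) is hypothetical, and unknown constraints (those only discovered when a configuration fails) are not modelled.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Ordered:
    values: tuple  # discrete values with a meaningful order, e.g. tile sizes


@dataclass(frozen=True)
class Continuous:
    low: float
    high: float


@dataclass(frozen=True)
class Permutation:
    items: tuple  # e.g. loop names whose order is being tuned


# Hypothetical search space; parameter names and ranges are illustrative only.
SPACE = {
    "tile": Ordered((8, 16, 32, 64)),
    "unroll": Ordered((1, 2, 4, 8)),
    "threshold": Continuous(0.0, 1.0),
    "loop_order": Permutation(("i", "j", "k")),
}


def known_constraint(cfg):
    """Hypothetical known constraint: the unrolled tile must stay small."""
    return cfg["tile"] * cfg["unroll"] <= 128


def sample(space, rng):
    """Draw one configuration uniformly, ignoring constraints."""
    cfg = {}
    for name, param in space.items():
        if isinstance(param, Ordered):
            cfg[name] = rng.choice(param.values)
        elif isinstance(param, Continuous):
            cfg[name] = rng.uniform(param.low, param.high)
        else:  # Permutation: a random ordering of all items
            cfg[name] = tuple(rng.sample(param.items, len(param.items)))
    return cfg


def sample_feasible(space, constraint, rng, max_tries=1000):
    """Rejection sampling: redraw until the known constraint holds."""
    for _ in range(max_tries):
        cfg = sample(space, rng)
        if constraint(cfg):
            return cfg
    raise RuntimeError("no feasible configuration found")


rng = random.Random(0)
configs = [sample_feasible(SPACE, known_constraint, rng) for _ in range(20)]
```

A model-based tuner would replace the uniform draws with proposals from a surrogate model; the representation of the space is the part this sketch is meant to show.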

An Evaluation of Autotuning Techniques for the Compiler Optimization Problems

2016

The diversity of today’s architectures has forced programmers and compiler researchers to port their applications across many different platforms. Compiler auto-tuning plays a major role in that process, because the problem is complex enough that the standard optimization levels, which are tuned for average-case performance, fail to deliver the best results. To address the problem, different optimization techniques have been used for traversing and pruning the huge search space, and for adaptability and portability. In this paper, we evaluate our different autotuning approaches, including the use of Design Space Exploration (DSE) techniques and machine learning, to tackle both the selection and the phase-ordering of compiler optimizations. It has been experimentally demonstrated that using these techniques has positive effects on the performance metrics of the applications under analysis and can bring up to 60% performance improvement with respect to standard optimization levels...

Automatic Tuning of Compilers Using Machine Learning

SpringerBriefs in Applied Sciences and Technology

Compiler Optimization using Machine Learning Techniques

2019

Compiler optimization is essential for increasing the running speed of programs and for minimizing object file size. Machine learning algorithms are used to select the compiler options that yield such improvements. Compiler auto-tuning refers to the process of optimizing the performance of the code during the intermediate code-generation phase of compilation. This paper deals with machine-learning-based compiler optimization, covering feature processing, compiler auto-tuning, and compiler optimization techniques such as loop nest optimizations and the automatic generation of optimization heuristics for a target processor by machine learning. It then explores the concept of evolving iterative compilers, which attempt a large number of optimization strategies and choose the best one. It proposes an approach to selecting compiler transformations, namely probabilistic optimization. Using this approach, we can achieve significant performance improvements.

A Mixed Method of Parallel Software Auto-Tuning Using Statistical Modeling and Machine Learning

2018

A mixed method that combines formal and auto-tuning approaches and is aimed at maximizing the efficiency of parallel programs (in terms of execution time) is proposed. The formal approach is based on algorithmic algebra and the use of tools for automated design and synthesis of programs based on high-level algorithm specifications (schemes). Parallel software auto-tuning is the method of adjusting some structural parameters of a program to a target hardware platform to speed up computation as much as possible. Previously, we have developed a framework intended to automate the generation of an auto-tuner from a program's source code. However, auto-tuning for complex and nontrivial parallel systems is usually time-consuming due to the empirical evaluation of a huge number of parameter value combinations of an initial parallel program in a target environment. In this paper, we extend our approach with statistical modeling and neural network algorithms that make it possible to significantly reduce the space of po...

Automatic selection of compiler options using genetic techniques for embedded software design

2013 IEEE 14th International Symposium on Computational Intelligence and Informatics (CINTI), Budapest, Hungary, ISBN: 978-1-4799-0194-4, 2013

ROM size and CPU load are critical resources in the design of embedded software, so it is necessary to produce software that meets specific ROM and CPU load requirements. Compiler options play a major role in optimizing the code size and CPU load of the software. Selecting the best set of compiler options that provides the required code size and CPU load is challenging because of the wide range of options provided by modern compilers. In this paper we present a new technique, based on genetic techniques, that enables designers to automatically select the set of compiler options that best matches their design requirements. We have also added a new genetic operator, called the pass-over operator, to enhance chromosome selection for the next generation.
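A minimal sketch of genetic selection over compiler option sets follows, assuming a toy cost model instead of real compilation and measurement. Each chromosome is a bit vector in which bit i enables option i. The option names, the ROM and CPU-load deltas, and the limits are all invented for illustration. The paper's pass-over operator is not described in this abstract, so the sketch uses only standard operators (tournament selection, one-point crossover, bit-flip mutation, elitism).

```python
import random

# Hypothetical options: (ROM delta in KiB, CPU-load delta in percentage points).
OPTIONS = {
    "opt_a": (+6, -9),
    "opt_b": (+3, -4),
    "opt_c": (-5, +2),
    "opt_d": (-2, -1),
    "opt_e": (+8, -3),
    "opt_f": (-4, +6),
}
NAMES = list(OPTIONS)
BASE_ROM_KIB, BASE_CPU_PCT = 64, 70    # assumed baseline build
ROM_LIMIT_KIB, CPU_LIMIT_PCT = 66, 60  # assumed design requirements


def measure(chromosome):
    """Toy stand-in for compiling and profiling with the enabled options."""
    enabled = [n for n, bit in zip(NAMES, chromosome) if bit]
    rom = BASE_ROM_KIB + sum(OPTIONS[n][0] for n in enabled)
    cpu = BASE_CPU_PCT + sum(OPTIONS[n][1] for n in enabled)
    return rom, cpu


def fitness(chromosome):
    """Total violation of the two requirements; 0 means both are met."""
    rom, cpu = measure(chromosome)
    return max(0, rom - ROM_LIMIT_KIB) + max(0, cpu - CPU_LIMIT_PCT)


def evolve(generations=40, pop_size=12, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(NAMES)
    # Start from the baseline (no options enabled) plus random chromosomes.
    population = [tuple([0] * n)] + [
        tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: the two best survive unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability mutation_rate per bit.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)


best = evolve()
best_options = [n for n, bit in zip(NAMES, best) if bit]
```

Because the baseline is in the initial population and the elite is always kept, the returned option set is never worse than the baseline under this fitness function. In a real flow, `measure` would build the firmware and read ROM size and CPU load from the toolchain and target.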

Automatic application-specific microarchitecture reconfiguration

2006

Applications for constrained embedded systems are subject to strict time constraints and restrictive resource utilization. With soft core processors, application developers can customize the processor for their application, constrained by resources but aimed at high application performance. With such freedom in the design space of the processor, however, comes complexity. We present here an automatic optimization technique that helps developers with the processor microarchitecture customization. A naive approach exploring all possible configurations is exponential in the number of parameters and hence is clearly infeasible, even with only tens of reconfigurable parameters. Instead, our approach runs in time that is linear in the number of parameter values, based on an assumption of parameter independence. This makes the approach feasible and scalable. For the dimensions that we customize, namely application runtime and hardware resources, we formulate their costs as a constrained binary integer nonlinear optimization program. Though the results are not guaranteed to be optimal, we find they are near-optimal in practice. Our technique itself is general and can be applied to other design-space exploration problems.
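The gap between exponential and linear exploration can be illustrated with a small sketch. It assumes a toy additive cost model, so the independence assumption holds exactly, and it does not reproduce the paper's constrained binary integer nonlinear program. The parameter names, values, and costs are hypothetical.

```python
from itertools import product

# Hypothetical soft-core parameters and candidate values.
SPACE = {
    "icache_kb": [1, 2, 4, 8],
    "dcache_kb": [1, 2, 4, 8],
    "multiplier": ["none", "iterative", "single_cycle"],
}

CACHE_COST = {1: 9.0, 2: 6.0, 4: 4.0, 8: 5.0}
MUL_COST = {"none": 7.0, "iterative": 3.0, "single_cycle": 4.0}


def toy_cost(config):
    """Stand-in cost (lower is better); additive, so parameters are independent."""
    return (
        CACHE_COST[config["icache_kb"]]
        + CACHE_COST[config["dcache_kb"]]
        + MUL_COST[config["multiplier"]]
    )


def one_at_a_time(space, cost, baseline):
    """Vary each parameter alone from the baseline: sum(len(values)) evaluations."""
    best = dict(baseline)
    evaluations = 0
    for name, values in space.items():
        scores = {}
        for value in values:
            trial = dict(baseline)
            trial[name] = value
            scores[value] = cost(trial)
            evaluations += 1
        best[name] = min(scores, key=scores.get)
    return best, evaluations


def exhaustive(space, cost):
    """Evaluate every combination: prod(len(values)) evaluations."""
    names = list(space)
    best, best_cost, evaluations = None, float("inf"), 0
    for combo in product(*space.values()):
        config = dict(zip(names, combo))
        current = cost(config)
        evaluations += 1
        if current < best_cost:
            best, best_cost = config, current
    return best, evaluations


baseline = {name: values[0] for name, values in SPACE.items()}
linear_best, linear_evals = one_at_a_time(SPACE, toy_cost, baseline)
full_best, full_evals = exhaustive(SPACE, toy_cost)
print(linear_evals, full_evals)  # 11 48
```

Here the per-parameter search needs 4 + 4 + 3 = 11 evaluations against 4 × 4 × 3 = 48 for the full sweep, and both reach the same configuration because the toy cost is additive. When parameters interact, the per-parameter result is only an approximation, which matches the abstract's remark that optimality is not guaranteed.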

Automatic Tuning of Compiler Optimizations and Analysis of their Impact

Procedia Computer Science, 2013

Modern compilers can work on many platforms and implement many optimizations, which are not always tuned well for every target platform. In this paper we present the Tool for Automatic Compiler Tuning (TACT), which helps to identify such underperforming compiler optimizations. Using GCC for ARM, we show how this tool can be used to improve the performance of several popular applications, and how the results can be further analyzed to find places for improvement in the GCC compiler itself.

The use of compiler optimizations for embedded systems software

Crossroads, 2008

Optimizing embedded applications using a compiler can generally be broken down into two major categories: hand-optimizing code to take advantage of a particular processor's compiler and applying built-in optimization options to proven and well-polished code. The former is well documented for different processors, but little has been done to find generalized methods for optimal sets of compiler options based on common goal criteria such as application code size, execution speed, power consumption, and build time. This article discusses the fundamental differences between these two general categories of optimizations using the compiler. Examples of common, built-in compiler options are presented using a simulated ARM processor and C compiler, along with a simple methodology that can be applied to any embedded compiler for finding an optimal set of compiler options.