An approach for code generation in the Sparse Polyhedral Framework

Enabling Code Generation within the Sparse Polyhedral Framework

2010

Loop transformation frameworks based on the polyhedral model use libraries such as PolyLib, ISL, and Omega to represent and manipulate polyhedra, and use tools like CLooG to generate loops that scan the modified polyhedra. Most of these libraries are restricted to iteration-space sets and memory/array access functions with affine constraints, which precludes the specification of run-time reordering transformations (i.e., inspector/executor strategies) within the existing code generation tools. Automatic generation of inspector and executor code is important for parallelization and data-locality improvement in irregular computations such as those that manipulate sparse data structures. We enable the specification of run-time reordering transformations at compile time in the Sparse Polyhedral Framework (SPF) by representing indirect memory references and run-time-generated data and iteration reorderings using uninterpreted function symbols. This paper presents techniques for manipulating...
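
To make the indirection concrete, consider a minimal sketch (a generic sparse matrix-vector product, not an example taken from the paper): the index arrays that drive the memory references are unknown until run time, so SPF models them as uninterpreted function symbols in its sets and relations.

    /* Hypothetical COO-format sparse matrix-vector product, y += A*x.
     * SPF would describe the loop as the set {[k] : 0 <= k < nnz} and
     * the read of x as the relation {[k] -> [j] : j = col(k)}, where
     * row() and col() are uninterpreted function symbols standing for
     * the run-time contents of the index arrays below. */
    void spmv_coo(int nnz, const int *row, const int *col,
                  const double *val, const double *x, double *y)
    {
        for (int k = 0; k < nnz; k++)
            y[row[k]] += val[k] * x[col[k]];
    }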

The Sparse Polyhedral Framework: Composing Compiler-Generated Inspector-Executor Code

Proceedings of the IEEE, 2018

Irregular applications such as big graph analysis, material simulations, molecular dynamics simulations, and finite element analysis have performance problems due to their use of sparse data structures. Inspector-executor strategies improve sparse computation performance through parallelization and data-locality optimizations. An inspector reschedules and reorders data at runtime, and an executor is a transformed version of the original computation that uses the newly reorganized schedules and data structures. Inspector-executor transformations are commonly written in a domain-specific or even application-specific fashion. Significant progress has been made in incorporating such inspector-executor transformations into existing compiler transformation frameworks, thus enabling their use alongside compile-time transformations. However, composing inspector-executor transformations in a general way has only been done in the context of the Sparse Polyhedral Framework (SPF). Though SPF enables the general composition of such transformations, the resulting inspector and executor performance suffers due to missed specialization opportunities. This paper reviews the history and current state of the art of inspector-executor strategies and shows how the SPF enables the composition of inspector-executor transformations. Further, it describes a research vision to combine this generality in SPF with specialization to achieve composable and high-performance inspectors and executors, producing a powerful compiler framework for sparse matrix computations.
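
As a sketch of the pattern (hypothetical code using a CPACK-style first-touch data reordering, not a transformation from the paper): the inspector examines the index arrays at run time, builds a permutation, and remaps data and indices; the executor is the original loop rewritten to use the remapped structures.

    #include <stdlib.h>

    /* Inspector: compute a first-touch permutation sigma of x from the
     * order in which the column indices appear, then remap both the
     * data (x -> xp) and the index array (col -> colp). */
    void inspector(int nnz, const int *col, int n, const double *x,
                   int *colp, double *xp)
    {
        int *sigma = malloc(n * sizeof(int));
        int next = 0;
        for (int j = 0; j < n; j++) sigma[j] = -1;
        for (int k = 0; k < nnz; k++)       /* first-touch ordering */
            if (sigma[col[k]] < 0)
                sigma[col[k]] = next++;
        for (int j = 0; j < n; j++)         /* untouched entries go last */
            if (sigma[j] < 0)
                sigma[j] = next++;
        for (int j = 0; j < n; j++)
            xp[sigma[j]] = x[j];            /* reorder the data ...      */
        for (int k = 0; k < nnz; k++)
            colp[k] = sigma[col[k]];        /* ... and remap the indices */
        free(sigma);
    }

    /* Executor: the original computation over the reorganized data. */
    void executor(int nnz, const int *row, const int *colp,
                  const double *val, const double *xp, double *y)
    {
        for (int k = 0; k < nnz; k++)
            y[row[k]] += val[k] * xp[colp[k]];
    }

Remapping the index array once in the inspector keeps the executor's inner loop free of an extra level of indirection.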

Set and Relation Manipulation for the Sparse Polyhedral Framework

Lecture Notes in Computer Science, 2013

The Sparse Polyhedral Framework (SPF) extends the Polyhedral Model by using the uninterpreted function call abstraction for the compile-time specification of run-time reordering transformations, such as loop and data reordering and sparse tiling approaches that schedule irregular sets of iterations across loops. The Polyhedral Model represents sets of iteration points in imperfectly nested loops with unions of polyhedra and represents loop transformations with affine functions applied to those sets. Existing tools such as ISL, CLooG, and Omega manipulate polyhedral sets and affine functions; however, their ability to represent sets and functions whose constraints include uninterpreted function calls, such as those needed in the SPF, is nonexistent or severely restricted. This paper presents algorithms for manipulating sets and relations with uninterpreted function symbols to enable the Sparse Polyhedral Framework. The algorithms have been implemented in IEGenLib (the Inspector/Executor Generator Library), an open-source C++ library.
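
As a worked instance of the kind of symbolic manipulation these algorithms target (the notation follows IEGenLib's textual style; the names col and sigma are illustrative), consider composing a run-time data reordering with an indirect access function:

    A     = { [k] -> [j]  : j  = col(k)   }     (indirect read of x)
    R     = { [j] -> [j'] : j' = sigma(j) }     (run-time data reordering)
    R o A = { [k] -> [j'] : j' = sigma(col(k)) }

Because col and sigma are uninterpreted, the composition must be carried out symbolically, producing the nested call sigma(col(k)); the generated executor then reads x[sigma[col[k]]], or an index array remapped in advance by an inspector.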

Polyhedral Code Generation in the Real World

Lecture Notes in Computer Science, 2006

The polyhedral model is known to be a powerful framework for reasoning about high-level loop transformations. Recent developments in optimizing compilers broke some generally accepted ideas about the limitations of this model. First, thanks to advances in dependence analysis for irregular access patterns, its applicability, once supposed to be limited to very simple loop nests, has been extended to wide code regions. Then, new algorithms made it possible to compute the target code for hundreds of statements, while this code generation step had been expected not to scale. Such theoretical advances and new software tools have allowed actors from both academia and industry to study more complex and realistic cases. Unfortunately, despite the strong optimization potential of a given transformation for, e.g., parallelism or data locality, code generation may still be challenging or result in high control overhead. This paper presents scalable code generation methods that make possible the application of increasingly complex program transformations. By studying the transformations themselves, we show how to benefit from their properties to dramatically improve both code generation quality and space/time complexity, with respect to the best state-of-the-art code generation tool. In addition, we build on these improvements to present a new algorithm that improves generated-code performance for strided domains and reindexed schedules.
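
To illustrate the strided-domain problem (a textbook example, not one taken from the paper): the reindexing schedule theta(i) = 2i maps the domain {i : 0 <= i < N} onto the even points of the time axis. A naive scanner guards every value of t, whereas a stride-aware generator folds the stride into the loop step.

    void body(int i);              /* stands for the statement instance S(i) */

    void scan_naive(int N)
    {
        for (int t = 0; t <= 2 * (N - 1); t++)
            if (t % 2 == 0)        /* guard executed on every iteration */
                body(t / 2);
    }

    void scan_strided(int N)
    {
        for (int t = 0; t <= 2 * (N - 1); t += 2)
            body(t / 2);           /* stride folded into the loop step */
    }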

Compile-time composition of run-time data and iteration reorderings

Sigplan Notices, 2003

Many important applications, such as those using sparse data structures, have memory reference patterns that are unknown at compile time. Prior work has developed run-time reorderings of data and computation that enhance locality in such applications.

On automatic data structure selection and code generation for sparse computations

Lecture Notes in Computer Science, 1994

Traditionally, restructuring compilers were only able to apply program transformations in order to exploit certain characteristics of the target architecture. Adaptation of data structures was limited to, e.g., linearization or transposition of arrays. However, as more complex data structures are required to exploit characteristics of the data operated on, current compiler support appears to be inadequate. In this paper we present the implementation issues of a restructuring compiler that automatically converts programs operating on dense matrices into sparse code; i.e., after a suitable data structure has been selected for every dense matrix that is in fact sparse, the original code is adapted to operate on these data structures. This simplifies the task of the programmer and, in general, enables the compiler to apply more optimizations.
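
A schematic of the conversion, rendered here in C for illustration (the paper's compiler has its own input language and selection machinery): a dense source loop, and the code that results once compressed-row storage has been selected for the sparse matrix.

    /* Dense source: y += A*x, where A is declared dense but is in
     * fact sparse. */
    void matvec_dense(int n, double A[n][n], const double *x, double *y)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                y[i] += A[i][j] * x[j];
    }

    /* After data-structure selection: the same computation on a
     * compressed-row (CSR-like) representation, iterating only over
     * the stored nonzeros of each row. */
    void matvec_csr(int n, const int *rowptr, const int *colidx,
                    const double *val, const double *x, double *y)
    {
        for (int i = 0; i < n; i++)
            for (int p = rowptr[i]; p < rowptr[i + 1]; p++)
                y[i] += val[p] * x[colidx[p]];
    }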

A practical automatic polyhedral parallelizer and locality optimizer

2008

We present the design and implementation of an automatic polyhedral source-to-source transformation framework that can optimize regular programs (sequences of possibly imperfectly nested loops) for parallelism and locality simultaneously. Through this work, we show the practicality of analytical, model-driven automatic transformation in the polyhedral model. Unlike previous polyhedral frameworks, our approach is end-to-end and fully automatic, driven by an integer linear optimization framework that takes an explicit view of finding good ways of tiling for parallelism and locality using affine transformations. The framework has been implemented in a tool that automatically generates OpenMP parallel code from C program sections. Experimental results from the tool show very high performance for local and parallel execution on multicores, compared with state-of-the-art compiler frameworks from the research community as well as the best native production compilers. The system also enables the easy use of powerful empirical/iterative optimization for general, arbitrarily nested loop sequences.
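
The flavor of such output, hand-written here for illustration (this is not the tool's actual generated code): a loop nest tiled by an affine transformation and parallelized with OpenMP across tiles that carry no dependences.

    #include <omp.h>
    #define T 32  /* tile size, chosen here for illustration */

    /* C += A*B on n x n row-major matrices, tiled in all three loops. */
    void matmul_tiled(int n, const double *A, const double *B, double *C)
    {
        #pragma omp parallel for  /* distinct i-tiles write distinct C rows */
        for (int it = 0; it < n; it += T)
            for (int jt = 0; jt < n; jt += T)
                for (int kt = 0; kt < n; kt += T)
                    for (int i = it; i < it + T && i < n; i++)
                        for (int j = jt; j < jt + T && j < n; j++)
                            for (int k = kt; k < kt + T && k < n; k++)
                                C[i * n + j] += A[i * n + k] * B[k * n + j];
    }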

Dynamic and Speculative Polyhedral Parallelization of Loop Nests Using Binary Code Patterns

Procedia Computer Science, 2013

Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile time due to their use of dynamic data and control structures. A further motivation for being speculative is to adapt the code to the current execution context by selecting an efficient parallel schedule at run time. However, since this parallelization scheme requires on-the-fly semantics verification, it is in general difficult to perform advanced transformations for optimization and parallelism extraction. We propose a framework dedicated to the speculative parallelization of scientific nested loop kernels that can transform the code at run time, re-scheduling the iterations to expose parallelism and data locality. The run-time process includes a transformation selection guided by profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are interpolated to build a predictor of the forthcoming accesses. The collected addresses are also used to compute dependence distance vectors on the fly by tracking accesses to common addresses. The interpolating functions and distance vectors are then employed in dynamic dependence analysis and in selecting a parallelizing transformation that, if the prediction is correct, induces no rollback during execution. To keep the rollback overhead low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be either a parallelized version, the sequential original version, or an instrumented version. Moreover, such slicing of the execution provides the opportunity to transform the code differently in order to adapt to the observed execution phases. Parallel code generation is achieved at almost no cost by using binary code patterns that are generated at compile time and simply patched at run time to yield the transformed code.
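
A simplified sketch of the interpolation step (the structure and names are assumptions, not the framework's implementation): fit the addresses touched by one memory instruction to a linear function of the loop counter, then check the fit over the whole profiled slice; a perfect fit means the access behaves affinely and can feed the predictor and the dependence test.

    #include <stdint.h>

    /* Model addr(i) ~ a*i + b from a profiled sample of n addresses. */
    typedef struct { intptr_t a, b; int affine; } addr_model;

    addr_model interpolate(const intptr_t *addr, int n)
    {
        addr_model m = { 0, 0, 0 };
        if (n < 1)
            return m;                      /* no sample: no model */
        m.b = addr[0];
        m.affine = 1;
        if (n > 1)
            m.a = addr[1] - addr[0];       /* slope from the first two points */
        for (int i = 0; i < n; i++)        /* verify on the entire sample */
            if (addr[i] != m.a * (intptr_t)i + m.b) {
                m.affine = 0;              /* nonlinear: fall back to tracking */
                break;
            }
        return m;
    }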

A Polyhedral Approach to Ease the Composition of Program Transformations

Lecture Notes in Computer Science, 2004

We wish to extend the effectiveness of loop-restructuring compilers by improving the robustness of loop transformations and easing their composition into long sequences. We propose a formal and practical framework for program transformation. Our framework is well suited to iterative optimization techniques that search not only for the appropriate parameters of a given transformation but also for the program transformations themselves, and especially for compositions of program transformations. This framework is based on a unified polyhedral representation of loops and statements, enabling the application of generalized control and data transformations without reference to a syntactic program representation. The key to our framework is to clearly separate the impact of each program transformation on three independent components: the iteration domain, the iteration schedule, and the memory access functions. The composition of generalized transformations builds on normalization rules specific to each component of the representation. Our techniques have been implemented on top of Open64/ORC.
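
In standard polyhedral notation (a generic illustration, not an excerpt from the paper), a statement S is carried as the triple of those three components, and each transformation touches exactly one of them:

    D_S          = { [i,j] : 0 <= i < N and 0 <= j < M }    (iteration domain)
    theta_S(i,j) = (i,j)                                    (schedule)
    A_S(i,j)     = (i,j)  for a reference a[i][j]           (access function)

Loop interchange, for example, rewrites only the schedule, giving theta'_S(i,j) = (j,i); loop peeling restricts only the domain; and an array-layout change rewrites only the access function. Keeping the components independent is what lets long sequences of transformations be composed and then normalized component by component.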