accelerate: An embedded language for accelerated array processing
Data.Array.Accelerate defines an embedded array language for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be online-compiled and executed on a range of architectures.
A simple example
As a simple example, consider the computation of a dot product of two vectors of floating point numbers:
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)
Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX it may be off-loaded on-the-fly to the GPU.
See the Data.Array.Accelerate module for further information.
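For comparison, the list version that the paragraph above alludes to might look like the following. This is a minimal sketch in plain Haskell using only the Prelude; the name dotpList is illustrative and is not part of Accelerate.

```haskell
-- Dot product on ordinary Haskell lists, for comparison with the
-- Accelerate version. Only the types differ in any essential way:
-- [Float] instead of Acc (Vector Float), Float instead of Acc (Scalar Float).
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = foldr (+) 0 (zipWith (*) xs ys)

main :: IO ()
main = print (dotpList [1, 2, 3] [4, 5, 6])  -- prints 32.0
```

The list version is evaluated directly by GHC, whereas the Acc version builds a description of the computation that a backend compiles and runs.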
Additional components
The following supported add-ons are available as separate packages. Use them by adding them as dependencies to your project's cabal file.
- accelerate-llvm-native: Backend supporting parallel execution on multicore CPUs.
- accelerate-llvm-ptx: Backend supporting parallel execution on CUDA-capable NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater. See the following table for supported GPUs: http://en.wikipedia.org/wiki/CUDA#Supported_GPUs
- containers-accelerate: Container types for use with Accelerate.
- hashable-accelerate: Class for types which can be converted to a hash value.
- colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL).
- mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers.
Additional libraries that have worked in the past but are not included in the current release (they may be updated later; check to be sure):
- accelerate-examples: Computational kernels and applications demonstrating the use of Accelerate.
- accelerate-io*: Fast conversions between Accelerate arrays and other array and data formats.
- accelerate-fft: Discrete Fourier transforms, with FFI bindings to optimised implementations.
- accelerate-blas: Numeric linear algebra, with FFI bindings to optimised implementations.
- accelerate-bignum: Fixed-width large integer arithmetic.
- gloss-accelerate: Generate gloss pictures from Accelerate.
- gloss-raster-accelerate: Parallel rendering of raster images and animations.
- lens-accelerate: Lens operators for Accelerate types.
- linear-accelerate: Linear vector spaces in Accelerate.
Examples and documentation
Haddock documentation is included in the package.
The accelerate-examples package demonstrates a range of computational kernels and several complete applications, including:
- An implementation of the Canny edge detection algorithm
- Interactive Mandelbrot and Julia set generators
- A particle-based simulation of stable fluid flows
- An _n_-body simulation of gravitational attraction between solid particles
- An implementation of the PageRank algorithm
- A simple interactive ray tracer
- A cellular automata simulation
- A "password recovery" tool, for dictionary lookup of MD5 hashes
lulesh-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is highly simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.
Mailing list and contacts
- Gitter chat: https://gitter.im/AccelerateHS/Lobby
- Mailing list: accelerate-haskell@googlegroups.com (discussion of both use and development welcome).
- Sign up for the mailing list here: http://groups.google.com/group/accelerate-haskell
- Bug reports and issue tracking: https://github.com/AccelerateHS/accelerate/issues
Modules
- Crypto
- Hash
* Crypto.Hash.XKCP
- Hash
- Data
- Array
* Data.Array.Accelerate
* Data.Array.Accelerate.AST
* Data.Array.Accelerate.AST.Environment
* Data.Array.Accelerate.AST.Idx
* Data.Array.Accelerate.AST.LeftHandSide
* Data.Array.Accelerate.AST.Var
* Analysis
* Data.Array.Accelerate.Analysis.Hash
* Data.Array.Accelerate.Analysis.Match
* Array
* Data.Array.Accelerate.Array.Data
* Data.Array.Accelerate.Array.Remote
* Data.Array.Accelerate.Array.Remote.Class
* Data.Array.Accelerate.Array.Remote.LRU
* Data.Array.Accelerate.Array.Remote.Table
* Data.Array.Accelerate.Array.Unique
* Data.Array.Accelerate.Async
* Control
* Data.Array.Accelerate.Control.Monad
* Data
* Data.Array.Accelerate.Data.Bits
* Data.Array.Accelerate.Data.Complex
* Data.Array.Accelerate.Data.Either
* Data.Array.Accelerate.Data.Fold
* Data.Array.Accelerate.Data.Functor
* Data.Array.Accelerate.Data.Maybe
* Data.Array.Accelerate.Data.Monoid
* Data.Array.Accelerate.Data.Ratio
* Data.Array.Accelerate.Data.Semigroup
* Debug
* Data.Array.Accelerate.Debug.Internal
* Data.Array.Accelerate.Debug.Trace
* Data.Array.Accelerate.Error
* Data.Array.Accelerate.Interpreter
* Data.Array.Accelerate.Lifetime
* Data.Array.Accelerate.Pretty
* Representation
* Data.Array.Accelerate.Representation.Array
* Data.Array.Accelerate.Representation.Elt
* Data.Array.Accelerate.Representation.Shape
* Data.Array.Accelerate.Representation.Slice
* Data.Array.Accelerate.Representation.Stencil
* Data.Array.Accelerate.Representation.Tag
* Data.Array.Accelerate.Representation.Type
* Data.Array.Accelerate.Representation.Vec
* Data.Array.Accelerate.Smart
* Sugar
* Data.Array.Accelerate.Sugar.Array
* Data.Array.Accelerate.Sugar.Elt
* Data.Array.Accelerate.Sugar.Foreign
* Data.Array.Accelerate.Sugar.Shape
* Data.Array.Accelerate.Sugar.Vec
* Test
* Data.Array.Accelerate.Test.NoFib
* Data.Array.Accelerate.Test.Similar
* Data.Array.Accelerate.Trafo
* Data.Array.Accelerate.Trafo.Config
* Data.Array.Accelerate.Trafo.Delayed
* Data.Array.Accelerate.Trafo.Fusion
* Data.Array.Accelerate.Trafo.LetSplit
* Data.Array.Accelerate.Trafo.Sharing
* Data.Array.Accelerate.Trafo.Simplify
* Data.Array.Accelerate.Trafo.Substitution
* Data.Array.Accelerate.Trafo.Var
* Data.Array.Accelerate.Type
* Data.Array.Accelerate.Unsafe
- Data.BitSet
- Primitive
* Data.Primitive.Vec
- Array
Flags
Manual Flags
| Name | Description | Default |
|---|---|---|
| debug | Enable debug tracing messages. With debugging enabled, applications will read the following options from the environment variable ACCELERATE_FLAGS, and via the command line as: `./program +ACC ... -ACC`. Note that a backend may not implement (or be applicable to) all options. The following flags control phases of the compiler. They are enabled with -f and can be reversed with -fno-. acc-sharing: Enable sharing recovery of array expressions (True); exp-sharing: Enable sharing recovery of scalar expressions (True); fusion: Enable array fusion (True); inplace: Enable in-place array updates (True); force-recomp: Force recompilation of array programs (False); fast-math: Allow algebraically equivalent transformations which may change floating point results (e.g., reassociate) (True); fast-permute-const: Allow non-atomic `permute const` for product types (True). The following options control debug message output, and are enabled with -d. debug: Include debug symbols in the generated and compiled kernels; verbose: Be extra chatty; dump-phases: Print timing information about each phase of the compiler (enable GC stats, +RTS -t or otherwise, for memory usage information); dump-sharing: Print information related to sharing recovery; dump-simpl-stats: Print statistics related to fusion & simplification; dump-simpl-iterations: Print a summary after each simplifier iteration; dump-vectorisation: Print information related to the vectoriser; dump-dot: Generate a representation of the program graph in Graphviz DOT format; dump-simpl-dot: Generate a more compact representation of the program graph in Graphviz DOT format (in particular, scalar expressions are elided); dump-gc: Print information related to the Accelerate garbage collector; dump-gc-stats: Print aggregate garbage collection information at the end of program execution; dump-cc: Print information related to kernel code generation/compilation (print the generated code if verbose); dump-ld: Print information related to runtime linking; dump-asm: Print information related to kernel assembly (print the assembled code if verbose); dump-exec: Print information related to program execution; dump-sched: Print information related to execution scheduling. | Disabled |
| tracy | Enable kernel profiling using Tracy. This flag requires +debug to also be set. Note: currently only works with accelerate-llvm-native; for PTX profiling, use Nvidia Nsight directly instead. The executables tracy (GUI) and tracy-capture (command line) will be built to collect and view profiling data from supported backends. This requires several external dependencies: cmake, pkg-config, freetype2, glfw3, gtk3 (Linux only), and TBB (should be part of your compiler toolchain). For example, on Debian/Ubuntu you can install all of these via `sudo apt install cmake pkg-config libfreetype-dev libglfw3-dev libgtk-3-dev libtbb-dev`, or on macOS via `brew install cmake pkg-config freetype glfw`. | Disabled |
| bounds-checks | Enable bounds checking in the interpreter | Enabled |
| internal-checks | Enable some internal consistency checks | Disabled |
| nofib | Build the nofib test suite (required for backend testing) | Disabled |
Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.
Downloads
- accelerate-1.4.0.0.tar.gz [browse] (Cabal source package)
- Package description (as included in the package)
| Versions [RSS] | 0.4.0, 0.5.0.0, 0.6.0.0, 0.7.1.0, 0.8.0.0, 0.8.1.0, 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.12.0.0, 0.12.1.0, 0.12.2.0, 0.13.0.0, 0.13.0.1, 0.13.0.2, 0.13.0.3, 0.13.0.4, 0.13.0.5, 0.14.0.0, 0.15.0.0, 0.15.1.0, 1.0.0.0, 1.1.0.0, 1.1.1.0, 1.2.0.0, 1.2.0.1, 1.3.0.0, 1.4.0.0 |
|---|---|
| Change log | CHANGELOG.md |
| Dependencies | accelerate, ansi-terminal (>=0.6.2), base (>=4.12 && <4.23), base-orphans (>=0.3), bytestring (>=0.10.2), containers (>=0.3), deepseq (>=1.3), directory (>=1.0), double-conversion (>=2.0), exceptions (>=0.6), filepath (>=1.0), formatting (>=7.0), ghc-prim, half (>=0.3), hashable (>=1.1), hashtables (>=1.2.3), hedgehog (>=0.5), microlens (>=0.4), mtl (>=2.0), prettyprinter (>=1.7), prettyprinter-ansi-terminal (>=1.1.2), primitive (>=0.6.4), tasty (>=0.11), template-haskell (<2.25), terminal-size (>=0.3), text (>=1.2.4), transformers (>=0.3), unique, unix, unordered-containers (>=0.2), vector (>=0.10), Win32 [details] |
| Tested with | ghc >=8.6 |
| License | BSD-3-Clause |
| Author | The Accelerate Team |
| Maintainer | Trevor L. McDonell trevor.mcdonell@gmail.com |
| Uploaded | by tomsmeding at 2026-04-02T16:43:06Z |
| Category | Accelerate, Compilers/Interpreters, Concurrency, Data, Parallelism |
| Home page | https://github.com/AccelerateHS/accelerate/ |
| Bug tracker | https://github.com/AccelerateHS/accelerate/issues |
| Source repo | head: git clone https://github.com/AccelerateHS/accelerate.git; this: git clone https://github.com/AccelerateHS/accelerate.git (tag v1.4.0.0) |
| Distributions | |
| Reverse Dependencies | 44 direct, 10 indirect [details] |
| Executables | tracy-capture, tracy |
| Downloads | 33922 total (41 in the last 30 days) |
| Rating | 2.5 (votes: 6) [estimated by Bayesian average] |
| Status | Docs available [build log]. Last success reported on 2026-04-02 [all 1 reports] |
Readme for accelerate-1.4.0.0
Data.Array.Accelerate defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations (such as maps, reductions, and permutations). These computations are online-compiled and executed on a range of architectures.
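As a rough illustration of the collective operations named above, the sketch below shows one map, one reduction, and one backward permutation. It assumes the names A.map, A.fold, A.backpermute, A.index1, A.unindex1, A.length and A.shape as found in the 1.3 series are unchanged in this release, and it evaluates on the reference interpreter in Data.Array.Accelerate.Interpreter. The function names doubled, total and reversed are illustrative only.

```haskell
{-# LANGUAGE TypeOperators #-}
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as Interp
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))

-- Map: apply a scalar function to every element.
doubled :: Acc (Vector Float) -> Acc (Vector Float)
doubled = A.map (* 2)

-- Reduction: combine all elements with an associative operator.
total :: Acc (Vector Float) -> Acc (Scalar Float)
total = A.fold (+) 0

-- Backward permutation: output element i reads input element (n - 1 - i),
-- which reverses the vector.
reversed :: Acc (Vector Float) -> Acc (Vector Float)
reversed xs =
  let n = A.length xs
  in  A.backpermute (A.shape xs) (\ix -> A.index1 (n - 1 - A.unindex1 ix)) xs

main :: IO ()
main = do
  -- `use` embeds a host array into the Acc language.
  let xs = A.use (A.fromList (Z :. 4) [1, 2, 3, 4] :: Vector Float)
  print (A.toList (Interp.run (doubled xs)))
  print (A.toList (Interp.run (total xs)))
  print (A.toList (Interp.run (reversed xs)))
```

Each function only describes a computation; nothing is evaluated until a backend's run is applied to it.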
For more details, see our papers:
- Accelerating Haskell Array Codes with Multicore GPUs
- Optimising Purely Functional GPU Programs (slides)
- Embedding Foreign Code
- Type-safe Runtime Code Generation: Accelerate to LLVM (slides) (video)
- Streaming Irregular Arrays (video)
There are also slides from some presentations on Accelerate:
- Embedded Languages for High-Performance Computing in Haskell
- GPGPU Programming in Haskell with Accelerate (video) (workshop)
Chapter 6 of Simon Marlow's book Parallel and Concurrent Programming in Haskell contains a tutorial introduction to Accelerate.
Trevor's PhD thesis details the design and implementation of the frontend optimisations and the CUDA backend.
A simple example
As a simple example, consider the computation of a dot product of two vectors of single-precision floating-point numbers:
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)
Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX.run it may be on-the-fly off-loaded to a GPU.
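To make the example above concrete, here is a minimal sketch of a complete program that evaluates dotp. It assumes the reference interpreter's run from Data.Array.Accelerate.Interpreter, so it needs no GPU or LLVM installation; the input values are made up for illustration.

```haskell
{-# LANGUAGE TypeOperators #-}
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as Interp
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))

-- The dot product from the text, with Accelerate's operations qualified
-- to avoid clashing with the Prelude's zipWith.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Build two host-side vectors of length 3.
  let xs = A.fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = A.fromList (Z :. 3) [4, 5, 6] :: Vector Float
  -- `use` embeds the host arrays; `run` evaluates the Acc program and
  -- returns an ordinary array, here a Scalar holding one element.
  print (A.toList (Interp.run (dotp (A.use xs) (A.use ys))))
```

Switching backends should only require importing run from a different module, for example Data.Array.Accelerate.LLVM.PTX from the accelerate-llvm-ptx package, while dotp itself stays unchanged.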
Availability
Package Accelerate is available from:
- Hackage: accelerate - just add it to your cabal file
- GitHub: AccelerateHS/accelerate - get the source with
git clone https://github.com/AccelerateHS/accelerate.git
To install the Haskell toolchain, try GHCup.
Additional components
The following supported add-ons are available as separate packages:
- accelerate-llvm-native: Backend targeting multicore CPUs
- accelerate-llvm-ptx: Backend targeting CUDA-enabled NVIDIA GPUs. Requires a GPU with compute capability 3.0 or greater (see the table on Wikipedia)
- accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite (supporting function and performance testing)
- Conversion between various formats:
- accelerate-io: For copying data directly between raw pointers
- accelerate-io-array: Immutable arrays
- accelerate-io-bmp: Uncompressed BMP image files
- accelerate-io-bytestring: Compact, immutable binary data
- accelerate-io-cereal: Binary serialisation of arrays using cereal
- accelerate-io-JuicyPixels: Images in various pixel formats
- accelerate-io-repa: Another Haskell library for high-performance parallel arrays
- accelerate-io-serialise: Binary serialisation of arrays using serialise
- accelerate-io-vector: Efficient boxed and unboxed one-dimensional arrays
- accelerate-fft: Fast Fourier transform implementation, with FFI bindings to optimised implementations
- accelerate-blas: BLAS and LAPACK operations, with FFI bindings to optimised implementations
- accelerate-bignum: Fixed-width large integer arithmetic
- colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL)
- containers-accelerate: Hashing-based container types
- gloss-accelerate: Generate gloss pictures from Accelerate
- gloss-raster-accelerate: Parallel rendering of raster images and animations
- hashable-accelerate: A class for types which can be converted into a hash value
- lens-accelerate: Lens operators for Accelerate types
- linear-accelerate: Linear vector spaces in Accelerate
- mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers
- numeric-prelude-accelerate: Lifting the numeric-prelude to Accelerate
- wigner-ville-accelerate: Wigner-Ville time-frequency distribution
These are all available on Hackage.
Documentation
- Haddock documentation is included and linked with the individual package releases on Hackage.
- The idea behind the HOAS (higher-order abstract syntax) to de-Bruijn conversion used in the library is described separately.
Examples
accelerate-examples
The accelerate-examples package provides a range of computational kernels and a few complete applications. The examples include:
- An implementation of Canny edge detection
- An interactive Mandelbrot set generator
- An N-body simulation of gravitational attraction between solid particles
- An implementation of the PageRank algorithm
- A simple ray-tracer
- A particle-based simulation of stable fluid flows
- A cellular automata simulation
- A "password recovery" tool, for dictionary lookup of MD5 hashes
To run these, either get the source from Hackage using cabal get accelerate-examples or clone the git repository, then use cabal run on the individual executables.
LULESH
LULESH-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is a highly simplified application, hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Additional examples
Accelerate users have also built some substantial applications of their own. Please feel free to add your own examples!
- Jonathan Fraser, GPUVAC: An explicit advection magnetohydrodynamics simulation
- David van Balen, Sudokus: A sudoku solver
- Trevor L. McDonell, lol-accelerate: A backend to the Λ ○ λ (Lol) library for ring-based lattice cryptography
- Henning Thielemann, patch-image: Combine a collage of overlapping images
- apunktbau, bildpunkt: A ray-marching distance field renderer
- klarh, hasdy: Molecular dynamics in Haskell using Accelerate
- Alexandros Gremm used Accelerate as part of the 2014 CSCS summer school (code)
Who are we?
The Accelerate team (past and present) consists of:
- Manuel M T Chakravarty (@mchakravarty)
- Gabriele Keller (@gckeller)
- Trevor L. McDonell (@tmcdonell)
- Robert Clifton-Everest (@robeverest)
- Frederik M. Madsen (@fmma)
- Ryan R. Newton (@rrnewton)
- Joshua Meredith (@JoshMeredith)
- Ben Lever (@blever)
- Sean Seefried (@sseefried)
- Ivo Gabe de Wolff (@ivogabe)
- Tom Smeding (@tomsmeding)
The maintainer and principal developer of Accelerate is Trevor L. McDonell <trevor.mcdonell@gmail.com>.
- Mailing list: accelerate-haskell@googlegroups.com (discussions on both use and development are welcome)
- Sign up for the mailing list at the Accelerate Google Groups page
- Bug reports and issue tracking: GitHub project page
- Chat with us on gitter
Citing Accelerate
If you use Accelerate for academic research, you are encouraged (though not required) to cite the following papers:
- Manuel M. T. Chakravarty, Gabriele Keller, Sean Lee, Trevor L. McDonell, and Vinod Grover. Accelerating Haskell Array Codes with Multicore GPUs. In DAMP '11: Declarative Aspects of Multicore Programming, ACM, 2011.
- Trevor L. McDonell, Manuel M. T. Chakravarty, Gabriele Keller, and Ben Lippmeier. Optimising Purely Functional GPU Programs. In ICFP '13: The 18th ACM SIGPLAN International Conference on Functional Programming, ACM, 2013.
- Robert Clifton-Everest, Trevor L. McDonell, Manuel M. T. Chakravarty, and Gabriele Keller. Embedding Foreign Code. In PADL '14: The 16th International Symposium on Practical Aspects of Declarative Languages, Springer-Verlag, LNCS, 2014.
- Trevor L. McDonell, Manuel M. T. Chakravarty, Vinod Grover, and Ryan R. Newton. Type-safe Runtime Code Generation: Accelerate to LLVM. In Haskell '15: The 8th ACM SIGPLAN Symposium on Haskell, ACM, 2015.
- Robert Clifton-Everest, Trevor L. McDonell, Manuel M. T. Chakravarty, and Gabriele Keller. Streaming Irregular Arrays. In Haskell '17: The 10th ACM SIGPLAN Symposium on Haskell, ACM, 2017.
Accelerate is primarily developed by academics, so citations matter a lot to us. As an added benefit, you increase Accelerate's exposure and potential user (and developer!) base, which is a benefit to all users of Accelerate. Thanks in advance!
What's missing?
Here is a list of features that are currently missing:
- Preliminary API (parts of the API may still change in subsequent releases)
- Many more features... contact us!

