Apache Arrow v20.0.0

Apache Arrow is a universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.

The project specifies a language-independent, column-oriented memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also houses an actively developed collection of libraries in many languages for solving problems related to data transfer and in-memory analytical processing.
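To make "column-oriented" concrete, the following is a minimal sketch in plain Python that uses only the standard library and none of the Arrow libraries. It stores each field of a small record set in its own contiguous buffer, with a separate validity bitmap marking missing values (bit set means the slot is valid, least-significant bit first). The names `rows`, `to_column`, and `column_sum` are hypothetical and exist only for illustration; the real format defines considerably more (alignment, padding, nested types, and so on).

```python
from array import array

# Row-oriented input: one dict per record (illustrative data only).
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": None},  # missing value
    {"id": 3, "price": 4.0},
]


def to_column(rows, name, typecode):
    """Build a contiguous value buffer and a validity bitmap for one field."""
    values = array(typecode)
    validity = bytearray((len(rows) + 7) // 8)  # one bit per row
    for i, row in enumerate(rows):
        value = row[name]
        if value is None:
            values.append(0)  # placeholder slot; the bitmap marks it as null
        else:
            values.append(value)
            validity[i // 8] |= 1 << (i % 8)  # least-significant bit first
    return values, validity


def column_sum(values, validity):
    """Sum one column, skipping slots whose validity bit is unset."""
    total = 0
    for i, value in enumerate(values):
        if (validity[i // 8] >> (i % 8)) & 1:
            total += value
    return total


ids, ids_valid = to_column(rows, "id", "q")           # 64-bit signed integers
prices, prices_valid = to_column(rows, "price", "d")  # 64-bit floats

print(column_sum(prices, prices_valid))
```

An aggregate such as `column_sum` touches only the buffer for the one field it needs, which is the access pattern a columnar layout is meant to favor for analytic work.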

To learn how to use Arrow, refer to the documentation specific to your target environment.

Specifications

Read about the Apache Arrow format and its related specifications and protocols.

Development

Find documentation on building the libraries from source, building the documentation, contributing and code reviews, continuous integration, benchmarking, and the release process.

Implementations

Examples