How to use Awkward Arrays in C++ with cppyy — Awkward Array 2.8.2 documentation

Warning

Awkward Array can only work with cppyy 3.1 or later.

Warning

cppyy must be in a different venv or conda environment from ROOT, if you have installed ROOT, because the two packages define modules with conflicting names.
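Both warnings can be checked up front. The following is a minimal sketch, assuming the package is installed under the distribution name `cppyy` and that ROOT, if present, is importable as a top-level module named `ROOT`. The helper names (`parse_major_minor`, `cppyy_is_recent_enough`, `check_environment`) are illustrative and are not part of Awkward Array or cppyy.

```python
import re
from importlib import metadata, util


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '3.1.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def cppyy_is_recent_enough(version, minimum=(3, 1)):
    """True if the version meets the minimum stated in the warning."""
    return parse_major_minor(version) >= minimum


def check_environment():
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    try:
        installed = metadata.version("cppyy")  # assumed distribution name
    except metadata.PackageNotFoundError:
        problems.append("cppyy is not installed in this environment")
    else:
        if not cppyy_is_recent_enough(installed):
            problems.append(f"cppyy {installed} is older than 3.1")
    # Assumption: ROOT exposes a top-level module named 'ROOT'.
    if util.find_spec("ROOT") is not None:
        problems.append("ROOT is importable here; use a separate environment")
    return problems


for problem in check_environment():
    print(problem)
```

The check only looks for an importable `ROOT` module and does not import it, so it stays cheap even when ROOT is installed.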

cppyy is an automatic, run-time Python-C++ bindings generator for calling C++ from Python and Python from C++. It is based on the C++ interpreter Cling.

cppyy can understand Awkward Arrays. When an ak.Array is passed to a C++ function defined in cppyy, the array's __cast_cpp__ magic method is invoked. This method dynamically generates a C++ type and a view of the array, if they have not been generated yet.

The view is a lightweight 40-byte C++ object allocated on the stack. It is generated on demand, only once per Awkward Array, and the data are not copied.
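The "generated once, no copy" behaviour can be pictured with a toy model in pure Python. This is only a sketch of the caching idea, not Awkward Array's implementation: the class `ToyArray` and its `views_built` counter are hypothetical, and a `memoryview` stands in for the C++ view.

```python
class ToyArray:
    """Hypothetical stand-in for an array; not Awkward Array's implementation."""

    def __init__(self, data):
        self.data = data        # the buffer stays owned by this object
        self._view = None       # cached view, created lazily
        self.views_built = 0    # counts how often a view was generated

    def __cast_cpp__(self):
        # Build the view on first use only; later calls reuse the cached one.
        if self._view is None:
            self.views_built += 1
            # memoryview references the existing buffer instead of copying it.
            self._view = memoryview(self.data)
        return self._view


toy = ToyArray(bytearray(b"\x01\x02\x03"))
first = toy.__cast_cpp__()
second = toy.__cast_cpp__()
print(first is second, toy.views_built)
```

Because the view only references the buffer, a change made through `toy.data` is visible through the view as well.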

import awkward as ak
ak.__version__

import awkward._connect.cling

import cppyy
cppyy.__version__


Let’s define an Awkward Array as a list of records:

array = ak.Array(
    [
        [{"x": 1, "y": [1.1]}, {"x": 2, "y": [2.2, 0.2]}],
        [],
        [{"x": 3, "y": [3.0, 0.3, 3.3]}],
    ]
)
array

[[{x: 1, y: [1.1]}, {x: 2, y: [2.2, 0.2]}], [], [{x: 3, y: [3, 0.3, 3.3]}]]

backend: cpu
nbytes: 136 B
type: 3 * var * {x: int64, y: var * float64}

This example shows a templated C++ function that takes an Awkward Array and iterates over the list of records:

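source_code = """
// T is the generated Awkward Array view type, deduced at the call site.
template<typename T>
double go_fast_cpp(T& awkward_array) {
    double out = 0.0;

    // Outer lists -> records in each list -> items in each record's "y" field.
    for (auto list : awkward_array) {
        for (auto record : list) {
            for (auto item : record.y()) {
                out += item;
            }
        }
    }

    return out;
}
"""

cppyy.cppdef(source_code)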
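For cross-checking, the same triple loop can be written in plain Python over ordinary lists and dicts. This is a reference sketch only: `go_slow_python` is a hypothetical name, and it works on the Python data that the array was built from, not on the Awkward Array itself.

```python
import math


def go_slow_python(nested):
    """Sum every item of every record's "y" list, mirroring the C++ loops."""
    out = 0.0
    for list_ in nested:
        for record in list_:
            for item in record["y"]:
                out += item
    return out


# The same data that was passed to ak.Array above.
data = [
    [{"x": 1, "y": [1.1]}, {"x": 2, "y": [2.2, 0.2]}],
    [],
    [{"x": 3, "y": [3.0, 0.3, 3.3]}],
]

expected = go_slow_python(data)
print(expected)
```

The empty inner list contributes nothing, which is the same behaviour the range-based `for` loops have in C++.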

The C++ type of an Awkward Array is a made-up type, for example awkward::ListArray_hyKwTH3lk1A. The actual name is available from the cpp_type property:

array.cpp_type

'awkward::ListArray_h2HyX4Ok4'

Awkward Arrays are dynamically typed, so in a C++ context the type name is hashed. In practice, there is no need to know the type: the C++ code should use the placeholder type specifier auto, and the type of the declared variable is then deduced automatically from its initializer.
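Why a hashed name is convenient can be shown with a toy: a name derived from a description of the data layout is stable for identical layouts and distinct for different ones. This is not how Awkward Array computes its names. The `toy::` prefix, the use of SHA-256, and the function `toy_cpp_type_name` are assumptions made purely for illustration.

```python
import hashlib


def toy_cpp_type_name(layout_description):
    """Derive a short, deterministic type name from a layout description."""
    digest = hashlib.sha256(layout_description.encode("utf-8")).hexdigest()
    # Twelve hex characters are plenty for a readable illustrative suffix.
    return f"toy::ListArray_{digest[:12]}"


records = "var * {x: int64, y: var * float64}"
numbers = "var * float64"

print(toy_cpp_type_name(records))
print(toy_cpp_type_name(numbers))
```

Code that relies on `auto` never has to spell such a name out, which is why the exact suffix does not matter in practice.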

In a Python context, when a templated function requires a C++ type as a Python string, it can use the ak.Array.cpp_type property:

out = cppyy.gbl.go_fast_cpp[array.cpp_type](array)

%%timeit

out = cppyy.gbl.go_fast_cpp[array.cpp_type](array)

5.45 μs ± 24.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%%timeit

ak.sum(array["y"])

255 μs ± 3.82 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

But the result is the same.

assert out == ak.sum(array["y"])
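Exact equality holds for this small example, but a hand-written loop and a library reduction need not add the numbers in the same order, and floating-point sums can differ in their last bits when the order changes. A tolerance-based comparison is a more robust sketch for larger data. The helper `results_agree` and its default tolerance are assumptions, not part of Awkward Array.

```python
import math


def results_agree(a, b, rel_tol=1e-12):
    """Compare two floating-point sums up to a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


# The "y" values from the example array, summed in two different orders.
values = [1.1, 2.2, 0.2, 3.0, 0.3, 3.3]
in_order = sum(values)
reordered = sum(sorted(values))

print(results_agree(in_order, reordered))
```

With the names used in this page, the final check could then be written as `assert results_agree(out, ak.sum(array["y"]))`.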