ak.to_arrow — Awkward Array 2.8.2 documentation
Defined in awkward.operations.ak_to_arrow on line 15.
ak.to_arrow(array, *, list_to32=False, string_to32=False, bytestring_to32=False, emptyarray_to=None, categorical_as_dictionary=False, extensionarray=True, count_nulls=True)
Parameters:
- array – Array-like data (anything ak.to_layout recognizes).
- list_to32 (bool) – If True, convert Awkward lists into 32-bit Arrow lists if they’re small enough, even if it means an extra conversion. Otherwise, signed 32-bit ak.types.ListType maps to Arrow ListType, signed 64-bit ak.types.ListType maps to Arrow LargeListType, and unsigned 32-bit ak.types.ListType picks whichever Arrow type its values fit into.
- string_to32 (bool) – Same as the above for Arrow string and large_string.
- bytestring_to32 (bool) – Same as the above for Arrow binary and large_binary.
- emptyarray_to (None or dtype) – If None, ak.types.UnknownType maps to Arrow’s null type; otherwise, it is converted to the given numeric dtype.
- categorical_as_dictionary (bool) – If True, ak.contents.IndexedArray and ak.contents.IndexedOptionArray labeled with __array__ = "categorical" are mapped to Arrow DictionaryArray; otherwise, the projection is evaluated before conversion (always the case without __array__ = "categorical").
- extensionarray (bool) – If True, this function returns extended Arrow arrays (at all levels of nesting), which preserve metadata so that Awkward → Arrow → Awkward preserves the array’s ak.types.Type (though not the ak.forms.Form). If False, this function returns generic Arrow arrays that might be needed for third-party tools that don’t recognize Arrow’s extensions. Even with extensionarray=False, the values produced by Arrow’s to_pylist method are the same as the values produced by Awkward’s ak.to_list.
- count_nulls (bool) – If True, count the number of missing values at each level and include these in the resulting Arrow array, which makes some downstream applications faster. If False, skip the up-front cost of counting them.
Converts an Awkward Array into an Apache Arrow array.
This produces arrays of type pyarrow.Array. You might need further manipulations (using the pyarrow library) to build a pyarrow.ChunkedArray, a pyarrow.RecordBatch, or a pyarrow.Table. For the latter, see ak.to_arrow_table.
This function always preserves the values of a dataset; i.e. the Python objects returned by ak.to_list are identical to the Python objects returned by Arrow’s to_pylist method. With extensionarray=True, this function also preserves the data type (high-level ak.types.Type, though not the low-level ak.forms.Form), even through Parquet, making Parquet a good way to save Awkward Arrays for later use. If any third-party tools don’t recognize Arrow’s extension arrays, set this option to False for plain Arrow arrays.
See also ak.from_arrow, ak.to_arrow_table, ak.to_parquet, ak.from_arrow_schema.