ak.to_parquet — Awkward Array 2.8.2 documentation

Defined in awkward.operations.ak_to_parquet on line 22.

ak.to_parquet(array, destination, *, list_to32=False, string_to32=True, bytestring_to32=True, emptyarray_to=None, categorical_as_dictionary=False, extensionarray=True, count_nulls=True, compression='zstd', compression_level=None, row_group_size=64 * 1024 * 1024, data_page_size=None, parquet_flavor=None, parquet_version='2.4', parquet_page_version='1.0', parquet_metadata_statistics=True, parquet_dictionary_encoding=False, parquet_byte_stream_split=False, parquet_coerce_timestamps=None, parquet_old_int96_timestamps=None, parquet_compliant_nested=False, parquet_extra_options=None, storage_options=None)

Parameters:

Returns: pyarrow._parquet.FileMetaData instance

Writes an Awkward Array to a Parquet file (through pyarrow).

    >>> array1 = ak.Array([[1, 2, 3], [], [4, 5], [], [], [6, 7, 8, 9]])
    >>> ak.to_parquet(array1, "array1.parquet")
    <pyarrow._parquet.FileMetaData object at 0x7f646c38ff40>
      created_by: parquet-cpp-arrow version 9.0.0
      num_columns: 1
      num_rows: 6
      num_row_groups: 1
      format_version: 2.6
      serialized_size: 0

If the array does not contain records at top-level, the Arrow table will consist of one field whose name is "" if and only if extensionarray is False.

If extensionarray is True, use a custom Arrow extension to store this array. Otherwise, generic Arrow arrays are used, and if the array does not contain records at top-level, the Arrow table will consist of one field whose name is "". See ak.to_arrow_table for more details.

Parquet files can maintain the distinction between “option-type but no elements are missing” and “not option-type” at all levels, including the top level. However, there is no distinction between the ?union[X, Y, Z] type and the union[?X, ?Y, ?Z] type. Be aware of these type distinctions when passing data through Arrow or Parquet.

See also ak.to_arrow, which is used as an intermediate step.