pyarrow.ipc.RecordBatchFileWriter — Apache Arrow v20.0.0

class pyarrow.ipc.RecordBatchFileWriter(sink, schema, *, use_legacy_format=None, options=None)

Bases: _RecordBatchFileWriter

Writer to create the Arrow binary file format

Parameters:

sink : str, pyarrow.NativeFile, or file-like Python object

Either a file path, or a writable file object.

schema : pyarrow.Schema

The Arrow schema for data to be written to the file.

use_legacy_format : bool, default None

Deprecated in favor of setting options. Cannot be provided with options.

If None, False will be used unless this default is overridden by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1

options : pyarrow.ipc.IpcWriteOptions

Options for IPC serialization.

If None, default values will be used: the legacy format will not be used unless overridden by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1, and the V5 metadata version will be used unless overridden by setting the environment variable ARROW_PRE_1_0_METADATA_VERSION=1.

__init__(sink, schema, *, use_legacy_format=None, options=None)
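As an illustrative sketch (the file name and data are assumptions, not part of the API), a writer can be constructed directly from a path and a schema, written to, and closed:

```python
import pyarrow as pa
import pyarrow.ipc as ipc

# Illustrative data; the schema is inferred from the arrays.
table = pa.table({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# Open a writer for a file path and the schema of the data to be written.
writer = ipc.RecordBatchFileWriter("example.arrow", table.schema)
writer.write_table(table)
writer.close()  # finalizes the file; see close() below

# Round-trip check with the corresponding reader.
with ipc.open_file("example.arrow") as reader:
    assert reader.read_all().equals(table)
```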


close(self)

Close stream and write end-of-stream 0 marker.
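The writer can also be used as a context manager, which calls close() on exit; a minimal sketch (file name is illustrative):

```python
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"x": [1.0, 2.5, 3.0]})

# Exiting the with-block calls close() automatically, so the file is
# finalized even if an exception is raised while writing.
with ipc.RecordBatchFileWriter("data.arrow", table.schema) as writer:
    writer.write_table(table)
```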

stats

Current IPC write statistics.
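As a sketch, stats can be inspected to see how much has been serialized so far; the field names shown are assumed from pyarrow.ipc.WriteStats, and the file name and data are illustrative:

```python
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"x": list(range(10))})

with ipc.RecordBatchFileWriter("stats_example.arrow", table.schema) as writer:
    writer.write_table(table, max_chunksize=4)  # written as 3 record batches
    stats = writer.stats

# WriteStats is a snapshot of what has been written so far.
print(stats.num_record_batches)  # 3
print(stats.num_messages)        # schema message plus the batch messages
```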

write(self, table_or_batch)

Write RecordBatch or Table to stream.

Parameters:

table_or_batch : {RecordBatch, Table}
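write() accepts either kind of object; a small sketch (file name and data are illustrative):

```python
import pyarrow as pa
import pyarrow.ipc as ipc

schema = pa.schema([("x", pa.int64())])
batch = pa.record_batch([pa.array([1, 2, 3], type=pa.int64())], schema=schema)
table = pa.table({"x": [4, 5, 6]}, schema=schema)

with ipc.RecordBatchFileWriter("mixed.arrow", schema) as writer:
    writer.write(batch)  # a RecordBatch is written as-is
    writer.write(table)  # a Table is written as its record batches
```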

write_batch(self, RecordBatch batch, custom_metadata=None)

Write RecordBatch to stream.

Parameters:

batch : RecordBatch

custom_metadata : mapping or KeyValueMetadata

Keys and values must be string-like / coercible to bytes
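A sketch of attaching custom metadata to a single batch's IPC message; the file name, key, and value are illustrative:

```python
import pyarrow as pa
import pyarrow.ipc as ipc

batch = pa.record_batch([pa.array([1, 2, 3])], names=["x"])

with ipc.RecordBatchFileWriter("with_meta.arrow", batch.schema) as writer:
    # Keys and values must be string-like / coercible to bytes.
    writer.write_batch(batch, custom_metadata={"source": "sensor-42"})
```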

write_table(self, Table table, max_chunksize=None)

Write Table to stream in (contiguous) RecordBatch objects.

Parameters:

table : Table

max_chunksize : int, default None

Maximum number of rows for RecordBatch chunks. Individual chunks may be smaller depending on the chunk layout of individual columns.
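A sketch of how max_chunksize controls the record batch layout of the written file (row counts and file name are illustrative):

```python
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"x": list(range(1000))})

with ipc.RecordBatchFileWriter("chunked.arrow", table.schema) as writer:
    # Break the table into record batches of at most 250 rows each.
    writer.write_table(table, max_chunksize=250)

with ipc.open_file("chunked.arrow") as reader:
    print(reader.num_record_batches)  # 4
```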