pyarrow.PythonFile — Apache Arrow v20.0.0

class pyarrow.PythonFile

Bases: NativeFile

A stream backed by a Python file object.

This class allows Python file objects to be used with arbitrary Arrow functions, including functions written in languages other than Python.

As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.

Examples

>>> import io
>>> import pyarrow as pa
>>> pa.PythonFile(io.BytesIO())
<pyarrow.PythonFile closed=False own_file=False is_seekable=False is_writable=True is_readable=False>

Create a stream for writing:

>>> buf = io.BytesIO()
>>> f = pa.PythonFile(buf, mode='w')
>>> f.writable()
True
>>> f.write(b'PythonFile')
10
>>> buf.getvalue()
b'PythonFile'
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=False is_writable=True is_readable=False>

Create a stream for reading:

>>> buf = io.BytesIO(b'PythonFile')
>>> f = pa.PythonFile(buf, mode='r')
>>> f.mode
'rb'
>>> f.read()
b'PythonFile'
>>> f
<pyarrow.PythonFile closed=False own_file=False is_seekable=True is_writable=False is_readable=True>
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=True is_writable=False is_readable=True>

__init__(*args, **kwargs)

close(self)

closed

download(self, stream_or_path, buffer_size=None)

Read this file completely to a local path or destination stream.

This method first seeks to the beginning of the file.

Parameters:

stream_or_path : str or file-like object

If a string, a local file path to write to; otherwise, should be a writable stream.

buffer_size : int, optional

The buffer size to use for data transfers.

fileno(self)

NOT IMPLEMENTED

flush(self)

Flush the stream, if applicable.

An error is raised if the stream is not writable.

get_stream(self, file_offset, nbytes)

Return an input stream that reads a file segment independent of the state of the file.

Allows reading portions of a random access file as an input stream without interfering with each other.

Parameters:

file_offset : int

nbytes : int

Returns:

stream : NativeFile

isatty(self)

metadata(self)

Return file metadata.

mode

The file mode. Currently instances of NativeFile may support:

read(self, nbytes=None)

Read and return up to nbytes bytes.

If nbytes is None, then the entire remaining file contents are read.

Parameters:

nbytes : int, default None

Returns:

data : bytes

read1(self, nbytes=None)

Read and return up to nbytes bytes.

Unlike read(), if nbytes is None then a chunk is read, not the entire file.

Parameters:

nbytes : int, default None

The maximum number of bytes to read.

Returns:

data : bytes

read_at(self, nbytes, offset)

Read the indicated number of bytes from the file, starting at the given offset.

Parameters:

nbytes : int

offset : int

Returns:

data : bytes

read_buffer(self, nbytes=None)

Read from buffer.

Parameters:

nbytes : int, optional

Maximum number of bytes to read.

readable(self)

readall(self)

readinto(self, b)

Read into the supplied buffer.

Parameters:

b : buffer-like object

A writable buffer object (such as a bytearray).

Returns:

written : int

Number of bytes written into the buffer.

readline(self, size=None)

Read and return a line of bytes from the file.

If size is specified, read at most size bytes.

Parameters:

size : int

Maximum number of bytes to read.

readlines(self, hint=None)

Read lines of the file.

Parameters:

hint : int

Maximum number of bytes to read before stopping.

seek(self, int64_t position, int whence=0)

Change the current file stream position.

Parameters:

position : int

Byte offset, interpreted relative to the value of the whence argument.

whence : int, default 0

Point of reference for the seek offset.

Returns:

int

The new absolute stream position.

Notes

Values of whence:

* 0 – start of stream (the default); offset should be zero or positive
* 1 – current stream position; offset may be negative
* 2 – end of stream; offset is usually negative

seekable(self)

size(self)

Return the file size.

tell(self)

Return the current stream position.

truncate(self, pos=None)

Parameters:

pos : int, optional

upload(self, stream, buffer_size=None)

Write from a source stream to this file.

Parameters:

stream : file-like object

Source stream to pipe to this file.

buffer_size : int, optional

The buffer size to use for data transfers.

writable(self)

write(self, data)

Write data to the file.

Parameters:

data : bytes-like object or exporter of buffer protocol

Returns:

int

nbytes: number of bytes written

writelines(self, lines)

Write lines to the file.

Parameters:

lines : iterable

Iterable of bytes-like objects or exporters of buffer protocol