pyarrow.PythonFile - Apache Arrow v20.0.0
class pyarrow.PythonFile
Bases: NativeFile
A stream backed by a Python file object.
This class allows using Python file objects with arbitrary Arrow functions, including functions written in languages other than Python.
As a downside, there is a non-zero redirection cost in translating Arrow stream calls to Python method calls. Furthermore, Python’s Global Interpreter Lock may limit parallelism in some situations.
Examples
>>> import io
>>> import pyarrow as pa
>>> pa.PythonFile(io.BytesIO())
<pyarrow.PythonFile closed=False own_file=False is_seekable=False is_writable=True is_readable=False>
Create a stream for writing:
>>> buf = io.BytesIO()
>>> f = pa.PythonFile(buf, mode='w')
>>> f.writable()
True
>>> f.write(b'PythonFile')
10
>>> buf.getvalue()
b'PythonFile'
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=False is_writable=True is_readable=False>
Create a stream for reading:
>>> buf = io.BytesIO(b'PythonFile')
>>> f = pa.PythonFile(buf, mode='r')
>>> f.mode
'rb'
>>> f.read()
b'PythonFile'
>>> f
<pyarrow.PythonFile closed=False own_file=False is_seekable=True is_writable=False is_readable=True>
>>> f.close()
>>> f
<pyarrow.PythonFile closed=True own_file=False is_seekable=True is_writable=False is_readable=True>
__init__(*args, **kwargs)
close(self)
closed
download(self, stream_or_path, buffer_size=None)
Read this file completely to a local path or destination stream.
This method first seeks to the beginning of the file.
Parameters:
stream_or_path : str or file-like object
    If a string, a local file path to write to; otherwise, should be a writable stream.
buffer_size : int, optional
    The buffer size to use for data transfers.
fileno(self)
NOT IMPLEMENTED
flush(self)
Flush the stream, if applicable.
An error is raised if the stream is not writable.
get_stream(self, file_offset, nbytes)
Return an input stream that reads a file segment independent of the state of the file.
Allows reading portions of a random access file as an input stream without interfering with each other.
Parameters:
file_offset : int
nbytes : int
Returns:
stream : NativeFile
isatty(self)
metadata(self)
Return file metadata.
mode
The file mode. Currently instances of NativeFile may support:
- rb: binary read
- wb: binary write
- rb+: binary read and write
- ab: binary append
read(self, nbytes=None)
Read and return up to nbytes bytes.
If nbytes is None, then the entire remaining file contents are read.
Parameters:
nbytes : int, default None
Returns:
data : bytes
read1(self, nbytes=None)
Read and return up to nbytes bytes.
Unlike read(), if nbytes is None then a chunk is read, not the entire file.
Parameters:
nbytes : int, default None
    The maximum number of bytes to read.
Returns:
data : bytes
read_at(self, nbytes, offset)
Read the indicated number of bytes at offset from the file.
Parameters:
nbytes : int
offset : int
Returns:
data : bytes
read_buffer(self, nbytes=None)
Read from buffer.
Parameters:
nbytes : int, optional
    Maximum number of bytes read.
readable(self)
readall(self)
readinto(self, b)
Read into the supplied buffer.
Parameters:
b : buffer-like object
    A writable buffer object (such as a bytearray).
Returns:
written : int
    Number of bytes written.
readline(self, size=None)
Read and return a line of bytes from the file.
If size is specified, read at most size bytes.
Parameters:
size : int
    Maximum number of bytes read.
readlines(self, hint=None)
Read lines of the file.
Parameters:
hint : int
    Maximum number of bytes read until we stop.
seek(self, int64_t position, int whence=0)
Change current file stream position.
Parameters:
position : int
    Byte offset, interpreted relative to value of whence argument.
whence : int, default 0
    Point of reference for seek offset.
Returns:
The new absolute stream position.
Notes
Values of whence:
- 0 – start of stream (the default); offset should be zero or positive
- 1 – current stream position; offset may be negative
- 2 – end of stream; offset is usually negative
seekable(self)
size(self)
Return file size.
tell(self)
Return current stream position.
truncate(self, pos=None)
Parameters:
pos : int, optional
upload(self, stream, buffer_size=None)
Write from a source stream to this file.
Parameters:
stream : file-like object
    Source stream to pipe to this file.
buffer_size : int, optional
    The buffer size to use for data transfers.
writable(self)
write(self, data)
Write data to the file.
Parameters:
data : bytes-like object or exporter of buffer protocol
Returns:
nbytes : int
    Number of bytes written.
writelines(self, lines)
Write lines to the file.
Parameters:
lines : iterable
    Iterable of bytes-like objects or exporters of buffer protocol.