pyarrow.fs.FileSystem — Apache Arrow v20.0.0
class pyarrow.fs.FileSystem
Bases: _Weakrefable
Abstract file system API.
__init__(*args, **kwargs)
copy_file(self, src, dest)
Copy a file.
If the destination exists and is a directory, an error is returned. Otherwise, it is replaced.
Parameters:
src : str
The path of the file to be copied from.
dest : str
The destination path where the file is copied to.
Examples
>>> local.copy_file(path,
...                 local_path + '/pyarrow-fs-example_copy.dat')
Inspect the file info:
>>> local.get_file_info(local_path + '/pyarrow-fs-example_copy.dat')
<FileInfo for '/.../pyarrow-fs-example_copy.dat': type=FileType.File, size=4>
>>> local.get_file_info(path)
<FileInfo for '/.../pyarrow-fs-example.dat': type=FileType.File, size=4>
create_dir(self, path, *, bool recursive=True)
Create a directory and subdirectories.
This function succeeds if the directory already exists.
Parameters:
path : str
The path of the new directory.
recursive : bool, default True
Create nested directories as well.
delete_dir(self, path)
Delete a directory and its contents, recursively.
Parameters:
path : str
The path of the directory to be deleted.
delete_dir_contents(self, path, *, bool accept_root_dir=False, bool missing_dir_ok=False)
Delete a directory’s contents, recursively.
Like delete_dir, but doesn’t delete the directory itself.
Parameters:
path : str
The path of the directory whose contents are to be deleted.
accept_root_dir : bool, default False
Allow deleting the root directory’s contents (if path is empty or “/”).
missing_dir_ok : bool, default False
If False, an error is raised if path does not exist.
delete_file(self, path)
Delete a file.
Parameters:
path : str
The path of the file to be deleted.
equals(self, FileSystem other)
Parameters:
other : pyarrow.fs.FileSystem
Returns:
bool
static from_uri(uri)
Create a new FileSystem from URI or Path.
Recognized URI schemes are “file”, “mock”, “s3”, “gs”, “gcs”, “hdfs” and “viewfs”. In addition, the argument can be a pathlib.Path object, or a string describing an absolute local path.
Parameters:
uri : str
URI-based path, for example: file:///some/local/path.
Returns:
tuple of (FileSystem, str path)
A (filesystem, path) tuple, where path is the abstract path inside the FileSystem instance.
Examples
Create a new FileSystem subclass from a URI:
>>> uri = 'file:///{}/pyarrow-fs-example.dat'.format(local_path)
>>> local_new, path_new = fs.FileSystem.from_uri(uri)
>>> local_new
<pyarrow._fs.LocalFileSystem object at ...>
>>> path_new
'/.../pyarrow-fs-example.dat'
Or from an S3 bucket:
>>> fs.FileSystem.from_uri("s3://usgs-landsat/collection02/")
(<pyarrow._s3fs.S3FileSystem object at ...>, 'usgs-landsat/collection02')
get_file_info(self, paths_or_selector)
Get info for the given files.
Any symlink is automatically dereferenced, recursively. A non-existing or unreachable file returns a FileInfo object with a FileType of NotFound. An exception indicates a truly exceptional condition (low-level I/O error, etc.).
Parameters:
paths_or_selector : FileSelector, path-like or list of path-likes
Either a selector object, a path-like object or a list of path-like objects. The selector’s base directory will not be part of the results, even if it exists. If it doesn’t exist, an error is raised unless the selector was created with allow_not_found=True.
Returns:
FileInfo or list of FileInfo
A single FileInfo object is returned for a single path; otherwise a list of FileInfo objects is returned.
Examples
>>> local
<pyarrow._fs.LocalFileSystem object at ...>
>>> local.get_file_info("/{}/pyarrow-fs-example.dat".format(local_path))
<FileInfo for '/.../pyarrow-fs-example.dat': type=FileType.File, size=4>
move(self, src, dest)
Move / rename a file or directory.
If the destination exists:
- if it is a non-empty directory, an error is returned
- otherwise, if it has the same type as the source, it is replaced
- otherwise, behavior is unspecified (implementation-dependent).
Parameters:
src : str
The path of the file or the directory to be moved.
dest : str
The destination path where the file or directory is moved to.
Examples
Create a new folder with a file:
>>> local.create_dir('/tmp/other_dir')
>>> local.copy_file(path,'/tmp/move_example.dat')
Move the file:
>>> local.move('/tmp/move_example.dat',
...            '/tmp/other_dir/move_example_2.dat')
Inspect the file info:
>>> local.get_file_info('/tmp/other_dir/move_example_2.dat')
<FileInfo for '/tmp/other_dir/move_example_2.dat': type=FileType.File, size=4>
>>> local.get_file_info('/tmp/move_example.dat')
<FileInfo for '/tmp/move_example.dat': type=FileType.NotFound>
Delete the folder:
>>> local.delete_dir('/tmp/other_dir')
normalize_path(self, path)
Normalize filesystem path.
Parameters:
path : str
The path to normalize
Returns:
normalized_path : str
The normalized path
open_append_stream(self, path, compression='detect', buffer_size=None, metadata=None)
Open an output stream for appending.
If the target doesn’t exist, a new empty file is created.
Note
Some filesystem implementations do not support efficient appending to an existing file, in which case this method will raise NotImplementedError. Consider writing to multiple files (using e.g. the dataset layer) instead.
Parameters:
path : str
The source to open for writing.
compression : str, optional, default ‘detect’
The compression algorithm to use for on-the-fly compression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).
buffer_size : int, optional, default None
If None or 0, no buffering will happen. Otherwise the size of the temporary write buffer.
metadata : dict, optional, default None
If not None, a mapping of string keys to string values. Some filesystems support storing metadata alongside the file (such as “Content-Type”). Unsupported metadata keys will be ignored.
Returns:
stream : NativeFile
Examples
Append new data to a non-empty file through a FileSystem subclass:
>>> with local.open_append_stream(path) as f:
...     f.write(b'+newly added')
12
Print out the content of the file:
>>> with local.open_input_file(path) as f:
...     print(f.readall())
b'data+newly added'
open_input_file(self, path)
Open an input file for random access reading.
Parameters:
path : str
The source to open for reading.
Returns:
stream : NativeFile
Examples
Print the data from the file with open_input_file():
>>> with local.open_input_file(path) as f:
...     print(f.readall())
b'data'
open_input_stream(self, path, compression='detect', buffer_size=None)
Open an input stream for sequential reading.
Parameters:
path : str
The source to open for reading.
compression : str, optional, default ‘detect’
The compression algorithm to use for on-the-fly decompression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).
buffer_size : int, optional, default None
If None or 0, no buffering will happen. Otherwise the size of the temporary read buffer.
Returns:
stream : NativeFile
Examples
Print the data from the file with open_input_stream():
>>> with local.open_input_stream(path) as f:
...     print(f.readall())
b'data'
open_output_stream(self, path, compression='detect', buffer_size=None, metadata=None)
Open an output stream for sequential writing.
If the target already exists, existing data is truncated.
Parameters:
path : str
The source to open for writing.
compression : str, optional, default ‘detect’
The compression algorithm to use for on-the-fly compression. If “detect” and source is a file path, then compression will be chosen based on the file extension. If None, no compression will be applied. Otherwise, a well-known algorithm name must be supplied (e.g. “gzip”).
buffer_size : int, optional, default None
If None or 0, no buffering will happen. Otherwise the size of the temporary write buffer.
metadata : dict, optional, default None
If not None, a mapping of string keys to string values. Some filesystems support storing metadata alongside the file (such as “Content-Type”). Unsupported metadata keys will be ignored.
Returns:
stream : NativeFile
Examples
>>> local = fs.LocalFileSystem()
>>> with local.open_output_stream(path) as stream:
...     stream.write(b'data')
4
type_name
The filesystem’s type name.