pandas.read_sas (pandas 2.2.3 documentation)
pandas.read_sas(filepath_or_buffer, *, format=None, index=None, encoding=None, chunksize=None, iterator=False, compression='infer')
Read SAS files stored as either XPORT or SAS7BDAT format files.
Parameters:
filepath_or_buffer : str, path object, or file-like object
String, path object (implementing os.PathLike[str]), or file-like object implementing a binary read() function. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.sas7bdat.
format : str {‘xport’, ‘sas7bdat’} or None
If None, file format is inferred from file extension. If ‘xport’ or ‘sas7bdat’, uses the corresponding format.
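Because inference relies on the file extension, an in-memory buffer gives pandas nothing to infer from. The sketch below assumes that read_sas raises ValueError in that case, which is the behavior of the pandas releases this was checked against. The helper names read_sas_bytes and format_is_required_for_buffers are hypothetical and not part of pandas.

```python
import io

import pandas as pd


def read_sas_bytes(raw: bytes, fmt: str) -> pd.DataFrame:
    """Parse SAS content held in memory; fmt must be 'xport' or 'sas7bdat'."""
    # A BytesIO object has no file name, so the format is passed explicitly.
    return pd.read_sas(io.BytesIO(raw), format=fmt)


def format_is_required_for_buffers() -> bool:
    """Return True if read_sas refuses a buffer when format is left as None."""
    try:
        pd.read_sas(io.BytesIO(b""))
    except ValueError:
        # Assumed behavior: no extension is available, so inference fails.
        return True
    return False
```

Passing format explicitly is also a reasonable default for paths whose names do not end in a recognizable SAS extension.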
index : identifier of index column, defaults to None
Identifier of column that should be used as index of the DataFrame.
encoding : str, default is None
Encoding for text data. If None, text data are stored as raw bytes.
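When encoding is left as None, text columns arrive as bytes objects. A minimal sketch of decoding them after the fact is shown below. The helper decode_bytes_columns, the sample frame, and the choice of "latin-1" are illustrative assumptions, not part of pandas or of any particular SAS file.

```python
import pandas as pd


def decode_bytes_columns(df: pd.DataFrame, encoding: str = "latin-1") -> pd.DataFrame:
    """Return a copy of df with bytes values in object columns decoded to str."""
    out = df.copy()
    for col in out.columns:
        # Only object columns can hold bytes; numeric columns are left alone.
        if pd.api.types.is_object_dtype(out[col]):
            out[col] = out[col].map(
                lambda v: v.decode(encoding) if isinstance(v, bytes) else v
            )
    return out


# Stand-in for a frame returned by read_sas(..., encoding=None).
raw = pd.DataFrame({"name": [b"alice", b"bob"], "score": [1.5, 2.0]})
decoded = decode_bytes_columns(raw)
```

Passing encoding directly to read_sas avoids this extra step when the file's encoding is known in advance.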
chunksize : int
Read the file chunksize lines at a time; returns an iterator.
iterator : bool, defaults to False
If True, returns an iterator for reading the file incrementally.
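Incremental reading can be sketched as follows: the reader returned when chunksize is set is iterated inside a with block, and each chunk is an ordinary DataFrame. The function names count_rows and count_rows_in_sas_file, and the default chunk size, are hypothetical choices for illustration.

```python
from typing import Iterable

import pandas as pd


def count_rows(chunks: Iterable[pd.DataFrame]) -> int:
    """Sum the row counts of a sequence of DataFrame chunks."""
    return sum(len(chunk) for chunk in chunks)


def count_rows_in_sas_file(path: str, chunksize: int = 10_000) -> int:
    """Count rows in a SAS file without loading it all into memory."""
    # With chunksize set, read_sas returns a reader rather than a DataFrame.
    # The reader is used as a context manager so the file handle is closed.
    with pd.read_sas(path, chunksize=chunksize) as reader:
        return count_rows(reader)
```

Keeping the per-chunk logic in a function that accepts any iterable of DataFrames makes it possible to exercise that logic without a SAS file on hand.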
compression : str or dict, default ‘infer’
For on-the-fly decompression of on-disk data. If ‘infer’ and ‘filepath_or_buffer’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). If using ‘zip’ or ‘tar’, the ZIP file must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdDecompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for Zstandard decompression using a custom compression dictionary: compression={'method': 'zstd', 'dict_data': my_compression_dict}.
Added in version 1.5.0: Added support for .tar files.
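The dict form of compression can be sketched as below for a gzip-compressed XPORT file. The helper read_gzipped_xport is hypothetical, and the path is whatever the caller supplies; no particular file is assumed to exist.

```python
import pandas as pd


def read_gzipped_xport(path: str) -> pd.DataFrame:
    """Read a gzip-compressed XPORT file, stating format and compression explicitly."""
    # Spelling out both arguments avoids relying on extension-based inference,
    # which helps when the file name does not end in '.xpt.gz'.
    return pd.read_sas(path, format="xport", compression={"method": "gzip"})
```

With a conventionally named file, the default compression='infer' should give the same result without the extra argument.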
Returns:
DataFrame if iterator=False and chunksize=None, else SAS7BDATReader or XportReader
Examples
import pandas as pd
df = pd.read_sas("sas_data.sas7bdat")