anndata.AnnData.write
AnnData.write(filename=None, compression=None, compression_opts=None, as_dense=(), *, convert_strings_to_categoricals=True)
Write .h5ad-formatted hdf5 file.
Note
Setting compression to 'gzip' can save disk space but will slow down writing and subsequent reading. Prior to v0.6.16, this was the default for the compression parameter.
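For example, a minimal sketch of writing with gzip compression (assuming an existing AnnData object named adata; the filename is illustrative):

    # Trade slower writing and reading for smaller files on disk.
    adata.write("adata.h5ad", compression="gzip")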
Generally, if you have sparse data that are stored as a dense matrix, you can dramatically improve performance and reduce disk space by converting to a csr_matrix:

    from scipy.sparse import csr_matrix
    adata.X = csr_matrix(adata.X)
Parameters:
filename PathLike[str] | str | None (default: None)
Filename of data file. Defaults to backing file.
convert_strings_to_categoricals bool (default: True)
Convert string columns to categorical.
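For example, a minimal sketch that keeps string columns as plain strings on write (adata and the filename are assumed and illustrative):

    # Disable the automatic string-to-categorical conversion.
    adata.write("adata.h5ad", convert_strings_to_categoricals=False)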
compression Optional[Literal['gzip', 'lzf']] (default: None)
For [lzf, gzip], see the h5py Filter pipeline.
Alternative compression filters such as zstd can be passed from the hdf5plugin library. Experimental.
Usage example:

    import hdf5plugin
    adata.write_h5ad(
        filename,
        compression=hdf5plugin.FILTERS["zstd"],
    )
Note
Datasets written with hdf5plugin-provided compressors cannot be opened without first loading the hdf5plugin library using import hdf5plugin. When using alternative compression filters such as zstd, consider writing to zarr format instead of h5ad, as the zarr library provides a more transparent compression pipeline.
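A minimal sketch of the zarr alternative mentioned above (assuming an existing AnnData object named adata; the store path is illustrative):

    import anndata as ad

    # Write to a zarr store; zarr manages its own compression pipeline,
    # so reading the data back does not require importing hdf5plugin.
    adata.write_zarr("adata.zarr")
    adata_back = ad.read_zarr("adata.zarr")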
compression_opts int | Any (default: None)
For [lzf, gzip], see the h5py Filter pipeline.
Alternative compression filters such as zstd can be configured using helpers from the hdf5plugin library. Experimental.
Usage example (setting zstd compression level to 5):

    import hdf5plugin
    adata.write_h5ad(
        filename,
        compression=hdf5plugin.FILTERS["zstd"],
        compression_opts=hdf5plugin.Zstd(clevel=5).filter_options,
    )
as_dense Sequence[str] (default: ())
Sparse arrays in AnnData object to write as dense. Currently only supports X and raw/X.
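For example, a minimal sketch that writes a sparse X as a dense dataset (adata and the filename are assumed and illustrative):

    # Store the sparse X matrix as a dense dataset on disk.
    adata.write("adata.h5ad", as_dense=["X"])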