Groups (zarr.hierarchy) - zarr 2.18.4 documentation

zarr.hierarchy.group(store=None, overwrite=False, chunk_store=None, cache_attrs=True, synchronizer=None, path=None, *, zarr_version=None, meta_array=None)

Create a group.

Parameters:

store : MutableMapping or string, optional

Store or path to directory in file system.

overwrite : bool, optional

If True, delete any pre-existing data in store at path before creating the group.

chunk_store : MutableMapping, optional

Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.

cache_attrs : bool, optional

If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.

synchronizer : object, optional

Array synchronizer.

path : string, optional

Group path within store.

meta_array : array-like, optional

An array instance to use for determining arrays to create and return to users. Use numpy.empty(()) by default.

Added in version 2.16.1.

Returns:

g : zarr.hierarchy.Group

Examples

Create a group in memory:

>>> import zarr
>>> g = zarr.group()
>>> g
<zarr.hierarchy.Group '/'>

Create a group with a different store:

>>> store = zarr.DirectoryStore('data/example.zarr')
>>> g = zarr.group(store=store, overwrite=True)
>>> g
<zarr.hierarchy.Group '/'>

zarr.hierarchy.open_group(store=None, mode='a', cache_attrs=True, synchronizer=None, path=None, chunk_store=None, storage_options=None, *, zarr_version=None, meta_array=None)

Open a group using file-mode-like semantics.

Parameters:

store : MutableMapping or string, optional

Store or path to directory in file system or name of zip file.

mode : {‘r’, ‘r+’, ‘a’, ‘w’, ‘w-’}, optional

Persistence mode: ‘r’ means read only (must exist); ‘r+’ means read/write (must exist); ‘a’ means read/write (create if doesn’t exist); ‘w’ means create (overwrite if exists); ‘w-’ means create (fail if exists).

cache_attrs : bool, optional

If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.

synchronizer : object, optional

Array synchronizer.

path : string, optional

Group path within store.

chunk_store : MutableMapping or string, optional

Store or path to directory in file system or name of zip file.

storage_options : dict

If using an fsspec URL to create the store, these will be passed to the backend implementation. Ignored otherwise.

meta_array : array-like, optional

An array instance to use for determining arrays to create and return to users. Use numpy.empty(()) by default.

Added in version 2.13.

Returns:

g : zarr.hierarchy.Group

Examples

>>> import zarr
>>> root = zarr.open_group('data/example.zarr', mode='w')
>>> foo = root.create_group('foo')
>>> bar = root.create_group('bar')
>>> root
<zarr.hierarchy.Group '/'>
>>> root2 = zarr.open_group('data/example.zarr', mode='a')
>>> root2
<zarr.hierarchy.Group '/'>
>>> root == root2
True

class zarr.hierarchy.Group(store, path=None, read_only=False, chunk_store=None, cache_attrs=True, synchronizer=None, zarr_version=None, *, meta_array=None)

Instantiate a group from an initialized store.

Parameters:

store : MutableMapping

Group store, already initialized. If the Group is used in a context manager, and the store has a close method, it will be called on exit.

path : string, optional

Group path.

read_only : bool, optional

True if group should be protected against modification.

chunk_store : MutableMapping, optional

Separate storage for chunks. If not provided, store will be used for storage of both chunks and metadata.

cache_attrs : bool, optional

If True (default), user attributes will be cached for attribute read operations. If False, user attributes are reloaded from the store prior to all attribute read operations.

synchronizer : object, optional

Array synchronizer.

meta_array : array-like, optional

An array instance to use for determining arrays to create and return to users. Use numpy.empty(()) by default.

Added in version 2.13.

Attributes:

store

A MutableMapping providing the underlying storage for the group.

path

Storage path.

name

Group name following h5py convention.

read_only

A boolean, True if modification operations are not permitted.

chunk_store

A MutableMapping providing the underlying storage for array chunks.

synchronizer

Object used to synchronize write access to groups and arrays.

attrs

A MutableMapping containing user-defined attributes.

info

Return diagnostic information about the group.

meta_array

An array-like instance to use for determining arrays to create and return to users.

Methods

__len__()

Number of members.

__iter__()

Return an iterator over group member names.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for name in g1:
...     print(name)
bar
baz
foo
quux

__contains__(item)

Test for group membership.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> d1 = g1.create_dataset('bar', shape=100, chunks=10)
>>> 'foo' in g1
True
>>> 'bar' in g1
True
>>> 'baz' in g1
False

__getitem__(item)

Obtain a group member.

Parameters:

item : string

Member name or path.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> d1 = g1.create_dataset('foo/bar/baz', shape=100, chunks=10)
>>> g1['foo']
<zarr.hierarchy.Group '/foo'>
>>> g1['foo/bar']
<zarr.hierarchy.Group '/foo/bar'>
>>> g1['foo/bar/baz']
<zarr.core.Array '/foo/bar/baz' (100,) float64>

__enter__()

Return the Group for use as a context manager.

__exit__(exc_type, exc_val, exc_tb)

Call the close method of the underlying Store.

group_keys()

Return an iterator over member names for groups only.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> sorted(g1.group_keys())
['bar', 'foo']

groups()

Return an iterator over (name, value) pairs for groups only.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for n, v in g1.groups():
...     print(n, type(v))
bar <class 'zarr.hierarchy.Group'>
foo <class 'zarr.hierarchy.Group'>

array_keys(recurse=False)

Return an iterator over member names for arrays only.

Parameters:

recurse : bool, optional

Option to return member names for all arrays, even from groups below the current one. If False, only member names for arrays in the current group will be returned. Default value is False.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> sorted(g1.array_keys())
['baz', 'quux']

arrays(recurse=False)

Return an iterator over (name, value) pairs for arrays only.

Parameters:

recurse : bool, optional

Option to return (name, value) pairs for all arrays, even from groups below the current one. If False, only (name, value) pairs for arrays in the current group will be returned. Default value is False.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10)
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20)
>>> for n, v in g1.arrays():
...     print(n, type(v))
baz <class 'zarr.core.Array'>
quux <class 'zarr.core.Array'>

visit(func)

Run func on each object’s path.

Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(name):
...     print(name)
>>> g1.visit(print_visitor)
bar
bar/baz
bar/quux
foo
>>> g3.visit(print_visitor)
baz
quux

Searches for members matching a name query, such as find and findall, can be implemented using visit. Consider the following tree:

/
├── aaa
│   └── bbb
│       └── ccc
│           └── aaa
├── bar
└── foo

It is created as follows:

>>> root = zarr.group()
>>> foo = root.create_group("foo")
>>> bar = root.create_group("bar")
>>> root.create_group("aaa").create_group("bbb").create_group("ccc").create_group("aaa")
<zarr.hierarchy.Group '/aaa/bbb/ccc/aaa'>

For find, the first path that matches a given pattern (for example “aaa”) is returned. Note that a non-None value is returned in the visit function to stop further iteration.

>>> import re
>>> pattern = re.compile("aaa")
>>> found = None
>>> def find(path):
...     global found
...     if pattern.search(path) is not None:
...         found = path
...         return True
...
>>> root.visit(find)
True
>>> print(found)
aaa

For findall, all the results are gathered into a list:

>>> pattern = re.compile("aaa")
>>> found = []
>>> def findall(path):
...     if pattern.search(path) is not None:
...         found.append(path)
...
>>> root.visit(findall)
>>> print(found)
['aaa', 'aaa/bbb', 'aaa/bbb/ccc', 'aaa/bbb/ccc/aaa']

To match only on the last part of the path, use a greedy regex to filter out the prefix:

>>> prefix_pattern = re.compile(r".*/")
>>> pattern = re.compile("aaa")
>>> found = []
>>> def findall(path):
...     match = prefix_pattern.match(path)
...     if match is None:
...         name = path
...     else:
...         _, end = match.span()
...         name = path[end:]
...     if pattern.search(name) is not None:
...         found.append(path)
...     return None
...
>>> root.visit(findall)
>>> print(found)
['aaa', 'aaa/bbb/ccc/aaa']

visitkeys(func)

An alias for visit().

visitvalues(func)

Run func on each object.

Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(obj):
...     print(obj)
>>> g1.visitvalues(print_visitor)
<zarr.hierarchy.Group '/bar'>
<zarr.hierarchy.Group '/bar/baz'>
<zarr.hierarchy.Group '/bar/quux'>
<zarr.hierarchy.Group '/foo'>
>>> g3.visitvalues(print_visitor)
<zarr.hierarchy.Group '/bar/baz'>
<zarr.hierarchy.Group '/bar/quux'>

visititems(func)

Run func on each object’s path and the object itself.

Note: If func returns None (or doesn’t return), iteration continues. However, if func returns anything else, it ceases and returns that value.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> def print_visitor(name, obj):
...     print((name, obj))
>>> g1.visititems(print_visitor)
('bar', <zarr.hierarchy.Group '/bar'>)
('bar/baz', <zarr.hierarchy.Group '/bar/baz'>)
('bar/quux', <zarr.hierarchy.Group '/bar/quux'>)
('foo', <zarr.hierarchy.Group '/foo'>)
>>> g3.visititems(print_visitor)
('baz', <zarr.hierarchy.Group '/bar/baz'>)
('quux', <zarr.hierarchy.Group '/bar/quux'>)

tree(expand=False, level=None)

Provide a print-able display of the hierarchy.

Parameters:

expand : bool, optional

Only relevant for HTML representation. If True, tree will be fully expanded.

level : int, optional

Maximum depth to descend into hierarchy.

Notes

Please note that this is an experimental feature. The behaviour of this function is still evolving and the default output and/or parameters may change in future versions.

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g3.create_group('baz')
>>> g5 = g3.create_group('quux')
>>> d1 = g5.create_dataset('baz', shape=100, chunks=10)
>>> g1.tree()
/
├── bar
│   ├── baz
│   └── quux
│       └── baz (100,) float64
└── foo
>>> g1.tree(level=2)
/
├── bar
│   ├── baz
│   └── quux
└── foo
>>> g3.tree()
bar
├── baz
└── quux
    └── baz (100,) float64

create_group(name, overwrite=False)

Create a sub-group.

Parameters:

name : string

Group name.

overwrite : bool, optional

If True, overwrite any existing array with the given name.

Returns:

g : zarr.hierarchy.Group

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.create_group('foo')
>>> g3 = g1.create_group('bar')
>>> g4 = g1.create_group('baz/quux')

require_group(name, overwrite=False)

Obtain a sub-group, creating one if it doesn’t exist.

Parameters:

name : string

Group name.

overwrite : bool, optional

Overwrite any existing array with given name if present.

Returns:

g : zarr.hierarchy.Group

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> g2 = g1.require_group('foo')
>>> g3 = g1.require_group('foo')
>>> g2 == g3
True

create_groups(*names, **kwargs)

Convenience method to create multiple groups in a single call.

require_groups(*names)

Convenience method to require multiple groups in a single call.

create_dataset(name, **kwargs)

Create an array.

Arrays are known as “datasets” in HDF5 terminology. For compatibility with h5py, Zarr groups also implement the require_dataset() method.

Parameters:

name : string

Array name.

data : array-like, optional

Initial data.

shape : int or tuple of ints

Array shape.

chunks : int or tuple of ints, optional

Chunk shape. If not provided, will be guessed from shape and dtype.

dtype : string or dtype, optional

NumPy dtype.

compressor : Codec, optional

Primary compressor.

fill_value : object

Default value to use for uninitialized portions of the array.

order : {‘C’, ‘F’}, optional

Memory layout to be used within each chunk.

synchronizer : zarr.sync.ArraySynchronizer, optional

Array synchronizer.

filters : sequence of Codecs, optional

Sequence of filters to use to encode chunk data prior to compression.

overwrite : bool, optional

If True, replace any existing array or group with the given name.

cache_metadata : bool, optional

If True, array configuration metadata will be cached for the lifetime of the object. If False, array metadata will be reloaded prior to all data access and modification operations (may incur overhead depending on storage and data access pattern).

dimension_separator : {‘.’, ‘/’}, optional

Separator placed between the dimensions of a chunk.

Returns:

a : zarr.core.Array

Examples

>>> import zarr
>>> g1 = zarr.group()
>>> d1 = g1.create_dataset('foo', shape=(10000, 10000),
...                        chunks=(1000, 1000))
>>> d1
<zarr.core.Array '/foo' (10000, 10000) float64>
>>> d2 = g1.create_dataset('bar/baz/qux', shape=(100, 100, 100),
...                        chunks=(100, 10, 10))
>>> d2
<zarr.core.Array '/bar/baz/qux' (100, 100, 100) float64>

require_dataset(name, shape, dtype=None, exact=False, **kwargs)

Obtain an array, creating it if it doesn’t exist.

Arrays are known as “datasets” in HDF5 terminology. For compatibility with h5py, Zarr groups also implement the create_dataset() method.

Other kwargs are as per zarr.hierarchy.Group.create_dataset().

Parameters:

name : string

Array name.

shape : int or tuple of ints

Array shape.

dtype : string or dtype, optional

NumPy dtype.

exact : bool, optional

If True, require dtype to match exactly. If False, require that dtype can be cast from the array dtype.

create(name, **kwargs)

Create an array. Keyword arguments as per zarr.creation.create().

empty(name, **kwargs)

Create an array. Keyword arguments as per zarr.creation.empty().

zeros(name, **kwargs)

Create an array. Keyword arguments as per zarr.creation.zeros().

ones(name, **kwargs)

Create an array. Keyword arguments as per zarr.creation.ones().

full(name, fill_value, **kwargs)

Create an array. Keyword arguments as per zarr.creation.full().

array(name, data, **kwargs)

Create an array. Keyword arguments as per zarr.creation.array().

empty_like(name, data, **kwargs)

Create an array. Keyword arguments as per zarr.creation.empty_like().

zeros_like(name, data, **kwargs)

Create an array. Keyword arguments as per zarr.creation.zeros_like().

ones_like(name, data, **kwargs)

Create an array. Keyword arguments as per zarr.creation.ones_like().

full_like(name, data, **kwargs)

Create an array. Keyword arguments as per zarr.creation.full_like().

move(source, dest)

Move contents from one path to another relative to the Group.

Parameters:

source : string

Name or path to a Zarr object to move.

dest : string

New name or path of the Zarr object.