xarray.Dataset

A multi-dimensional, in-memory array database.

A dataset resembles an in-memory representation of a NetCDF file. It consists of variables, coordinates, and attributes, which together form a self-describing dataset.

Dataset implements the mapping interface with keys given by variable names and values given by DataArray objects for each variable name.
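The mapping behaviour can be sketched with a small dataset. The variable names and values below are hypothetical and chosen only to keep the example short.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data: 2 locations x 3 time steps.
ds = xr.Dataset(
    data_vars=dict(
        temperature=(["loc", "time"], np.arange(6.0).reshape(2, 3)),
        precipitation=(["loc", "time"], np.zeros((2, 3))),
    ),
    coords=dict(time=[0, 1, 2]),
)

# Iterating over a Dataset yields the names of its data variables.
names = list(ds)

# Key lookup returns a DataArray; membership tests work as with a dict.
temperature = ds["temperature"]
has_precipitation = "precipitation" in ds

print(names, type(temperature).__name__, has_precipitation)
```

Because each value is a DataArray, it carries its own dimensions and coordinates along with the data.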

By default, pandas indexes are created for one-dimensional variables whose name equals their dimension (i.e., dimension coordinates), so those variables can be readily used as coordinates for label-based indexing. When a Coordinates object is passed to coords, any existing index(es) built from those coordinates will be added to the Dataset.
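As a minimal sketch of the default indexing rule, the toy dataset below (hypothetical values) has one coordinate named after its dimension and one that is not. Only the first should receive an index.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data; the values are illustrative only.
ds = xr.Dataset(
    data_vars=dict(temperature=(["loc", "time"], np.arange(8.0).reshape(2, 4))),
    coords=dict(
        # Named "lon" but lives on dimension "loc": not a dimension coordinate.
        lon=("loc", [-99.83, -99.32]),
        # Name equals its dimension: a dimension coordinate, indexed by default.
        time=pd.date_range("2014-09-06", periods=4),
    ),
)

# Names of the coordinates that are backed by a pandas index.
indexed_names = list(ds.indexes)

# The index on "time" enables label-based selection with sel.
first_day = ds.sel(time="2014-09-06")

print(indexed_names)
print(first_day["temperature"].values)
```

Selecting along "loc" by longitude would not work here without first setting an index on lon, since lon is not a dimension coordinate.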

To load data from a file or file-like object, use the open_dataset function.

In this example dataset, we will represent measurements of temperature and precipitation that were made at several locations, with several instruments, and at several times.

Here, we initialize the dataset with multiple dimensions. We use the string "loc" to represent the location dimension of the data, the string "instrument" to represent the instrument manufacturer dimension, and the string "time" for the time dimension.

>>> ds = xr.Dataset(
...     data_vars=dict(
...         temperature=(["loc", "instrument", "time"], temperature),
...         precipitation=(["loc", "instrument", "time"], precipitation),
...     ),
...     coords=dict(
...         lon=("loc", lon),
...         lat=("loc", lat),
...         instrument=instruments,
...         time=time,
...         reference_time=reference_time,
...     ),
...     attrs=dict(description="Weather related data."),
... )
>>> ds
<xarray.Dataset> Size: 552B
Dimensions:  (loc: 2, instrument: 3, time: 4)
Coordinates:
    lon      (loc) float64 16B -99.83 -99.32
    lat      (loc) float64 16B 42.25 42.21
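The constructor call above uses input arrays whose definitions do not appear here. The following self-contained sketch fills them in under assumptions: the shapes follow the printed dimensions (loc: 2, instrument: 3, time: 4) and the lon and lat values follow the printed output, while the random data, instrument labels, and dates are made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumed inputs: the measurement values are random; only the shapes matter.
rng = np.random.default_rng(0)
temperature = 15 + 8 * rng.standard_normal((2, 3, 4))  # (loc, instrument, time)
precipitation = 10 * rng.random((2, 3, 4))
lon = [-99.83, -99.32]
lat = [42.25, 42.21]
instruments = ["manufac1", "manufac2", "manufac3"]  # hypothetical labels
time = pd.date_range("2014-09-06", periods=4)  # assumed dates
reference_time = pd.Timestamp("2014-09-05")  # assumed scalar coordinate

ds = xr.Dataset(
    data_vars=dict(
        temperature=(["loc", "instrument", "time"], temperature),
        precipitation=(["loc", "instrument", "time"], precipitation),
    ),
    coords=dict(
        lon=("loc", lon),
        lat=("loc", lat),
        instrument=instruments,
        time=time,
        reference_time=reference_time,
    ),
    attrs=dict(description="Weather related data."),
)

# Dimension sizes are inferred from the shapes of the inputs.
print(dict(ds.sizes))
```

Each data variable is given as a (dimensions, data) tuple, so the dimension names tie every axis of the array to the coordinates that share the same name or dimension.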

Find out where the coldest temperature was and what values the other variables had:
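One way to do this, sketched here on a small hypothetical dataset rather than the one above, is to combine DataArray.argmin with Dataset.isel. Passing an ellipsis to argmin asks for the position of the minimum over all dimensions, returned as a dict with one index per dimension.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data with a known minimum at loc=1, time=2.
temperature = np.array([[20.0, 18.5, 19.0], [17.0, 16.0, 11.5]])
precipitation = np.array([[0.0, 1.2, 0.4], [3.1, 0.0, 5.6]])

ds = xr.Dataset(
    data_vars=dict(
        temperature=(["loc", "time"], temperature),
        precipitation=(["loc", "time"], precipitation),
    ),
    coords=dict(lon=("loc", [-99.83, -99.32]), time=[0, 1, 2]),
)

# argmin(...) reduces over every dimension and returns a dict of indices.
coldest_index = ds["temperature"].argmin(...)

# isel applies those positional indices to every variable and coordinate.
coldest = ds.isel(coldest_index)

print(float(coldest["temperature"]), float(coldest["precipitation"]), float(coldest["lon"]))
```

The result is itself a Dataset, so the precipitation and coordinate values at the coldest point are available under the same names.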