xarray.Dataset.reindex

Dataset.reindex(indexers=None, method=None, tolerance=None, copy=True, fill_value=<NA>, **indexers_kwargs)

Conform this object onto a new set of indexes, filling in missing values with fill_value. The default fill value is NaN.

Parameters:

Returns:

reindexed (Dataset) – Another dataset, with this dataset’s data but replaced coordinates.

Examples

Create a dataset with some fictional data.

>>> x = xr.Dataset(
...     {
...         "temperature": ("station", 20 * np.random.rand(4)),
...         "pressure": ("station", 500 * np.random.rand(4)),
...     },
...     coords={"station": ["boston", "nyc", "seattle", "denver"]},
... )
>>> x
<xarray.Dataset> Size: 176B
Dimensions:  (station: 4)
Coordinates:

Create a new index and reindex the dataset. By default values in the new index that do not have corresponding records in the dataset are assigned NaN.

>>> new_index = ["boston", "austin", "seattle", "lincoln"]
>>> x.reindex({"station": new_index})
<xarray.Dataset> Size: 176B
Dimensions:  (station: 4)
Coordinates:
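The default behaviour can be checked on a small dataset with fixed values. This is a minimal sketch; the dataset `ds` and its numbers are invented for illustration and are not part of the original example.

```python
import xarray as xr

# Fixed (non-random) values so the result is easy to inspect.
ds = xr.Dataset(
    {"temperature": ("station", [10.0, 11.0, 12.0, 13.0])},
    coords={"station": ["boston", "nyc", "seattle", "denver"]},
)

new_index = ["boston", "austin", "seattle", "lincoln"]
reindexed = ds.reindex({"station": new_index})

# Labels present in the original keep their values; new labels become NaN.
missing = reindexed["temperature"].isnull().values.tolist()
print(dict(zip(new_index, missing)))
```

Stations that are absent from the new index ("nyc" and "denver" here) are dropped from the result.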

We can fill in the missing values by passing a value to the keyword fill_value.

>>> x.reindex({"station": new_index}, fill_value=0)
<xarray.Dataset> Size: 176B
Dimensions:  (station: 4)
Coordinates:

We can also use different fill values for each variable.

>>> x.reindex(
...     {"station": new_index}, fill_value={"temperature": 0, "pressure": 100}
... )
<xarray.Dataset> Size: 176B
Dimensions:  (station: 4)
Coordinates:
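To make the effect of per-variable fill values visible, the following sketch uses fixed values instead of random ones. The dataset `ds` and its numbers are assumptions made for illustration only.

```python
import xarray as xr

ds = xr.Dataset(
    {
        "temperature": ("station", [10.0, 11.0, 12.0, 13.0]),
        "pressure": ("station", [400.0, 410.0, 420.0, 430.0]),
    },
    coords={"station": ["boston", "nyc", "seattle", "denver"]},
)

# "austin" and "lincoln" are not in the original index, so they receive
# the fill value chosen for each variable.
filled = ds.reindex(
    {"station": ["boston", "austin", "seattle", "lincoln"]},
    fill_value={"temperature": 0, "pressure": 100},
)

temperature = filled["temperature"].values.tolist()
pressure = filled["pressure"].values.tolist()
print(temperature)
print(pressure)
```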

Because the index is not monotonically increasing or decreasing, we cannot use the method keyword to fill the NaN values.

>>> x.reindex({"station": new_index}, method="nearest")
Traceback (most recent call last):
...
    raise ValueError('index must be monotonic increasing or decreasing')
ValueError: index must be monotonic increasing or decreasing

To further illustrate the filling functionality in reindex, we will create a dataset with a monotonically increasing index (for example, a sequence of dates).

>>> x2 = xr.Dataset(
...     {
...         "temperature": (
...             "time",
...             [15.57, 12.77, np.nan, 0.3081, 16.59, 15.12],
...         ),
...         "pressure": ("time", 500 * np.random.rand(6)),
...     },
...     coords={"time": pd.date_range("01/01/2019", periods=6, freq="D")},
... )
>>> x2
<xarray.Dataset> Size: 144B
Dimensions:  (time: 6)
Coordinates:

Suppose we decide to expand the dataset to cover a wider date range.

>>> time_index2 = pd.date_range("12/29/2018", periods=10, freq="D")
>>> x2.reindex({"time": time_index2})
<xarray.Dataset> Size: 240B
Dimensions:  (time: 10)
Coordinates:

The index entries that did not have a value in the original dataset (for example, 2018-12-29) are by default filled with NaN. If desired, we can fill in the missing values using one of several options.

For example, to propagate the next valid value backward to fill the NaN values, pass "bfill" as the argument to the method keyword.

>>> x3 = x2.reindex({"time": time_index2}, method="bfill")
>>> x3
<xarray.Dataset> Size: 240B
Dimensions:  (time: 10)
Coordinates:
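The direction of propagation is easiest to see on a tiny dataset. The sketch below contrasts method="bfill" with method="ffill" on three invented daily values; the dataset `ds` and the date range `wider` are illustrative assumptions, not part of the original example.

```python
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {"temperature": ("time", [1.0, 2.0, 3.0])},
    coords={"time": pd.date_range("2019-01-01", periods=3, freq="D")},
)

# Two days before and one day after the original range.
wider = pd.date_range("2018-12-30", periods=6, freq="D")

# bfill: each new label takes the value at the next original label.
bfilled = ds.reindex({"time": wider}, method="bfill")["temperature"].values
# ffill: each new label takes the value at the previous original label.
ffilled = ds.reindex({"time": wider}, method="ffill")["temperature"].values

print(bfilled)  # leading dates filled with 1.0, trailing date stays NaN
print(ffilled)  # leading dates stay NaN, trailing date filled with 3.0
```

With "bfill", the date after the original range has no later label to draw from, so it remains NaN; "ffill" behaves the same way for dates before the original range.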

Please note that the NaN value present in the original dataset (at index value 2019-01-03) will not be filled by any of the value propagation schemes.

>>> x2.where(x2.temperature.isnull(), drop=True)
<xarray.Dataset> Size: 24B
Dimensions:  (time: 1)
Coordinates:

This is because filling while reindexing does not look at dataset values, but only compares the original and desired indexes. If you do want to fill in the NaN values present in the original dataset, use the fillna() method.
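As a minimal sketch of that distinction, the example below reindexes a small dataset that already contains a NaN, then applies fillna() as a separate step. The dataset `ds`, its values, and the replacement value 0.0 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {"temperature": ("time", [1.0, np.nan, 3.0])},
    coords={"time": pd.date_range("2019-01-01", periods=3, freq="D")},
)

# One extra day before the original range.
wider = pd.date_range("2018-12-31", periods=4, freq="D")

# bfill fills the new label (2018-12-31) from 2019-01-01, but the NaN
# stored at 2019-01-02 is original data and is left untouched.
reindexed = ds.reindex({"time": wider}, method="bfill")
before = reindexed["temperature"].values

# fillna() operates on the data values, so it replaces the remaining NaN.
after = reindexed.fillna(0.0)["temperature"].values

print(before)
print(after)
```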