🍾 Xarray is now 10 years old! πŸŽ‰

xarray.Dataset.dropna

Dataset.dropna(dim, *, how='any', thresh=None, subset=None)

Returns a new dataset in which labels along the provided dimension are dropped wherever values are missing.

Parameters:
  • dim (hashable) – Dimension along which to drop missing values. Dropping along multiple dimensions simultaneously is not yet supported.

  • how ({"any", "all"}, default: "any") –

    • any : if any NA values are present, drop that label

    • all : if all values are NA, drop that label

  • thresh (int or None, optional) – If supplied, keep a label only if it has at least this many non-NA values (summed over all the subset variables).

  • subset (iterable of hashable or None, optional) – Which variables to check for missing values. By default, all variables in the dataset are checked.
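The effect of subset, and of thresh being counted across several variables, can be sketched as follows. This is a minimal illustration only: the two-variable dataset, the `pressure` variable, and all values are made up for the example and do not come from the xarray documentation.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset with two variables along a single "time" dimension.
ds = xr.Dataset(
    {
        "temperature": ("time", [20.1, np.nan, 22.3, np.nan]),
        "pressure": ("time", [1012.0, 1009.5, np.nan, np.nan]),
    },
    coords={"time": [1, 2, 3, 4]},
)

# Default: every data variable is checked, so a time label is dropped
# if either variable is NaN there. Only time=1 survives.
all_checked = ds.dropna(dim="time")

# subset: only "temperature" is checked, so NaNs in "pressure" are ignored.
# time=1 and time=3 survive.
temp_only = ds.dropna(dim="time", subset=["temperature"])

# thresh: non-NA values are counted across both variables, so a label is
# kept if at least one of the two has a value. time=4 (all NaN) is dropped.
at_least_one = ds.dropna(dim="time", thresh=1)

print(all_checked["time"].values)
print(temp_only["time"].values)
print(at_least_one["time"].values)
```

Note that variables left out of subset still lose the dropped labels; subset only controls which variables are inspected for missing values.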

Examples

>>> dataset = xr.Dataset(
...     {
...         "temperature": (
...             ["time", "location"],
...             [[23.4, 24.1], [np.nan, 22.1], [21.8, 24.2], [20.5, 25.3]],
...         )
...     },
...     coords={"time": [1, 2, 3, 4], "location": ["A", "B"]},
... )
>>> dataset
<xarray.Dataset> Size: 104B
Dimensions:      (time: 4, location: 2)
Coordinates:
  * time         (time) int64 32B 1 2 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 64B 23.4 24.1 nan ... 24.2 20.5 25.3

# Drop time labels that contain any NaN values (the default, how="any")

>>> dataset.dropna(dim="time")
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3

# Drop labels with any NaN values

>>> dataset.dropna(dim="time", how="any")
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3

# Drop labels only where all values are NaN (no time label qualifies here, so nothing is dropped)

>>> dataset.dropna(dim="time", how="all")
<xarray.Dataset> Size: 104B
Dimensions:      (time: 4, location: 2)
Coordinates:
  * time         (time) int64 32B 1 2 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 64B 23.4 24.1 nan ... 24.2 20.5 25.3

# Drop labels with fewer than 2 non-NA values

>>> dataset.dropna(dim="time", thresh=2)
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3

Returns:

Dataset
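Because each call returns a new Dataset and accepts only one dimension, one possible way to drop along several dimensions is to chain calls. The sketch below reuses the dataset from the examples. The chaining pattern is an illustration, not an official recommendation, and the names `time_first` and `location_first` are hypothetical. It also shows that the order of the calls can change the result.

```python
import numpy as np
import xarray as xr

dataset = xr.Dataset(
    {
        "temperature": (
            ["time", "location"],
            [[23.4, 24.1], [np.nan, 22.1], [21.8, 24.2], [20.5, 25.3]],
        )
    },
    coords={"time": [1, 2, 3, 4], "location": ["A", "B"]},
)

# Drop along "time" first: time=2 is removed, after which no location
# contains a NaN, so both locations are kept.
time_first = dataset.dropna(dim="time").dropna(dim="location")

# Drop along "location" first: location "A" is removed because it has a NaN,
# after which every time label is complete, so all four are kept.
location_first = dataset.dropna(dim="location").dropna(dim="time")

print(time_first.sizes["time"], time_first.sizes["location"])          # 3 2
print(location_first.sizes["time"], location_first.sizes["location"])  # 4 1

# The original dataset is left unchanged.
print(dataset.sizes["time"])  # 4
```

Which order is appropriate depends on whether losing a time label or a location is the smaller cost for the analysis at hand.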