xarray.Dataset.query

Dataset.query(queries=None, parser='pandas', engine=None, missing_dims='raise', **queries_kwargs)

Return a new dataset with each array indexed along the specified dimension(s), where the indexers are given as strings containing Python expressions to be evaluated against the data variables in the dataset.

Parameters:
  • queries (dict-like, optional) – A dict-like with keys matching dimensions and values given by strings containing Python expressions to be evaluated against the data variables in the dataset. Expressions are evaluated with the pandas eval() function and may contain any valid Python expression, but no Python statements.

  • parser ({"pandas", "python"}, default: "pandas") – The parser used to construct the syntax tree from the expression. The default, “pandas”, parses code slightly differently from standard Python. Alternatively, use the “python” parser to retain strict Python semantics.

  • engine ({"python", "numexpr", None}, default: None) – The engine used to evaluate the expression. Supported engines are:

    • None: tries to use numexpr, falls back to python

    • “numexpr”: evaluates expressions using numexpr

    • “python”: evaluates the expression as if it had been passed to eval() in top-level Python

  • missing_dims ({"raise", "warn", "ignore"}, default: "raise") – What to do if dimensions that should be selected from are not present in the Dataset:

    • “raise”: raise an exception

    • “warn”: emit a warning and ignore the missing dimensions

    • “ignore”: ignore the missing dimensions

  • **queries_kwargs ({dim: query, ...}, optional) – The keyword arguments form of queries. One of queries or queries_kwargs must be provided.

Returns:

obj (Dataset) – A new Dataset with the same contents as this dataset, except each array and dimension is indexed by the results of the appropriate queries.
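
The parameters above can be exercised with a small sketch. It rebuilds the dataset used in the Examples section and assumes numpy and xarray are importable under their usual aliases. The variable names (by_dict, by_kwargs, explicit, ignored, raised) are illustrative only, and engine="python" is passed so that the sketch does not depend on numexpr being installed.

```python
import numpy as np
import xarray as xr

# Same toy dataset as in the Examples section.
ds = xr.Dataset({"a": ("x", np.arange(0, 5, 1)), "b": ("x", np.linspace(0, 1, 5))})

# The dict form and the keyword form describe the same query.
by_dict = ds.query({"x": "a > 2"}, engine="python")
by_kwargs = ds.query(x="a > 2", engine="python")

# Parser and engine named explicitly instead of relying on the defaults.
explicit = ds.query(x="a > 2", parser="python", engine="python")

# "y" is not a dimension of ds. With missing_dims="ignore" that query is skipped.
ignored = ds.query(y="a > 2", engine="python", missing_dims="ignore")

# With the default missing_dims="raise" the same call raises an exception.
try:
    ds.query(y="a > 2", engine="python")
    raised = False
except ValueError:
    raised = True
```

In each case the original ds is left untouched, since query returns a new Dataset.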

Examples

>>> a = np.arange(0, 5, 1)
>>> b = np.linspace(0, 1, 5)
>>> ds = xr.Dataset({"a": ("x", a), "b": ("x", b)})
>>> ds
<xarray.Dataset> Size: 80B
Dimensions:  (x: 5)
Dimensions without coordinates: x
Data variables:
    a        (x) int64 40B 0 1 2 3 4
    b        (x) float64 40B 0.0 0.25 0.5 0.75 1.0
>>> ds.query(x="a > 2")
<xarray.Dataset> Size: 32B
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    a        (x) int64 16B 3 4
    b        (x) float64 16B 0.75 1.0
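
A query can also combine several data variables. The following is a minimal sketch that reuses the dataset above, with engine="python" passed only to avoid depending on numexpr. For comparison, it builds the same selection from an explicit boolean mask with isel. The names subset, mask and by_mask are illustrative, and the comparison shows that the two results match here rather than describing how query is implemented.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.arange(0, 5, 1)), "b": ("x", np.linspace(0, 1, 5))})

# Keep the positions along x where both conditions hold.
subset = ds.query(x="(a > 1) & (b < 1)", engine="python")

# The same selection written as an explicit boolean mask.
mask = (ds["a"] > 1) & (ds["b"] < 1)
by_mask = ds.isel(x=mask)

print(subset["a"].values)  # values of a at the selected positions
print(subset["b"].values)  # values of b at the selected positions
```

The string form is convenient when the condition comes from configuration or user input, while the mask form keeps the condition as ordinary Python code.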