Indexing and selecting data
xarray offers extremely flexible indexing routines that combine the best features of NumPy and pandas for data selection.
The most basic way to access elements of a DataArray object is to use Python's [] syntax, such as array[i, j], where i and j are both integers.
As xarray objects can store coordinates corresponding to each dimension of an array, label-based indexing similar to pandas.DataFrame.loc is also possible.
In label-based indexing, the element position i is automatically looked up from the coordinate values.
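As a rough illustration of that lookup, the sketch below resolves a label to its integer position by hand and checks that both routes select the same data. The array, its values, and the variable names are made up for this example, and the manual `get_loc` call is only a conceptual stand-in for what xarray does internally, not a description of its implementation.

```python
import numpy as np
import xarray as xr

# Hypothetical 4x3 array; only the 'space' dimension carries labels here.
arr = xr.DataArray(
    np.arange(12).reshape(4, 3),
    dims=["time", "space"],
    coords={"space": ["IA", "IL", "IN"]},
)

# The 'space' coordinate is exposed as a pandas Index, which maps a label
# to its integer position along the dimension.
position = arr.indexes["space"].get_loc("IL")

# Selecting by label and selecting by the looked-up position agree.
by_label = arr.sel(space="IL")
by_position = arr.isel(space=position)
print(position, by_label.equals(by_position))
```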
Dimensions of xarray objects have names, so you can also look up dimensions by name instead of remembering their positional order.
In total, xarray supports four different kinds of indexing, as described below and summarized in this table:
Dimension lookup | Index lookup | DataArray syntax | Dataset syntax |
---|---|---|---|
Positional | By integer | arr[:, 0] | not available |
Positional | By label | arr.loc[:, 'IA'] | not available |
By name | By integer | arr.isel(space=0) or arr[dict(space=0)] | ds.isel(space=0) or ds[dict(space=0)] |
By name | By label | arr.sel(space='IA') or arr.loc[dict(space='IA')] | ds.sel(space='IA') or ds.loc[dict(space='IA')] |
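The four combinations in the table can be tried side by side. The following is a minimal sketch that uses a small deterministic array in place of random data; the variable names and the Dataset variable name "foo" are assumptions made for this example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Deterministic 4x3 array with labeled 'time' and 'space' dimensions.
arr = xr.DataArray(
    np.arange(12).reshape(4, 3),
    coords=[
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)

# Positional dimension lookup, index lookup by integer.
by_position_int = arr[:, 0]
# Positional dimension lookup, index lookup by label.
by_position_label = arr.loc[:, "IA"]
# Dimension lookup by name, index lookup by integer.
by_name_int = arr.isel(space=0)
# Dimension lookup by name, index lookup by label.
by_name_label = arr.sel(space="IA")

# A Dataset supports only the two by-name forms.
ds = arr.to_dataset(name="foo")
ds_by_int = ds.isel(space=0)
ds_by_label = ds.sel(space="IA")

print(by_position_int.values)
```

All four DataArray expressions pick out the same column, so which one to use is mostly a matter of whether the code knows the dimension order and whether it has labels or positions at hand.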
More advanced indexing is also possible with all of these methods by supplying DataArray objects as indexers.
See Vectorized Indexing for details.
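As a brief, hedged preview of what DataArray indexers allow, the sketch below picks individual points by pairing indexers along a shared dimension. The dimension name "points" and the sample values are assumptions for illustration only.

```python
import numpy as np
import xarray as xr

# Hypothetical 4x3 array with labels on the 'space' dimension.
arr = xr.DataArray(
    np.arange(12).reshape(4, 3),
    dims=["time", "space"],
    coords={"space": ["IA", "IL", "IN"]},
)

# Both indexers share the dimension 'points', so they are paired
# element by element rather than combined as an outer product.
time_idx = xr.DataArray([0, 1, 3], dims="points")
space_idx = xr.DataArray([2, 0, 1], dims="points")
picked = arr.isel(time=time_idx, space=space_idx)

# Label-based selection accepts DataArray indexers as well; the indexed
# dimension 'space' is replaced by the indexer's dimension 'points'.
columns = arr.sel(space=xr.DataArray(["IN", "IA"], dims="points"))

print(picked.dims, picked.values)
print(columns.dims, columns.shape)
```

Here `picked` holds the elements at positions (0, 2), (1, 0) and (3, 1) along the new "points" dimension, while `columns` keeps the full "time" dimension.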
Positional indexing
Indexing a DataArray directly works (mostly) just like it does for NumPy arrays, except that the returned object is always another DataArray:
In [1]: arr = xr.DataArray(np.random.rand(4, 3),
   ...:                    [('time', pd.date_range('2000-01-01', periods=4)),
   ...:                     ('space', ['IA', 'IL', 'IN'])])
   ...:
In [2]: arr[:2]
Out[2]:
<xarray.DataArray (time: 2, space: 3)>
array([[ 0.12697 ,  0.966718,  0.260476],
       [ 0.897237,  0.37675 ,  0.336222]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-02
  * space    (space) <U2 'IA' 'IL' 'IN'
In [3]: arr[0, 0]