Selecting dates with .sel() doesn't work when time index is in cftime
What happened?
When I try to select a subset of a dataset/array by passing a list of date strings to .sel(), it fails if the time index uses cftime, and I get the following error message:
KeyError: "not all values found in index 'time'"
What did you expect to happen?
I expect selecting a set of dates with a list to work the same way as when the time index is in datetime64.
Minimal Complete Verifiable Example
import xarray as xr
import numpy as np
ds = xr.open_dataset("https://thredds.met.no/thredds/dodsC/osisaf/met.no/ice/index/v2p1/nh/osisaf_nh_sie_daily.nc")
# Time coordinates are in datetime64, and selecting dates with a list works.
print(ds.time)
print(ds.sel(time=["2023-01-01", "2023-01-02"]))
# Converting the calendar to all_leap changes the time coordinates to use cftime instead of datetime64.
ds = ds.convert_calendar("all_leap", missing=np.nan).interpolate_na()
# Time coordinates are in cftime, and selecting dates with a list fails.
print(ds.time)
print(ds.sel(time=["2023-01-01", "2023-01-02"]))
MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
(geoscience) [michael@localhost ~]$ python minimal.py
<xarray.DataArray 'time' (time: 16107)>
array(['1979-01-01T00:00:00.000000000', '1979-01-02T00:00:00.000000000',
'1979-01-03T00:00:00.000000000', ..., '2023-02-03T00:00:00.000000000',
'2023-02-04T00:00:00.000000000', '2023-02-05T00:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 1979-01-01 1979-01-02 ... 2023-02-05
sic_threshold float32 ...
lat float32 ...
lon float32 ...
Attributes:
standard_name: time
long_name: time of the observation (centered)
coverage_content_type: auxiliaryInformation
axis: T
<xarray.Dataset>
Dimensions: (time: 2, nv: 2)
Coordinates:
* time (time) datetime64[ns] 2023-01-01 2023-01-02
sic_threshold float32 ...
lat float32 ...
lon float32 ...
Dimensions without coordinates: nv
Data variables:
lat_bounds (nv) float32 ...
lon_bounds (nv) float32 ...
area |S64 ...
sie (time) float64 ...
source (time) float64 ...
Attributes: (12/35)
title: Daily Northern Hemisphere Sea Ice Extent from EU...
product_id: OSI-420
product_name: OSI SAF Sea Ice Index
product_status: demonstration
version: v2p1
summary: Time series of Daily Sea Ice Extent (SIE) for No...
... ...
distribution_statement: Free
copyright_statement: Copyright 2023 EUMETSAT
references: Product User Manual for OSI-420, Lavergne et al....
featureType: timeSeries
DODS.strlen: 2
DODS.dimName: nchar
<xarray.DataArray 'time' (time: 16140)>
array([cftime.DatetimeAllLeap(1979, 1, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeAllLeap(1979, 1, 2, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeAllLeap(1979, 1, 3, 0, 0, 0, 0, has_year_zero=True), ...,
cftime.DatetimeAllLeap(2023, 2, 3, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeAllLeap(2023, 2, 4, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeAllLeap(2023, 2, 5, 0, 0, 0, 0, has_year_zero=True)],
dtype=object)
Coordinates:
* time (time) object 1979-01-01 00:00:00 ... 2023-02-05 00:00:00
lat float32 90.0
lon float32 0.0
sic_threshold float32 0.15
Attributes:
standard_name: time
long_name: time of the observation (centered)
coverage_content_type: auxiliaryInformation
axis: T
Traceback (most recent call last):
File "/var/home/michael/minimal.py", line 15, in <module>
print(ds.sel(time=["2023-01-01", "2023-01-02"]))
File "/var/home/michael/mambaforge/envs/geoscience/lib/python3.10/site-packages/xarray/core/dataset.py", line 2554, in sel
query_results = map_index_queries(
File "/var/home/michael/mambaforge/envs/geoscience/lib/python3.10/site-packages/xarray/core/indexing.py", line 183, in map_index_queries
results.append(index.sel(labels, **options)) # type: ignore[call-arg]
File "/var/home/michael/mambaforge/envs/geoscience/lib/python3.10/site-packages/xarray/core/indexes.py", line 480, in sel
raise KeyError(f"not all values found in index {coord_name!r}")
KeyError: "not all values found in index 'time'"
Anything else we need to know?
No response
Environment
/var/home/michael/mambaforge/envs/geoscience/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit: None
python: 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:20:04) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 6.1.9-200.fc37.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.11.0
pandas: 1.5.1
numpy: 1.23.4
scipy: 1.9.3
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.6
dask: None
distributed: None
matplotlib: 3.6.2
cartopy: 0.21.0
seaborn: 0.12.1
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.1
pip: 22.3.1
conda: None
pytest: None
IPython: 8.6.0
sphinx: None
Indeed currently we do not support indexing a CFTimeIndex-backed array using a list of strings, but that's something I think we would be happy to change (e.g. we do accept a list of strings to interp for CFTimeIndex-backed arrays).
For the time being you should be able to use cftime.DatetimeAllLeap values themselves:
ds.sel(time=[cftime.DatetimeAllLeap(2023, 1, 1), cftime.DatetimeAllLeap(2023, 1, 2)])
Ok, great! Thanks for the tip.
On Mon, Feb 6, 2023 at 13:51, Spencer Clark @.***> wrote:
Indeed currently we do not support indexing a CFTimeIndex-backed array using a list of strings, but that's something I think we would be happy to change (e.g. we do accept a list of strings to interp for CFTimeIndex-backed arrays).
For the time being you should be able to use cftime.DatetimeAllLeap values themselves:
ds.sel(time=[cftime.DatetimeAllLeap(2023, 1, 1), cftime.DatetimeAllLeap(2023, 1, 2)])