Description
What happened?
When performing a DataArray resampling on a time dimension, the metadata attributes of non-affected coordinate variables are dropped. This behaviour breaks compatibility with cf_xarray, as the coordinate metadata is needed to identify the X, Y, and Z coordinates.
What did you expect to happen?
Metadata fields of unaffected coordinates (lat, lon, height) to be preserved.
Minimal Complete Verifiable Example
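import xarray as xr
import cf_xarray  # registers the .cf accessor on xarray objects
ds = xr.open_dataset("my_dataset_that_has_lat_and_lon_coordinates.nc")
# Resample to month-start frequency and average within each month
tas = ds.tas.resample(time="MS").mean(dim="time")
# Raises KeyError: the lat coordinate no longer carries its standard_name
tas.cf["latitude"]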
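Since the dataset file used above is not attached, a self-contained variant may make the report easier to check. The following is only a sketch built on synthetic data: the helper name `make_tas`, the coordinate values, and the reduced attribute sets are assumptions for illustration, not taken from the original file. Whether the attributes survive depends on the installed xarray (and flox) versions, so the script prints the attributes rather than asserting an outcome.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_tas():
    """Build a small synthetic 'tas' DataArray with CF-style coordinate attrs."""
    # 59 daily steps cover January and February 2007 (not a leap year).
    time = pd.date_range("2007-01-01", periods=59, freq="D")
    return xr.DataArray(
        np.zeros((len(time), 2, 3)),
        dims=("time", "lat", "lon"),
        coords={
            "time": time,
            # (dims, values, attrs) tuples attach attributes to each coordinate.
            "lat": ("lat", [10.0, 20.0], {"standard_name": "latitude", "axis": "Y"}),
            "lon": ("lon", [0.0, 1.0, 2.0], {"standard_name": "longitude", "axis": "X"}),
        },
        name="tas",
    )


tas = make_tas()
monthly = tas.resample(time="MS").mean()

# Compare the coordinate metadata before and after resampling.
print("lat attrs before:", tas["lat"].attrs)
print("lat attrs after: ", monthly["lat"].attrs)
```

If the second line prints an empty dictionary, the environment shows the behaviour described in this report.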
MCVE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
KeyError Traceback (most recent call last)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:760, in DataArray._getitem_coord(self, key)
759 try:
--> 760 var = self._coords[key]
761 except KeyError:
KeyError: 'latitude'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:706, in _getitem(accessor, key, skip)
705 for name in allnames:
--> 706 extravars = accessor.get_associated_variable_names(
707 name, skip_bounds=scalar_key, error=False
708 )
709 coords.extend(itertools.chain(*extravars.values()))
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:1597, in CFAccessor.get_associated_variable_names(self, name, skip_bounds, error)
1596 coords: dict[str, list[str]] = {k: [] for k in keys}
-> 1597 attrs_or_encoding = ChainMap(self._obj[name].attrs, self._obj[name].encoding)
1599 if "coordinates" in attrs_or_encoding:
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:769, in DataArray.__getitem__(self, key)
768 if isinstance(key, str):
--> 769 return self._getitem_coord(key)
770 else:
771 # xarray-style array indexing
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:763, in DataArray._getitem_coord(self, key)
762 dim_sizes = dict(zip(self.dims, self.shape))
--> 763 _, key, var = _get_virtual_variable(self._coords, key, dim_sizes)
765 return self._replace_maybe_drop_dims(var, name=key)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataset.py:175, in _get_virtual_variable(variables, key, dim_sizes)
174 if len(split_key) != 2:
--> 175 raise KeyError(key)
177 ref_name, var_name = split_key
KeyError: 'latitude'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 tas.cf["latitude"]
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:2526, in CFDataArrayAccessor.__getitem__(self, key)
2521 if not isinstance(key, str):
2522 raise KeyError(
2523 f"Cannot use a list of keys with DataArrays. Expected a single string. Received {key!r} instead."
2524 )
-> 2526 return _getitem(self, key)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:749, in _getitem(accessor, key, skip)
746 return ds.set_coords(coords)
748 except KeyError:
--> 749 raise KeyError(
750 f"{kind}.cf does not understand the key {k!r}. "
751 f"Use 'repr({kind}.cf)' (or '{kind}.cf' in a Jupyter environment) to see a list of key names that can be interpreted."
752 )
KeyError: "DataArray.cf does not understand the key 'latitude'. Use 'repr(DataArray.cf)' (or 'DataArray.cf' in a Jupyter environment) to see a list of key names that can be interpreted."
Anything else we need to know?
Before
netcdf tas_Amon_CanESM2_rcp85_r1i1p1_200701-200712 {
dimensions:
time = UNLIMITED ; // (12 currently)
bnds = 2 ;
lat = 64 ;
lon = 128 ;
variables:
double time(time) ;
time:_FillValue = NaN ;
time:bounds = "time_bnds" ;
time:axis = "T" ;
time:long_name = "time" ;
time:standard_name = "time" ;
time:units = "days since 1850-01-01" ;
time:calendar = "365_day" ;
double time_bnds(time, bnds) ;
time_bnds:_FillValue = NaN ;
time_bnds:coordinates = "height" ;
double lat(lat) ;
lat:_FillValue = NaN ;
lat:bounds = "lat_bnds" ;
lat:units = "degrees_north" ;
lat:axis = "Y" ;
lat:long_name = "latitude" ;
lat:standard_name = "latitude" ;
double lat_bnds(lat, bnds) ;
lat_bnds:_FillValue = NaN ;
lat_bnds:coordinates = "height" ;
double lon(lon) ;
lon:_FillValue = NaN ;
lon:bounds = "lon_bnds" ;
lon:units = "degrees_east" ;
lon:axis = "X" ;
lon:long_name = "longitude" ;
lon:standard_name = "longitude" ;
double lon_bnds(lon, bnds) ;
lon_bnds:_FillValue = NaN ;
lon_bnds:coordinates = "height" ;
double height ;
height:_FillValue = NaN ;
height:units = "m" ;
height:axis = "Z" ;
height:positive = "up" ;
height:long_name = "height" ;
height:standard_name = "height" ;
float tas(time, lat, lon) ;
tas:_FillValue = 1.e+20f ;
tas:standard_name = "air_temperature" ;
tas:long_name = "Near-Surface Air Temperature" ;
tas:units = "K" ;
tas:original_name = "ST" ;
tas:cell_methods = "time: mean (interval: 15 minutes)" ;
tas:cell_measures = "area: areacella" ;
tas:history = "2011-03-10T05:13:26Z altered by CMOR: Treated scalar dimension: \'height\'. 2011-03-10T05:13:26Z altered by CMOR: replaced missing value flag (1e+38) with standard missing value (1e+20)." ;
tas:associated_files = "baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation gridspecFile: gridspec_atmos_fx_CanESM2_rcp85_r0i0p0.nc areacella: areacella_fx_CanESM2_rcp85_r0i0p0.nc" ;
tas:coordinates = "height" ;
tas:missing_value = 1.e+20f ;
After
netcdf test_cf_lat_new {
dimensions:
lat = 64 ;
lon = 128 ;
time = 11 ;
variables:
double lat(lat) ;
lat:_FillValue = NaN ;
double lon(lon) ;
lon:_FillValue = NaN ;
double height ;
height:_FillValue = NaN ;
height:units = "m" ;
height:axis = "Z" ;
height:positive = "up" ;
height:long_name = "height" ;
height:standard_name = "height" ;
int64 time(time) ;
time:units = "days since 2007-01-01 00:00:00.000000" ;
time:calendar = "noleap" ;
float tas(time, lat, lon) ;
tas:_FillValue = NaNf ;
tas:coordinates = "height" ;
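Comparing the two dumps, lat and lon lose their attributes while height keeps them. Until that is resolved, one possible stopgap is to copy the missing attributes back from the original object. The sketch below is an assumption-laden illustration: `restore_coord_attrs` and `build` are hypothetical helpers, not part of xarray or cf_xarray, and the `stripped` array only simulates the post-resample state so the snippet does not depend on a particular xarray version. The time coordinate is skipped by default because attributes such as `bounds` may no longer be valid after resampling.

```python
import numpy as np
import xarray as xr


def restore_coord_attrs(result, source, skip=("time",)):
    """Copy coordinate attrs from `source` onto `result` where they are missing.

    Hypothetical helper: modifies `result` in place and returns it.
    """
    for name, coord in source.coords.items():
        if name in skip or name not in result.coords:
            continue
        if not result[name].attrs:
            result[name].attrs.update(coord.attrs)
    return result


def build(with_attrs):
    """Build a tiny array; coordinate attrs are included only if requested."""
    def attrs(values):
        return dict(values) if with_attrs else {}

    return xr.DataArray(
        np.zeros((2, 2, 3)),
        dims=("time", "lat", "lon"),
        coords={
            "time": ("time", [0, 1], attrs({"standard_name": "time", "bounds": "time_bnds"})),
            "lat": ("lat", [10.0, 20.0], attrs({"standard_name": "latitude", "axis": "Y"})),
            "lon": ("lon", [0.0, 1.0, 2.0], attrs({"standard_name": "longitude", "axis": "X"})),
        },
        name="tas",
    )


source = build(with_attrs=True)     # stands in for the original data
stripped = build(with_attrs=False)  # stands in for the resampled result
fixed = restore_coord_attrs(stripped, source)

print(fixed["lat"].attrs)
print(fixed["lon"].attrs)
```

With the attributes back in place, cf_xarray should again be able to match `latitude` and `longitude` by `standard_name`, though this only masks the underlying loss.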
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:06:46) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.19.6-200.fc36.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.6.0
pandas: 1.3.5
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.6.1
distributed: 2022.6.1
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.10.dev8+gfbc2af8
numpy_groupies: 0.9.19
setuptools: 59.8.0
pip: 22.2.1
conda: None
pytest: 7.1.2
IPython: 8.4.0
sphinx: 5.1.1