Calling .isel() on a timezone-aware dimension/index causes it to lose timezone information #9307

Open
5 tasks done
JamieTaylor-TUOS opened this issue Aug 2, 2024 · 2 comments

What happened?

With a Dataset/DataArray containing a time dimension whose index uses the timezone-aware dtype datetime64[ns, UTC], calling .isel() to select, say, the first element along that dimension produces a Dataset/DataArray whose time coordinate has reverted to datetime64[ns] (i.e. timezone-naive).

What did you expect to happen?

The resulting Dataset/DataArray should retain timezone awareness on the coordinate of the sliced time dimension/index and still use the datetime64[ns, UTC] dtype.

Minimal Complete Verifiable Example

import numpy as np
import pandas as pd
import xarray as xr

mydata = xr.DataArray(
    data=np.array([
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11]
    ]),
    coords={
        "category": ["A", "B", "C"],
        "time": pd.to_datetime([
            "2024-08-02T11:00:00+00:00",
            "2024-08-02T12:00:00+00:00",
            "2024-08-02T13:00:00+00:00",
            "2024-08-02T14:00:00+00:00"
        ])
    },
    name="volume"
)
print(mydata)
print("---------------------------")
print(f"time index dtype before calling `.isel()`: {mydata.indexes['time'].dtype}")
print(f"time coord dtype before calling `.isel()`: {mydata.coords['time'].dtype}")
print("---------------------------")
# The following will slice the zeroth index in the time dimension - the time index will cease to exist but the corresponding coordinate will remain
subset = mydata.isel(time=0, drop=False)
print("---------------------------")
print(subset)
print("---------------------------")
print(f"time coord dtype after  calling `.isel()`: {subset.coords['time'].dtype}")

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

<xarray.DataArray 'volume' (category: 3, time: 4)> Size: 96B
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Coordinates:
  * category  (category) <U1 12B 'A' 'B' 'C'
  * time      (time) object 32B 1722596400000000000 ... 1722607200000000000
---------------------------
time index dtype before calling `.isel()`: datetime64[ns, UTC]
time coord dtype before calling `.isel()`: object
---------------------------
---------------------------
<xarray.DataArray 'volume' (category: 3)> Size: 24B
array([0, 4, 8])
Coordinates:
  * category  (category) <U1 12B 'A' 'B' 'C'
    time      datetime64[ns] 8B 2024-08-02T11:00:00
---------------------------
time coord dtype after  calling `.isel()`: datetime64[ns]
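
The naive value in the output above (2024-08-02T11:00:00) matches the UTC wall-clock time of the first element. That is consistent with the coordinate being stored as a plain NumPy datetime64, which has no timezone-aware variant, although this thread does not confirm the cause. Under that assumption, a minimal workaround sketch is to remember the timezone before indexing and re-attach it afterwards. It uses only pandas and NumPy; `restore_timezone` is a hypothetical helper name, not part of xarray or pandas.

```python
import numpy as np
import pandas as pd

# Timezone-aware index, built the same way as in the example above.
times = pd.to_datetime([
    "2024-08-02T11:00:00+00:00",
    "2024-08-02T12:00:00+00:00",
])
tz = times.tz  # remember the timezone before it can be lost

# DatetimeIndex.values returns naive datetime64 values expressed in UTC.
naive = times.values
first = naive[0]  # stand-in for the scalar coordinate left after .isel(time=0)


def restore_timezone(value, tz):
    """Hypothetical helper: treat a naive value as UTC, then convert to tz."""
    return pd.Timestamp(value).tz_localize("UTC").tz_convert(tz)


restored = restore_timezone(first, tz)
print(naive.dtype)   # naive datetime64 dtype, no timezone
print(restored)      # timezone-aware pandas Timestamp
```

This only re-attaches the timezone on the pandas side after the value has been extracted; it does not change what xarray stores in the coordinate.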

Anything else we need to know?

Tested with version 2024.3.0 and also 2024.7.0.

Similar to #6416

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.0 (main, Mar 1 2023, 18:26:19) [GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1064-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.7.0
pandas: 2.2.2
numpy: 2.0.1
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.5.1
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: 8.26.0
sphinx: None

JamieTaylor-TUOS added the "bug" and "needs triage" labels Aug 2, 2024

welcome bot commented Aug 2, 2024

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

max-sixty added the "topic-cftime" label and removed the "needs triage" label Aug 2, 2024
Collaborator

max-sixty commented Aug 2, 2024

Thanks for the excellent issue @JamieTaylor-TUOS

(I labeled this as "topic-cftime" as I don't think we have a "time but not necessarily cftime" label; tell me if this is not what we're intending, xarray team)
