Description
What happened?
When reading nanosecond-precision time data from a netCDF file, the precision is lost. This happens because CFMaskCoder converts the variable to floating point and inserts NaN for masked values. CFDatetimeCoder then casts the floating-point data back to int64 in order to convert it to datetime64. This cast is sometimes undefined, hence #7098.
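The precision loss is not specific to xarray: float64 has a 53-bit significand, so it cannot hold every int64 nanosecond count. The following NumPy-only sketch (the timestamp is an arbitrary illustrative value, not taken from the file in this report) round-trips a present-day timestamp through float64:

```python
import numpy as np

# Arbitrary example timestamp with nanosecond resolution (an assumption for
# illustration), expressed as int64 nanoseconds since 1970-01-01.
t = np.datetime64("2023-05-01T12:00:00.123456789", "ns")
raw = t.astype(np.int64)

# Round trip through float64, mimicking the promotion done for masking.
roundtrip = np.int64(np.float64(raw))

print(raw, roundtrip, raw - roundtrip)

# Distance between adjacent float64 values at this magnitude, in nanoseconds.
print(np.spacing(np.float64(raw)))  # 256.0
```

At this magnitude, adjacent float64 values are 256 ns apart, so the round trip can shift a timestamp by up to 128 ns even for ordinary dates.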
What did you expect to happen?
Precision should be preserved. The transformation to floating point should be omitted.
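One way the expected behaviour could look is to apply the fill value mask on the integer data and write NaT directly, so no float is ever involved. This is only a sketch, not xarray's implementation: `decode_ns_with_fill` is a hypothetical helper, and it assumes the units are exactly "nanoseconds since 1970-01-01" so that the stored integers map one-to-one onto datetime64[ns].

```python
import numpy as np

def decode_ns_with_fill(raw, fill_value):
    """Hypothetical helper: decode int64 ns since 1970-01-01, fill_value -> NaT."""
    raw = np.asarray(raw, dtype=np.int64)
    # astype returns a new array, so the input is left untouched.
    times = raw.astype("M8[ns]")
    # Mask on the integers; no intermediate float64 array is created.
    times[raw == fill_value] = np.datetime64("NaT")
    return times

fill = -2208988800000000000  # 1900-01-01 in ns, the _FillValue from this report
raw = np.array([fill, 0, 1682942400123456789], dtype=np.int64)
decoded = decode_ns_with_fill(raw, fill)
print(decoded)
```

Note that the int64 minimum already decodes to NaT in NumPy, so that value is masked regardless of the fill value.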
Minimal Complete Verifiable Example
import xarray as xr
import numpy as np
import netCDF4 as nc
import matplotlib.pyplot as plt
# create time array and fillvalue
min_ns = -9223372036854775808
max_ns = 9223372036854775807
cnt = 2000
time_arr = np.arange(min_ns, min_ns + cnt, dtype=np.int64).astype("M8[ns]")
fill_value = np.datetime64("1900-01-01", "ns")
# create ncfile with time with attached _FillValue
with nc.Dataset("test.nc", mode="w") as ds:
    ds.createDimension("x", cnt)
    time = ds.createVariable("time", "<i8", ("x",), fill_value=fill_value)
    time[:] = time_arr
    time.units = "nanoseconds since 1970-01-01"
# normal decoding
with xr.open_dataset("test.nc").load() as xr_ds:
    print("--- normal decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, color="g", label="normal")
# no decoding
with xr.open_dataset("test.nc", decode_cf=False).load() as xr_ds:
    print("--- no decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values + max_ns, lw=5, color="b", label="raw")
# do not decode times, this shows how the CFMaskCoder converts
# the array to floating point before it would run CFDatetimeCoder
with xr.open_dataset("test.nc", decode_times=False).load() as xr_ds:
    print("--- no time decoding ----------------------")
    print(xr_ds["time"])
# do not run CFMaskCoder to show that times will be converted nicely
# with CFDatetimeCoder
with xr.open_dataset("test.nc", mask_and_scale=False).load() as xr_ds:
    print("--- no masking ------------------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, lw=2, color="r", label="nomask")
plt.legend()
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
--- normal decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([ 'NaT', 'NaT',
'NaT', ...,
'1677-09-21T00:12:43.145226240', '1677-09-21T00:12:43.145226240',
'1677-09-21T00:12:43.145226240'], dtype='datetime64[ns]')
Dimensions without coordinates: x
--- no decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([-9223372036854775808, -9223372036854775807, -9223372036854775806,
..., -9223372036854773811, -9223372036854773810,
-9223372036854773809])
Dimensions without coordinates: x
Attributes:
_FillValue: -2208988800000000000
units: nanoseconds since 1970-01-01
--- no time decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([-9.22337204e+18, -9.22337204e+18, -9.22337204e+18, ...,
-9.22337204e+18, -9.22337204e+18, -9.22337204e+18])
Dimensions without coordinates: x
Attributes:
units: nanoseconds since 1970-01-01
--- no masking ------------------------------
<xarray.DataArray 'time' (x: 2000)>
array([ 'NaT', '1677-09-21T00:12:43.145224193',
'1677-09-21T00:12:43.145224194', ...,
'1677-09-21T00:12:43.145226189', '1677-09-21T00:12:43.145226190',
'1677-09-21T00:12:43.145226191'], dtype='datetime64[ns]')
Dimensions without coordinates: x
Attributes:
_FillValue: -2208988800000000000
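The pattern in the "normal decoding" output (leading NaT values, then long runs of identical timestamps) can be reproduced without netCDF or xarray. The sketch below assumes only that the decoding amounts to an int64 -> float64 -> int64 -> datetime64[ns] chain, as described above:

```python
import numpy as np

min_ns = np.iinfo(np.int64).min  # also the bit pattern NumPy uses for NaT
raw = np.arange(min_ns, min_ns + 2000, dtype=np.int64)

# Promote to float64 (as masking does), then cast back for time decoding.
as_float = raw.astype(np.float64)
decoded = as_float.astype(np.int64).astype("M8[ns]")

print(np.unique(as_float).size)      # distinct values left after promotion: 3
print(np.diff(np.unique(as_float)))  # gaps between them in ns: [1024. 1024.]
print(np.isnat(decoded).sum())       # entries that collapsed onto NaT
print(decoded[-1])
```

Near the int64 minimum, float64 values are 1024 ns apart, so the 2000 consecutive inputs collapse onto three values. The lowest of these is the int64 minimum itself, which decodes to NaT, and the highest matches the repeated '1677-09-21T00:12:43.145226240' in the log.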
Anything else we need to know?
Plot from above code:
Xref: #7098, #7790 (comment)
Environment
xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.2
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.3.1
distributed: 2023.3.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.3.0
cupy: 11.6.0
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.6.0
pip: 23.0.1
conda: None
pytest: 7.2.2
mypy: None
IPython: 8.11.0
sphinx: None