Description
What happened?
I have NaN values in a date vector stored in a netCDF file. When I read the file on my ARM-based Apple computer with xr.open_dataset(), the missing value is not properly recognized.
For example, the following data is stored in a NetCDF:
date = pd.date_range(...)
date[4] = nan
Then, when I read the file, date[4] is set to date[0], which is the first date of the range, instead of 'NaT'.
I understand that this issue is quite weird, and it doesn't seem to happen on other systems. I tried on macOS (with an Intel processor) and on two different Linux computers, and in those configurations date[4] is properly set to 'NaT' after opening the netCDF file with xr.open_dataset(). Note that I tried with the same version of xarray as well as with different versions, and I cannot reproduce this issue on any machine except the one with the M1 ARM chip.
What did you expect to happen?
I expect the following result after running the minimal example:
array(['2022-01-01T00:00:00.000000000', '2022-01-02T00:00:00.000000000',
       '2022-01-03T00:00:00.000000000', '2022-01-04T00:00:00.000000000',
                                 'NaT', '2022-01-06T00:00:00.000000000',
       '2022-01-07T00:00:00.000000000', '2022-01-08T00:00:00.000000000',
       '2022-01-09T00:00:00.000000000', '2022-01-10T00:00:00.000000000'],
      dtype='datetime64[ns]')
Minimal Complete Verifiable Example
import xarray as xr
import pandas as pd
import numpy as np

# Ten daily timestamps, with the fifth one replaced by a missing value
time = pd.date_range(start="2022-01-01", end="2022-01-10").to_pydatetime()
time[4] = np.datetime64("NaT")

ds = xr.Dataset(
    data_vars=dict(
        time=(["nt"], time),
    ),
)

# Write to disk, then read back and inspect the decoded times
ds.to_netcdf('test.nc')
ds_r = xr.open_dataset('test.nc')
ds_r.time
Relevant log output
array(['2022-01-01T00:00:00.000000000', '2022-01-02T00:00:00.000000000',
       '2022-01-03T00:00:00.000000000', '2022-01-04T00:00:00.000000000',
       '2022-01-01T00:00:00.000000000', '2022-01-06T00:00:00.000000000',
       '2022-01-07T00:00:00.000000000', '2022-01-08T00:00:00.000000000',
       '2022-01-09T00:00:00.000000000', '2022-01-10T00:00:00.000000000'],
      dtype='datetime64[ns]')
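To make the difference between the expected result and this output explicit, the two arrays can be compared position by position. The following is a minimal sketch using only NumPy. The helper name lost_nat_positions is hypothetical (it is not part of xarray), and the arrays are rebuilt by hand from the two outputs shown in this report rather than read from test.nc.

```python
import numpy as np


def lost_nat_positions(expected, actual):
    """Return indices where `expected` is NaT but `actual` holds a real date."""
    expected = np.asarray(expected, dtype="datetime64[ns]")
    actual = np.asarray(actual, dtype="datetime64[ns]")
    return np.flatnonzero(np.isnat(expected) & ~np.isnat(actual))


# Expected result: ten daily timestamps with NaT at index 4.
expected = np.arange(
    "2022-01-01", "2022-01-11", dtype="datetime64[D]"
).astype("datetime64[ns]")
expected[4] = np.datetime64("NaT")

# Output reported on the arm64 machine: index 4 holds the first date instead.
actual = expected.copy()
actual[4] = expected[0]

bad = lost_nat_positions(expected, actual)
print(bad)  # [4]
print(actual[bad])
```

On an affected machine, the same helper could be given ds.time.values and ds_r.time.values from the example above, which would flag every position where a NaT was replaced by a date during the round trip.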
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.1 | packaged by conda-forge | (main, Dec 22 2021, 01:38:36) [Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 21.2.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.20.2
pandas: 1.3.5
numpy: 1.21.5
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.12.0
distributed: 2021.12.0
matplotlib: 3.5.1
cartopy: 0.20.1
seaborn: None
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
setuptools: 60.0.4
pip: 21.3.1
conda: None
pytest: None
IPython: 8.0.0
sphinx: None