
SerializationWarning for coordinate variables goes against CF conventions #10305

Open
5 tasks done
djhoese opened this issue May 9, 2025 · 2 comments

Comments

@djhoese
Contributor

djhoese commented May 9, 2025

What happened?

When saving a variable as a NetCDF coordinate variable (in the CF sense, a variable with the same name as its dimension), you get a serialization warning:

SerializationWarning: saving variable x with floating point data as an integer dtype without any _FillValue to use for NaNs

CF coordinate variables are not supposed to have a _FillValue defined:

https://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#missing-data

Missing data is not allowed in coordinate variables.

What did you expect to happen?

For variables that are CF coordinate variables I expect no warning. Failing that, I would expect no warning when I explicitly set _FillValue to None.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np

data = xr.DataArray(np.zeros((2,), dtype=np.float32), dims=("x",))
data.encoding["scale_factor"] = 1.0
data.encoding["add_offset"] = 0.0
data.encoding["dtype"] = "uint16"
data.encoding["_FillValue"] = 0.0

x = xr.DataArray(np.arange(2, dtype=np.float32), dims=("x",))
x.encoding["scale_factor"] = 1.0
x.encoding["add_offset"] = 0.0
x.encoding["dtype"] = "uint16"
x.encoding["_FillValue"] = None

ds = xr.Dataset({
    "data": data,
    "x": x,
})
ds.to_netcdf("test.nc")

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

SerializationWarning: saving variable x with floating point data as an integer dtype without any _FillValue to use for NaNs

Anything else we need to know?

@kmuehlbauer added this check in #7719

This Stack Overflow answer says I can set the fill value to None so that the variable has no fill value, but doing so doesn't seem to change anything regarding the warning:

https://stackoverflow.com/a/45696423/433202

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:24:40) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 6.12.10-76061203-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2025.4.0
pandas: 2.2.3
numpy: 2.2.5
scipy: 1.15.2
netCDF4: 1.7.2
pydap: None
h5netcdf: 1.6.1
h5py: 3.13.0
zarr: 2.18.7
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2025.4.1
distributed: 2025.4.1
matplotlib: 3.10.1
cartopy: 0.24.0
seaborn: None
numbagg: None
fsspec: 2025.3.2
cupy: None
pint: 0.24.4
sparse: None
flox: None
numpy_groupies: None
setuptools: 80.1.0
pip: 25.1
conda: 25.3.1
pytest: 8.3.5
mypy: 1.15.0
IPython: 9.2.0
sphinx: 8.2.3

djhoese added the bug and needs triage labels May 9, 2025
@kmuehlbauer
Contributor

Hi @djhoese, thanks for the well written report.

First, although I committed the changes in #7719, that was merely a refactor of existing code. This behaviour has been around for almost 10 years now, dating back to @shoyer's PR #494.

I agree that a warning is not needed for CF coordinate variables (and is probably misleading in that case). As a simple check for a coordinate variable in NonStringCoder, we could test whether the variable's name is the same as its only dimension.
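A minimal sketch of that check, for illustration only. The helper name is_cf_coordinate_variable and its signature are hypothetical and are not part of xarray; the sketch only assumes that the variable's name and its tuple of dimension names are available at the point where the warning is issued.

```python
def is_cf_coordinate_variable(name, dims):
    """Return True if (name, dims) describes a CF coordinate variable.

    Hypothetical helper, not xarray API: a CF coordinate variable is
    one-dimensional and its single dimension has the same name as the
    variable itself.
    """
    return tuple(dims) == (name,)


# "x" indexed by dimension "x" is a coordinate variable.
x_is_coord = is_cf_coordinate_variable("x", ("x",))
# "data" indexed by dimension "x" is an ordinary data variable.
data_is_coord = is_cf_coordinate_variable("data", ("x",))
# A 2-D variable is never a coordinate variable in this sense.
x2d_is_coord = is_cf_coordinate_variable("x", ("x", "y"))
```

With a check like this, the SerializationWarning could be skipped whenever the function returns True, since CF does not allow missing data in such variables anyway.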

The case of x.encoding["_FillValue"] = None isn't as trivial as it might seem. There, any NaN would be converted to whatever astype returns (which might differ between architectures). I'm not sure what the best way forward is. We could catch any invalid-cast RuntimeWarning and re-raise it with added information. Thoughts?
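A rough sketch of the catch-and-re-raise idea, using only NumPy and the standard library. The function name cast_to_integer_checked and the choice of ValueError are assumptions made for illustration, not xarray's implementation. Whether NumPy emits a RuntimeWarning for an invalid cast depends on the NumPy version and platform, so this should be read as a best-effort check rather than a guarantee.

```python
import warnings

import numpy as np


def cast_to_integer_checked(data, dtype, name="<unnamed>"):
    """Round and cast float data to an integer dtype.

    Hypothetical helper: any RuntimeWarning raised during the cast
    (for example for NaN values) is turned into a ValueError that
    names the offending variable.
    """
    with warnings.catch_warnings():
        # Escalate RuntimeWarning to an exception inside this block only.
        warnings.simplefilter("error", category=RuntimeWarning)
        try:
            return np.around(data).astype(dtype)
        except RuntimeWarning as err:
            raise ValueError(
                f"variable {name!r} cannot be cast to {np.dtype(dtype)}: "
                "it contains values (such as NaN) that are invalid for "
                "that dtype and no _FillValue is set"
            ) from err


# Finite data casts cleanly and no error is raised.
encoded = cast_to_integer_checked(np.arange(2, dtype=np.float32), "uint16", name="x")
```

The advantage over warning up front is that clean data, such as a well-formed coordinate variable, passes silently, while data that really does contain NaN fails with a message pointing at the variable.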

kmuehlbauer added the topic-CF conventions label and removed the needs triage label May 12, 2025
@djhoese
Contributor Author

djhoese commented May 12, 2025

Oh I see. Thanks. It is possible I updated my code to fix a warning from a CF checker (remove the fill value) and then ignored the warnings created by xarray for not having the fill value.
