clarify lazy behaviour and eager loading chunks=None in open_*-functions #10627
looks brilliant to me, cheers very much @kmuehlbauer 🍻
@keewis It would be great if you could have a look at the wording here, too, before I get this in. Thanks!
Thanks very much for this @kmuehlbauer and cheers for reviewing @dcherian 🍻 I was talking to @pp-mo about this in pp-mo/ncdata#145, where we are trying to pop in a user warning if any of the Zarr-loaded Xarray variables have realized data, i.e. NumPy ndarrays. It's actually rather difficult for me to write a robust warning that neither emits false alarms nor misses true positives, so I thought I'd ask whether you'd be willing to pop such a warning into Xarray. I can most definitely open a new issue on this, but I thought I'd first touch base with you. Something à la:
import warnings
import numpy as np


def _raise_warning(var):
    """Emit a UserWarning if the variable's data is not lazy."""
    warn_msg = (
        f"Variable {var.name}{var.dims} has fully realized "
        "data; if you need lazy data, then add "
        'chunks={} or chunks="auto" as argument to Xarray open_dataset.'
    )
    warnings.warn(warn_msg, UserWarning, stacklevel=2)

in func: or class:
    # realized (eager) data is a plain NumPy array
    if isinstance(var.data, np.ndarray):
        _raise_warning(var)
This only happens when we create a Pandas index, so I don't think a warning is needed. It would certainly be good to make our tests much better at checking these things.
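To make the false-alarm concern concrete, here is a minimal, standard-library-only sketch of the check discussed above. It is not Xarray code: `EagerArray`, `LazyArray`, `Var`, `is_index_variable` and `warn_if_realized` are hypothetical stand-ins invented for illustration. It assumes that a 1-D variable named after its own dimension is a dimension coordinate, which is loaded eagerly to build an index and so should not be flagged.

```python
import warnings
from dataclasses import dataclass


class EagerArray:
    """Hypothetical stand-in for an in-memory array (e.g. numpy.ndarray)."""


class LazyArray:
    """Hypothetical stand-in for a chunked, lazy array (e.g. a dask array)."""


@dataclass
class Var:
    """Hypothetical stand-in for a named Xarray variable."""

    name: str
    dims: tuple
    data: object


def is_index_variable(var):
    # Assumption: a 1-D variable named after its own dimension is a
    # dimension coordinate, which is realized eagerly to build an index.
    return var.dims == (var.name,)


def warn_if_realized(variables):
    """Warn for each non-index variable whose data is already in memory."""
    flagged = []
    for var in variables:
        if is_index_variable(var):
            continue  # eager by design, so flagging it would be a false alarm
        if isinstance(var.data, EagerArray):
            warnings.warn(
                f"Variable {var.name}{var.dims} has fully realized data; "
                'pass chunks={} or chunks="auto" to open_dataset for lazy data.',
                UserWarning,
                stacklevel=2,
            )
            flagged.append(var.name)
    return flagged
```

With these stand-ins, an eager `time` coordinate on dimension `time` is skipped, an eager data variable is flagged, and a lazy data variable passes silently. A real check would test against `numpy.ndarray` and inspect the dataset's actual indexes instead of relying on the naming assumption.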
Co-authored-by: Deepak Cherian <[email protected]>
well, it happens with Hopefully with this in now, no more numpties like I was 🍺
* main: (46 commits)
  use the new syntax of ignoring bots (pydata#10668)
  modification methods on `Coordinates` (pydata#10318)
  Silence warnings from test_tutorial.py (pydata#10661)
  test: update write_empty test for zarr 3.1.2 (pydata#10665)
  Bump actions/checkout from 4 to 5 in the actions group (pydata#10652)
  Add load_datatree function (pydata#10649)
  Support compute=False from DataTree.to_netcdf (pydata#10625)
  Fix typos (pydata#10655)
  In case of misconfiguration of dataset.encoding `unlimited_dims` warn instead of raise (pydata#10648)
  fix ``auto_complex`` for ``open_datatree`` (pydata#10632)
  Fix bug indexing with boolean scalars (pydata#10635)
  Improve DataTree typing (pydata#10644)
  Update Cartopy and Iris references (pydata#10645)
  Empty release notes (pydata#10642)
  release notes for v2025.08.0 (pydata#10641)
  Fix `ds.merge` to prevent altering original object depending on join value (pydata#10596)
  Add asynchronous load method (pydata#10327)
  Add DataTree.prune() method … (pydata#10598)
  Avoid refining parent dimensions in NetCDF files (pydata#10623)
  clarify lazy behaviour and eager loading chunks=None in open_*-functions (pydata#10627)
  ...
WritableCFDataStore
realizes variable data when loading with object stored Zarr store #10612
whats-new.rst