fix auto_complex for open_datatree #10632
To test this successfully I added a |
@TomNicholas One quick question, would this work? Or do you see another way of testing this? Thanks!
The failing test seems to reveal an alignment issue after the merge of #10623 (cc @shoyer). A redefinition like the one below can be written, but the file can't be loaded again using datatree.
MCVE:
base = xr.Dataset(coords={"x": [1, 2]})
child = xr.Dataset(coords={"x": [1, 2, 3]})
base.to_netcdf("test.nc", mode="w")
child.to_netcdf("test.nc", group="child", mode="a")
ds = xr.open_datatree("test.nc")
Error traceback:
---------------------------------------------------------------------------
AlignmentError Traceback (most recent call last)
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/datatree.py:168, in check_alignment(path, node_ds, parent_ds, children)
167 try:
--> 168 align(node_ds, parent_ds, join="exact", copy=False)
169 except ValueError as e:
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/structure/alignment.py:968, in align(join, copy, indexes, exclude, fill_value, *objects)
960 aligner = Aligner(
961 objects,
962 join=join,
(...)
966 fill_value=fill_value,
967 )
--> 968 aligner.align()
969 return aligner.results
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/structure/alignment.py:661, in Aligner.align(self)
660 self.align_indexes()
--> 661 self.assert_unindexed_dim_sizes_equal()
663 if self.join == "override":
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/structure/alignment.py:523, in Aligner.assert_unindexed_dim_sizes_equal(self)
522 if len(sizes) > 1:
--> 523 raise AlignmentError(
524 f"cannot reindex or align along dimension {dim!r} "
525 f"because of conflicting dimension sizes: {sizes!r}" + add_err_msg
526 )
AlignmentError: cannot reindex or align along dimension 'x' because of conflicting dimension sizes: {2, 3}
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
Cell In[51], line 1
----> 1 ds = xr.open_datatree("test.nc")
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/backends/api.py:1220, in open_datatree(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, create_default_indexes, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
1208 decoders = _resolve_decoders_kwargs(
1209 decode_cf,
1210 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
1216 decode_coords=decode_coords,
1217 )
1218 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
-> 1220 backend_tree = backend.open_datatree(
1221 filename_or_obj,
1222 drop_variables=drop_variables,
1223 **decoders,
1224 **kwargs,
1225 )
1227 tree = _datatree_from_backend_datatree(
1228 backend_tree,
1229 filename_or_obj,
(...)
1240 **kwargs,
1241 )
1243 return tree
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/backends/netCDF4_.py:738, in NetCDF4BackendEntrypoint.open_datatree(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, format, clobber, diskless, persist, auto_complex, lock, autoclose, **kwargs)
698 def open_datatree(
699 self,
700 filename_or_obj: T_PathFileOrDataStore,
(...)
717 **kwargs,
718 ) -> DataTree:
719 groups_dict = self.open_groups_as_dict(
720 filename_or_obj,
721 mask_and_scale=mask_and_scale,
(...)
735 **kwargs,
736 )
--> 738 return datatree_from_dict_with_io_cleanup(groups_dict)
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/backends/common.py:276, in datatree_from_dict_with_io_cleanup(groups_dict)
274 """DataTree.from_dict with file clean-up."""
275 try:
--> 276 tree = DataTree.from_dict(groups_dict)
277 except Exception:
278 for ds in groups_dict.values():
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/datatree.py:1221, in DataTree.from_dict(cls, d, name)
1219 else:
1220 raise TypeError(f"invalid values: {data}")
-> 1221 obj._set_item(
1222 path,
1223 new_node,
1224 allow_overwrite=False,
1225 new_nodes_along_path=True,
1226 )
1228 # TODO: figure out why mypy is raising an error here, likely something
1229 # to do with the return type of Dataset.copy()
1230 return obj
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/treenode.py:650, in TreeNode._set_item(self, path, item, new_nodes_along_path, allow_overwrite)
648 raise KeyError(f"Already a node object at path {path}")
649 else:
--> 650 current_node._set(name, item)
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/datatree.py:967, in DataTree._set(self, key, val)
965 new_node = val.copy(deep=False)
966 new_node.name = key
--> 967 new_node._set_parent(new_parent=self, child_name=key)
968 else:
969 if not isinstance(val, DataArray | Variable):
970 # accommodate other types that can be coerced into Variables
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/treenode.py:115, in TreeNode._set_parent(self, new_parent, child_name)
113 self._check_loop(new_parent)
114 self._detach(old_parent)
--> 115 self._attach(new_parent, child_name)
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/treenode.py:152, in TreeNode._attach(self, parent, child_name)
147 if child_name is None:
148 raise ValueError(
149 "To directly set parent, child needs a name, but child is unnamed"
150 )
--> 152 self._pre_attach(parent, child_name)
153 parentchildren = parent._children
154 assert not any(child is self for child in parentchildren), (
155 "Tree is corrupt."
156 )
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/datatree.py:551, in DataTree._pre_attach(self, parent, name)
549 node_ds = self.to_dataset(inherit=False)
550 parent_ds = parent._to_dataset_view(rebuild_dims=False, inherit=True)
--> 551 check_alignment(path, node_ds, parent_ds, self.children)
552 _deduplicate_inherited_coordinates(self, parent)
File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/datatree.py:172, in check_alignment(path, node_ds, parent_ds, children)
170 node_repr = _indented(_without_header(repr(node_ds)))
171 parent_repr = _indented(dims_and_coords_repr(parent_ds))
--> 172 raise ValueError(
173 f"group {path!r} is not aligned with its parents:\n"
174 f"Group:\n{node_repr}\nFrom parents:\n{parent_repr}"
175 ) from e
177 if children:
178 if parent_ds is not None:
ValueError: group '/child' is not aligned with its parents:
Group:
Dimensions: (x: 3)
Coordinates:
x (x) int64 24B ...
Data variables:
*empty*
From parents:
Dimensions: (x: 2)
Coordinates:
x (x) int64 16B ...
Never mind, I was wrongly assuming that redefinitions were possible. I've consulted the docs:
"The constraint that this puts on a DataTree is that dimensions and indices that are inherited must be aligned with any direct descendant node’s existing dimension or index. This allows descendants to use dimensions defined in ancestor nodes, without duplicating that information. But as a consequence, if a dimension-name is defined in on a node and that same dimension-name exists in one of its ancestors, they must align (have the same index and size)."
So all good here. I'll think about how to fix or skip that particular test. Suggestions welcome.
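To make the quoted rule concrete, here is a minimal pure-Python sketch of the size half of that constraint. It is not xarray's implementation: `find_misaligned_groups` and its `group_dims` input are hypothetical names introduced only for illustration, and the real check also compares index values, not just dimension sizes.

```python
from pathlib import PurePosixPath


def find_misaligned_groups(group_dims):
    """Return (path, dim, own_size, inherited_size) for each conflicting dimension.

    ``group_dims`` (a hypothetical structure) maps a group path such as
    "/child" to a dict of dimension name -> size. A node conflicts when it
    redefines a dimension that one of its ancestors already defines with a
    different size.
    """
    conflicts = []
    for path, dims in sorted(group_dims.items()):
        # PurePosixPath("/a/b").parents yields "/a" and then "/".
        for ancestor in PurePosixPath(path).parents:
            inherited = group_dims.get(str(ancestor), {})
            for dim, size in dims.items():
                if dim in inherited and inherited[dim] != size:
                    conflicts.append((path, dim, size, inherited[dim]))
    return conflicts


# Sizes taken from the MCVE: the root defines x with length 2,
# while the group "/child" redefines x with length 3.
mcve = {"/": {"x": 2}, "/child": {"x": 3}}
conflicts = find_misaligned_groups(mcve)
print(conflicts)
```

With the MCVE sizes this reports the `/child` group, matching the `{2, 3}` size conflict in the traceback. A child that only adds new dimensions, or repeats a parent dimension with the same size, produces no conflict.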
This is ready for review.
Thanks Kai, looks great!
Thanks @shoyer for the review!
* main: (46 commits)
  use the new syntax of ignoring bots (pydata#10668)
  modification methods on `Coordinates` (pydata#10318)
  Silence warnings from test_tutorial.py (pydata#10661)
  test: update write_empty test for zarr 3.1.2 (pydata#10665)
  Bump actions/checkout from 4 to 5 in the actions group (pydata#10652)
  Add load_datatree function (pydata#10649)
  Support compute=False from DataTree.to_netcdf (pydata#10625)
  Fix typos (pydata#10655)
  In case of misconfiguration of dataset.encoding `unlimited_dims` warn instead of raise (pydata#10648)
  fix ``auto_complex`` for ``open_datatree`` (pydata#10632)
  Fix bug indexing with boolean scalars (pydata#10635)
  Improve DataTree typing (pydata#10644)
  Update Cartopy and Iris references (pydata#10645)
  Empty release notes (pydata#10642)
  release notes for v2025.08.0 (pydata#10641)
  Fix `ds.merge` to prevent altering original object depending on join value (pydata#10596)
  Add asynchronous load method (pydata#10327)
  Add DataTree.prune() method … (pydata#10598)
  Avoid refining parent dimensions in NetCDF files (pydata#10623)
  clarify lazy behaviour and eager loading chunks=None in open_*-functions (pydata#10627)
  ...
whats-new.rst
In addition to the fix, this pipes `TestNetCDF4Data` through `open_datatree` as a regression test on that code path.