Skip to content

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Sep 8, 2024

Previously, we were copying parent coordinates/dimensions onto all child nodes. This is not obvious in the current repr, but you can see it from looking at the private ._node_coord_variables and ._node_dims.

To make the use of _to_dataset_view() little more obvious, I've added a required boolean inherited argument.

Fixes pydata#9454

Previously, we were copying parent coordinates/dimensions onto all
child nodes. This is not obvious in the current repr, but you can see
it from looking at the private `._node_coord_variables` and
`._node_dims`.

To make the use of `_to_dataset_view()` little more obvious, I've added
a required boolean `inherited` argument.
@TomNicholas TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Sep 9, 2024
Copy link
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @shoyer ! That was quite a subtle bug.

My review comments are just about clarifying code comments.

@TomNicholas TomNicholas added the plan to merge Final call for comments label Sep 9, 2024
@shoyer shoyer merged commit 40666b2 into pydata:main Sep 9, 2024
34 checks passed
@shoyer shoyer deleted the copy-inheritance branch September 9, 2024 16:23
@shoyer
Copy link
Member Author

shoyer commented Sep 9, 2024

Thanks Tom for the comment suggestions!

dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main: (26 commits)
  Forbid modifying names of DataTree objects with parents (pydata#9494)
  DAS-2155 - Merge datatree documentation into main docs. (pydata#9033)
  Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378)
  Ensure TreeNode doesn't copy in-place (pydata#9482)
  `open_groups` for zarr backends (pydata#9469)
  Update pyproject.toml (pydata#9484)
  New whatsnew section (pydata#9483)
  Release notes for v2024.09.0 (pydata#9480)
  Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451)
  Rename DataTree's "ds" and "data" to "dataset" (pydata#9476)
  Update DataTree repr to indicate inheritance (pydata#9470)
  Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460)
  Repo checker (pydata#9450)
  Add days_in_year and decimal_year to dt accessor (pydata#9105)
  remove parent argument from DataTree.__init__ (pydata#9465)
  Fix inheritance in DataTree.copy() (pydata#9457)
  Implement `DataTree.__delitem__` (pydata#9453)
  Add ASV for datatree.from_dict (pydata#9459)
  Make the first argument in DataTree.from_dict positional only (pydata#9446)
  Fix typos across the code, doc and comments (pydata#9443)
  ...
dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main: (29 commits)
  Release notes for v2024.09.0 (pydata#9480)
  Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451)
  Rename DataTree's "ds" and "data" to "dataset" (pydata#9476)
  Update DataTree repr to indicate inheritance (pydata#9470)
  Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460)
  Repo checker (pydata#9450)
  Add days_in_year and decimal_year to dt accessor (pydata#9105)
  remove parent argument from DataTree.__init__ (pydata#9465)
  Fix inheritance in DataTree.copy() (pydata#9457)
  Implement `DataTree.__delitem__` (pydata#9453)
  Add ASV for datatree.from_dict (pydata#9459)
  Make the first argument in DataTree.from_dict positional only (pydata#9446)
  Fix typos across the code, doc and comments (pydata#9443)
  DataTree should not be "Generic" (pydata#9445)
  Disallow passing a DataArray as data into the DataTree constructor (pydata#9444)
  Support additional dtypes in `resample` (pydata#9413)
  Shallow copy parent and children in DataTree constructor (pydata#9297)
  Bump minimum versions for dependencies (pydata#9434)
  Always include at least one category in random test data (pydata#9436)
  Avoid deep-copy when constructing groupby codes (pydata#9429)
  ...
hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024
* Fix inheritance in DataTree.copy()

Fixes pydata#9454

Previously, we were copying parent coordinates/dimensions onto all
child nodes. This is not obvious in the current repr, but you can see
it from looking at the private `._node_coord_variables` and
`._node_dims`.

To make the use of `_to_dataset_view()` little more obvious, I've added
a required boolean `inherited` argument.

* typing error

* add missing inherited argument

* Apply suggestions from code review

Co-authored-by: Tom Nicholas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tweaks to from_dict

* add issue link

---------

Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments topic-DataTree Related to the implementation of a DataTree class
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deep copying a datatree somehow duplicates data onto child nodes
2 participants