Deep copy not deep? #9775

Closed
5 tasks done
whophil opened this issue Nov 13, 2024 · 2 comments
whophil commented Nov 13, 2024

What happened?

I have a NetCDF file produced by an upstream process. After loading it into an xarray dataset and calling Dataset.copy(deep=True), I find that some coordinates still share memory with the dataset they were copied from.

The NetCDF file in question:
data_clean.nc.zip

What did you expect to happen?

I expected a deep copy to be deep, with no memory shared between the original and the copy.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np

ds = xr.load_dataset("data_clean.nc")
ds2 = ds.copy(deep=True)

print(ds.variables["FREQUENCY"].data)
ds2.variables["FREQUENCY"].data[:] = 99.99

# This should be the original array, but instead it's all 99.99
print(ds.variables["FREQUENCY"].data)

# I would expect both of these to print False, but it prints True, False
print(np.shares_memory(ds.variables["FREQUENCY"].data, ds2.variables["FREQUENCY"].data))
print(np.shares_memory(ds.variables["AMPLITUDE"].data, ds2.variables["AMPLITUDE"].data))

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

import xarray as xr
import numpy as np

ds = xr.load_dataset("data_clean.nc")
ds2 = ds.copy(deep=True)

print(ds.variables["FREQUENCY"].data)
ds2.variables["FREQUENCY"].data[:] = 99.99

# This should be the original array, but instead it's all 99.99
print(ds.variables["FREQUENCY"].data)

# I would expect both of these to print False, but it prints True, False
print(np.shares_memory(ds.variables["FREQUENCY"].data, ds2.variables["FREQUENCY"].data))
print(np.shares_memory(ds.variables["AMPLITUDE"].data, ds2.variables["AMPLITUDE"].data))

Anything else we need to know?

No response

Environment

>>> xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.15 | packaged by conda-forge | (main, Sep 20 2024, 16:31:41) [Clang 17.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.9.0
pandas: 2.2.3
numpy: 2.1.1
scipy: 1.14.1
netCDF4: 1.7.1
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.9.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.9.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.1.0
pip: 24.2
conda: None
pytest: 8.3.3
mypy: None
IPython: None
sphinx: None
whophil added the bug and needs triage labels Nov 13, 2024

kmuehlbauer (Contributor) commented Nov 15, 2024

Duplicate of #7463. Your FREQUENCY is also a coordinate.
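
To see which variables in a dataset are index coordinates and whether a deep copy shares memory with the original, a small diagnostic along these lines may help. This is only a sketch: `copy_report` is a hypothetical helper, and the in-memory dataset is a made-up stand-in for data_clean.nc, so the memory-sharing result for FREQUENCY may differ from the one reported above depending on the xarray and pandas versions.

```python
import numpy as np
import xarray as xr


def copy_report(original: xr.Dataset, copied: xr.Dataset) -> dict:
    """Per variable: is it an index variable, and does the copy share memory?"""
    report = {}
    for name in original.variables:
        a = original.variables[name]
        b = copied.variables[name]
        report[str(name)] = {
            # Dimension coordinates are stored as IndexVariable objects.
            "is_index": isinstance(a, xr.IndexVariable),
            "shares_memory": bool(np.shares_memory(a.data, b.data)),
        }
    return report


# Made-up stand-in for the attached file (an assumption, not the real data):
# FREQUENCY is a dimension coordinate, AMPLITUDE an ordinary data variable.
ds = xr.Dataset(
    {"AMPLITUDE": ("FREQUENCY", np.array([0.1, 0.2, 0.3]))},
    coords={"FREQUENCY": np.array([100.0, 125.0, 160.0])},
)
report = copy_report(ds, ds.copy(deep=True))
for name, flags in report.items():
    print(name, flags)
```

Running the same helper on the dataset loaded from the attached file should show FREQUENCY flagged as an index, which is the difference from AMPLITUDE that the duplicate issue is about.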

kmuehlbauer removed the needs triage label Nov 15, 2024

kmuehlbauer (Contributor) commented

One more detail here:

ds2.variables["FREQUENCY"].data[:] = 99.99

Writing to .data in place like this sidesteps all of xarray's safety checks, so you need to know what you are doing.

This would be the canonical invocation, which raises an error instead of silently modifying the index:

ds2["FREQUENCY"][:] = 99.99
[  100.   125.   160.   200.   250.   315.   400.   500.   630.   800.
  1000.  1250.  1600.  2000.  2500.  3150.  4000.  5000.  6300.  8000.
 10000.]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 8
      5 ds2 = ds.copy(deep=True)
      7 print(ds.variables["FREQUENCY"].data)
----> 8 ds2.variables["FREQUENCY"][:] = 99.99
     10 # This should be the original array, but instead it's all 99.99
     11 print(ds.variables["FREQUENCY"].data)

File /home/kai/data/mambaforge/envs/xr_312_np2/lib/python3.12/site-packages/xarray/core/variable.py:2710, in IndexVariable.__setitem__(self, key, value)
   2709 def __setitem__(self, key, value):
-> 2710     raise TypeError(f"{type(self).__name__} values cannot be modified")

TypeError: IndexVariable values cannot be modified
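
Since index values cannot be modified in place, one way to get a copy with different coordinate values is to replace the coordinate rather than write into it. The sketch below uses `Dataset.assign_coords`, which returns a new dataset; it is offered as an illustration, not as the fix the maintainers prescribe, and the small in-memory dataset is a made-up stand-in for data_clean.nc.

```python
import numpy as np
import xarray as xr

# Made-up stand-in for the attached file (an assumption, not the real data).
ds = xr.Dataset(
    {"AMPLITUDE": ("FREQUENCY", np.array([0.1, 0.2, 0.3]))},
    coords={"FREQUENCY": np.array([100.0, 125.0, 160.0])},
)
ds2 = ds.copy(deep=True)

# Build the new coordinate values and swap them in. assign_coords returns a
# new Dataset, so the original ds is never written to.
new_freq = np.full(ds2.sizes["FREQUENCY"], 99.99)
ds2 = ds2.assign_coords(FREQUENCY=("FREQUENCY", new_freq))

print(ds["FREQUENCY"].values)   # original values
print(ds2["FREQUENCY"].values)  # replaced values
```

Because the coordinate is replaced with a freshly built index, the change stays in ds2 regardless of whether the deep copy shared memory with ds.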
