DataArray with multiple (Pandas)Indexes on the same dimension is impossible to align #8236

Closed
4 tasks done
headtr1ck opened this issue Sep 27, 2023 · 3 comments

Collaborator

headtr1ck commented Sep 27, 2023

What happened?

I have a DataArray with a single dimension and multiple (Pandas)Indexes assigned to various coordinates for efficient indexing using sel.

Edit: the problem is even worse than originally described below: such a DataArray breaks all alignment and it's basically unusable...


When I try to add an additional coordinate without any index (using the plain (dimension, values) tuple form), I get a ValueError about aligning with conflicting indexes.

If the original DataArray only has a single (Pandas)Index everything works as expected.

What did you expect to happen?

I expected that I can simply assign new coordinates without an index.

Minimal Complete Verifiable Example

import xarray as xr

da = xr.DataArray(
    [1, 2, 3],
    dims="t",
    coords={
        "a": ("t", [3, 4, 5]),
        "b": ("t", [5, 6, 7])
    }
)

# set one index
da2 = da.set_xindex("a")

# set second index (same dimension, maybe that's a problem?)
da3 = da2.set_xindex("b")

# this works
da2.coords["c"] = ("t", [2, 3, 4])

# this does not
da3.coords["c"] = ("t", [2, 3, 4])

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions: 't' (2 conflicting indexes)
Conflicting indexes may occur when

  • they relate to different sets of coordinate and/or dimension names
  • they don't have the same type
  • they may be used to reindex data along common dimensions

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.9.10 (main, Mar 21 2022, 13:08:11)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.66.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0

xarray: 2022.12.0
pandas: 2.0.2
numpy: 1.24.3
scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.6.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 58.1.0
pip: 21.2.4
conda: None
pytest: 7.3.2
mypy: 1.0.0
IPython: 8.8.0
sphinx: None

I have not yet tried this with a newer version of xarray.

@headtr1ck added the "bug" and "needs triage" labels Sep 27, 2023
@dcherian added the "topic-indexing" label and removed the "needs triage" label Sep 27, 2023
@headtr1ck
Collaborator Author

Actually, I cannot do any operation that requires alignment.
Things like adding another DataArray do not work.
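
A minimal sketch of this broader failure, assuming the same two-index setup as in the example above. The array `other` and the `outcome` variable are hypothetical names introduced only for illustration, and whether the addition raises may depend on the installed xarray version, so the sketch reports the result instead of asserting it.

```python
import xarray as xr

# Same setup as the original example: one dimension "t", two indexed coordinates.
da = xr.DataArray(
    [1, 2, 3],
    dims="t",
    coords={"a": ("t", [3, 4, 5]), "b": ("t", [5, 6, 7])},
)
da3 = da.set_xindex("a").set_xindex("b")

# Hypothetical second array along the same dimension, without any index.
other = xr.DataArray([10, 20, 30], dims="t")

try:
    # Binary arithmetic aligns both operands first.
    result = da3 + other
except ValueError as err:
    outcome = f"alignment failed: {err}"
else:
    outcome = "alignment succeeded"

print(outcome)
```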

@headtr1ck changed the title from "Cannot assign new coordinates on DataArray with multiple PandasIndexes" to "Cannot align DataArray with multiple PandasIndexes" Sep 28, 2023
@headtr1ck changed the title from "Cannot align DataArray with multiple PandasIndexes" to "DataArray with multiple (Pandas)Indexes on the same dimension is impossible to align" Sep 30, 2023
Member

benbovy commented Oct 1, 2023

This is a duplicate of #7695. It is not really a bug but rather a severe current limitation of using multiple indexes along the same dimension(s).

We should at least document it. I think a reasonable way to unlock many cases, such as assigning new unindexed coordinates, would be to relax this constraint a bit by looking at the results returned by Index.reindex_like() for all indexes: if several indexers are found along the same dimension, check whether they are equal and raise otherwise.
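
The relaxed check described above could look roughly like the following. This is only a sketch in plain Python, not xarray's implementation: `merge_indexers` is a hypothetical helper, and it assumes each index has already produced a mapping from dimension name to a positional indexer, represented here as a plain list of integers.

```python
from typing import Dict, Hashable, List, Sequence


def merge_indexers(
    per_index: Sequence[Dict[Hashable, List[int]]],
) -> Dict[Hashable, List[int]]:
    """Combine the indexers produced by several indexes (hypothetical helper).

    Indexers along the same dimension are accepted only if they are equal;
    otherwise a ValueError is raised.
    """
    merged: Dict[Hashable, List[int]] = {}
    for indexers in per_index:
        for dim, indexer in indexers.items():
            if dim not in merged:
                merged[dim] = list(indexer)
            elif merged[dim] != list(indexer):
                raise ValueError(
                    f"conflicting indexers along dimension {dim!r}"
                )
    return merged


# Two indexes on "t" that agree on the positions to select: accepted.
agreeing = merge_indexers([{"t": [0, 1, 2]}, {"t": [0, 1, 2]}])

# Two indexes on "t" that disagree: rejected.
try:
    merge_indexers([{"t": [0, 1, 2]}, {"t": [2, 1, 0]}])
    conflict_raised = False
except ValueError:
    conflict_raised = True
```

With such a check, two indexes on the same dimension would only block alignment when they actually disagree about how to reindex the data.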

@headtr1ck
Collaborator Author

Ok, with my original error I did not find the duplicate issue.
Closing this now, thanks!
