Skip to content

Improve error handling in zarr.open_group #2821

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
aladinor opened this issue Feb 12, 2025 · 8 comments
Open

Improve error handling in zarr.open_group #2821

aladinor opened this issue Feb 12, 2025 · 8 comments
Labels
bug Potential issues with the zarr-python library

Comments

@aladinor
Copy link

Zarr version

v3.0.1

Numcodecs version

v0.15.0

Python Version

3.12

Operating System

Linux

Installation

conda

Description

Based on this comment the error message should be improved when attempting to open a Zarr group.

Steps to reproduce

import zarr

# Create the root group in a directory
root = zarr.open('example.zarr', mode='w')
# Create variable 'w' at the root level
root.create_array('w', shape=(128,), dtype='float64')
# Create group 'a' and add variable 'A'
group_a = root.create_group('a')
group_a.create_array('A', shape=(128, 256), dtype='float64', chunks=(64, 256))

# Create group 'c' and add variables 'w' and 'z'
group_c = root.create_group('c')
group_c.create_array('w', shape=(128,), dtype='float64')
group_c.create_array('z', shape=(128, 256), dtype='float64', chunks=(64, 256))

# Create subgroup 'd' under group 'c' and add variable 'G'
group_d = group_c.create_group('d')
group_d.create_array('G', shape=(128, 256), dtype='float64', chunks=(64, 256))

# Consolidate metadata
zarr.consolidate_metadata('example.zarr')

print(root.tree())
/
├── a
│   └── A (128, 256) float64
├── c
│   ├── d
│   │   └── G (128, 256) float64
│   ├── w (128,) float64
│   └── z (128, 256) float64
└── w (128,) float64

when reading a subgroup that doesn't exist,

open_kwargs = {'mode': 'r', 'path': '/w', 'storage_options': None, 'synchronizer': None, 'zarr_format': None}
  root = zarr.open_group(
      'example.zarr',
      **open_kwargs
  )

Traceback (most recent call last):
  File "/snap/pycharm-community/439/plugins/python-ce/helpers/pydev/pydevd.py", line 1570, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/snap/pycharm-community/439/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/media/alfonso/drive/Alfonso/python/xarray/delete_Zarr.py", line 40, in <module>
    open_zarr_store()
  File "/media/alfonso/drive/Alfonso/python/xarray/delete_Zarr.py", line 31, in open_zarr_store
    root = zarr.open_group(
           ^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/_compat.py", line 43, in inner_f
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/api/synchronous.py", line 524, in open_group
    sync(
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/sync.py", line 142, in sync
    raise return_result
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/sync.py", line 98, in _runner
    return await coro
           ^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/api/asynchronous.py", line 807, in open_group
    return await AsyncGroup.open(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/group.py", line 553, in open
    return cls._from_bytes_v3(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/group.py", line 617, in _from_bytes_v3
    return cls.from_dict(store_path, group_metadata)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/group.py", line 626, in from_dict
    metadata=GroupMetadata.from_dict(data),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alfonso/mambaforge/envs/xarray-tests/lib/python3.12/site-packages/zarr/core/group.py", line 388, in from_dict
    assert data.pop("node_type", None) in ("group", None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

Additional output

No response

@aladinor aladinor added the bug Potential issues with the zarr-python library label Feb 12, 2025
@d-v-b
Copy link
Contributor

d-v-b commented Feb 12, 2025

thanks for the report, we should definitely fix this!

@TomNicholas
Copy link
Member

What should the error be here? KeyError? ValueError? Something more zarr-specific?

@d-v-b
Copy link
Contributor

d-v-b commented Mar 5, 2025

At this level of the API something KeyError-like seems correct here.

@jhamman
Copy link
Member

jhamman commented Mar 5, 2025

IIRC, 2.18 raises a KeyError, so like @d-v-b said, that is a good place to start. If we want to go beyond that, I would say a Zarr-specific error that inherits from KeyError would be prudent.

@TomNicholas
Copy link
Member

Turns out you can reproduce a similar problem with a much simpler example:

store = MemoryStore

root = zarr.open(store=store, mode='w')
root.create_group('foo')

# open non-existent group
zarr.open_group(
    store=store,
    path='/bar',
    mode='r',
)
FileNotFoundError: Unable to find group: memory://4434113664/bar

Presumably this should also raise a KeyError - I'm going to try and fix this.

@TomNicholas
Copy link
Member

@d-v-b @jhamman I'm not sure how deep in the code this should be fixed. There are many places in the generic store-implementation-agnostic API that raise FileNotFoundError, which already seems wrong to me: e.g. MemoryStores don't have files! Shall I just change them all to KeyError?

@d-v-b
Copy link
Contributor

d-v-b commented Mar 5, 2025

I think we should consider this carefully, in part because if any of these exceptions can bubble up the user then they are effectively public API.

If we ignore off-target effects of this change, I don't see what FileNotFoundError gives us over KeyError in the context of our store API, so I'd be fine with the proposed change. But I would really like some careful analysis of this first.

@TomNicholas
Copy link
Member

TomNicholas commented Mar 5, 2025

I think there's two separate issues. One is that the code can get into an inconsistent state, which is caught by this assert:

assert data.pop("node_type", None) in ("group", None)

The second is that the code raises FileNotFoundErrors in various places that shouldn't even be aware of the concept of a file (because they could be used to interact with Store implementations that don't use files, such as MemoryStore).

raise FileNotFoundError(f"Unable to find group: {store_path}")

raise FileNotFoundError(store_path)

This should either be replaced with KeyError or possibly raise KeyError from FileNotFoundError.

I think both should be fixed, but I might see if I can fix the former separately from fixing the latter, as I can see that changing the latter might require some discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Potential issues with the zarr-python library
Projects
None yet
Development

No branches or pull requests

4 participants