chore/handle numcodecs codecs #3376
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3376      +/-   ##
==========================================
+ Coverage   94.55%   94.60%   +0.04%
==========================================
  Files          79       81       +2
  Lines        9447     9641     +194
==========================================
+ Hits         8933     9121     +188
- Misses        514      520       +6
src/zarr/codecs/_numcodecs.py
Outdated
    def _encode(self, chunk_data: Buffer, prototype: BufferPrototype) -> Buffer:
        encoded = self._codec.encode(chunk_data.as_array_like())
        if isinstance(encoded, np.ndarray):  # Required for checksum codecs
Can we know statically which are checksum codecs without the isinstance check?
n.b., this was copy + pasted from numcodecs, but I think the answer is "no"
src/zarr/codecs/_numcodecs.py
Outdated
    codec_name: str
    codec_config: dict[str, JSON]

    def __init_subclass__(cls, *, codec_name: str | None = None, **kwargs: Any) -> None:
What would a codec definition look like without this magic? I'd be fine with repeating a few things if it meant we could avoid this (and IIUC some of the complexity in __repr__ and __init__ would go away too?).
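For comparison, here is a minimal sketch of the two styles in question: passing the codec name as a class keyword handled by `__init_subclass__`, versus repeating it as an explicit class attribute. The class names and the `numcodecs.` name prefix are hypothetical, chosen for illustration, and are not taken from the PR.

```python
from typing import Any, Optional


class MagicBase:
    """Base class that receives the codec name as a class keyword."""

    codec_name: str

    def __init_subclass__(cls, *, codec_name: Optional[str] = None, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if codec_name is not None:
            # Hypothetical naming scheme, for illustration only.
            cls.codec_name = f"numcodecs.{codec_name}"


class MagicBlosc(MagicBase, codec_name="blosc"):
    """The name is derived by the hook in the base class."""


class ExplicitBase:
    """Base class with no hook; each subclass states its own name."""

    codec_name: str


class ExplicitBlosc(ExplicitBase):
    # One repeated line per codec, but nothing hidden in the base class.
    codec_name = "numcodecs.blosc"
```

Both subclasses end up with the same `codec_name`; the explicit version trades one repeated line per codec for a base class with no hook.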
src/zarr/codecs/_numcodecs.py
Outdated
    def __init__(self, **codec_config: JSON) -> None:
        super().__init__(**codec_config)
What's this do?
@pytest.mark.parametrize("codec_class", [_numcodecs.PCodec, _numcodecs.ZFPY])
def test_generic_bytes_codec(codec_class: type[_numcodecs._NumcodecsArrayBytesCodec]) -> None:
    try:
Can you explain the different cases here? Do the pcodec and zfpy codecs depend on optional numcodecs dependencies? And if so, is that the only reason we might not be able to run this test?
If so, can we maybe do pytest.importorskip(dependency_name) and then assume we have it and avoid the xfails?
Oh, I guess we (or numcodecs?) raise a ValueError...? If that's numcodecs, then whatever. But I don't think we should raise a ValueError if a dependency is missing.
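One way to read that last point: a missing optional dependency would surface as an ImportError rather than a ValueError, which is also the condition that pytest.importorskip turns into a skip at the test level. A minimal stdlib sketch follows; the helper name `require_optional` and its message format are hypothetical and are not part of numcodecs or zarr-python.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, feature: str) -> ModuleType:
    """Import an optional dependency or raise a descriptive ImportError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise as ImportError so callers can tell "not installed"
        # apart from "bad argument" (which is what ValueError suggests).
        raise ImportError(
            f"{feature} requires the optional dependency {module_name!r}"
        ) from err


# A module that is always available imports normally.
json_module = require_optional("json", "JSON support")

# A missing module raises ImportError, not ValueError.
try:
    require_optional("surely_not_installed_xyz", "Example codec")
except ImportError as err:
    message = str(err)
```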
> Can you explain the different cases here?
Spinning this question out into the main thread -- from me, the general answer to questions like this will be "no", since I am only copy+pasting stuff from numcodecs. I haven't spent too much time figuring out what this code is doing. I do think @normanrz and @TomNicholas might be able to answer some of these questions though.
Ah, I didn't realize this was mostly from numcodecs. I think that moots most of my comments aside from where in the public API we put these.
yeah I should have made more clear that this is nearly all directly copy + pasted from
…v-b/zarr-python into chore/handle-numcodecs-codecs
This PR brings in all the codecs defined in numcodecs.zarr3. After this PR is merged, we can safely replace the numcodecs.zarr3 module with reexports from zarr python, or remove numcodecs.zarr3 entirely, thereby fixing our circular dependency problem.

This PR also changes the default config to ensure that the locally-defined codecs take priority over the same codec found in the numcodecs registry.
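
The priority rule can be pictured as a two-level lookup in which a local definition shadows a registry entry of the same name. This is a rough sketch under that assumption only: the dictionaries, the `resolve_codec` helper, and the string values are hypothetical and do not reflect how zarr-python's config actually stores or resolves codecs.

```python
# Hypothetical stand-ins for "locally-defined codecs" and "the numcodecs registry".
LOCAL_CODECS = {"numcodecs.zstd": "local definition"}
REGISTRY_CODECS = {
    "numcodecs.zstd": "registry definition",
    "numcodecs.lzma": "registry definition",
}


def resolve_codec(name: str) -> str:
    """Return the local definition if one exists, else fall back to the registry."""
    if name in LOCAL_CODECS:
        return LOCAL_CODECS[name]
    if name in REGISTRY_CODECS:
        return REGISTRY_CODECS[name]
    raise KeyError(f"unknown codec: {name}")


# The name present in both places resolves to the local definition.
shadowed = resolve_codec("numcodecs.zstd")
# A name present only in the registry still resolves through the fallback.
fallback = resolve_codec("numcodecs.lzma")
```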