When opening a Zarr store, the `chunks='auto'` kwarg seems to be ignored #276
Comments
Thanks for raising this @etienneschalk !
Regardless, the behaviour you describe does sound desirable, so we should fix that somewhere in the stack. cc @jhamman |
Hello @TomNicholas ,
After testing, the behaviour of
which is consistent with your statement:
In that case, I would suggest to:
Do you think this would be a good idea? Thanks! |
Thank you for testing! Again, this is an upstream xarray issue. Datatree should follow whatever xarray's behaviour is.
This might be a good idea, but xarray currently has both |
Hi @TomNicholas
So, this means, while this discussion is not settled, implementing an Do you have a link to this discussion, by any chance? I would be interested to learn more about this. Thanks, have a nice day! |
the difference between I believe the whole Edit: this means that to get the on-disk chunking you can use |
Thanks @keewis .
100%. This kind of thing really trips up users. Do we have an open issue for that in xarray or should we make one now? |
I think the "deprecate |
Thanks for the This is really important when trying to open chunked large Zarr data with datatree to keep the original chunks. I updated my test notebook: https://github.com/etienneschalk/datatree-experimentation/blob/main/notebooks/bug-chunk-auto-not-considered.ipynb section "With chunks={} kwarg 🆗" |
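For readers following along, here is a minimal sketch of the `chunks={}` approach mentioned above. It assumes xarray, zarr and dask are installed; the helper name `open_with_stored_chunks` is hypothetical and is not part of xarray or datatree.

```python
def open_with_stored_chunks(store):
    """Open a Zarr store lazily, keeping the chunking stored on disk.

    `store` is a path or mapping pointing at an existing Zarr store.
    This helper is illustrative only; it is not an xarray or datatree API.
    """
    # Imported here so the sketch can be read and loaded without the
    # optional dependencies; calling the function needs xarray, zarr and dask.
    import xarray as xr

    # chunks={} asks xarray to use the backend's preferred chunks, which
    # for Zarr are generally the chunks recorded in the store.
    return xr.open_dataset(store, engine="zarr", chunks={})
```

Calling `open_with_stored_chunks("path/to/store.zarr").chunks` (the path is a placeholder) should then report the on-disk chunk sizes rather than a single chunk per variable.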
I tested this locally with a different example and got the same results. Here is a reproducible example:
If this is the expected result can we close this issue? |
Thanks for looking into this rabbit hole @eni-awowale ! I think we're getting a bit off track here though.
I suggest we close this issue and as we notice problems we raise dedicated new issues on the upstream repo. |
Hi,
I noticed a discrepancy between the behaviour of xarray's `open_zarr` and datatree's `open_datatree` with `engine='zarr'`.

I documented it in a pre-executed notebook available at https://github.com/etienneschalk/datatree-experimentation/blob/main/notebooks/bug-chunk-auto-not-considered.ipynb (the whole project can be cloned and executed locally if needed; it requires poetry).

To summarize:

Actual:
- `open_zarr` with `chunks='auto'`: Stored chunks are used.
- `open_datatree` with `engine='zarr'` and `chunks='auto'`: A chunk identical to the shape of the data is used. This means chunking is useless, as there is only a single chunk representing the whole dataset.

Expected:
I expected datatree to behave like xarray. Since Zarr is a format that natively handles chunks, I would have expected the stored chunks to be used when opening a Zarr store with no chunks kwarg or with `chunks='auto'`.

Thanks!
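To make the "single chunk" point concrete, here is a small standard-library sketch that counts chunks for a given array shape. The shape and chunk sizes are made-up illustrative values, and `chunk_grid` is a hypothetical helper, not something taken from xarray, datatree or the notebook.

```python
import math


def chunk_grid(shape, chunks):
    """Return the number of chunks along each dimension."""
    return tuple(math.ceil(size / chunk) for size, chunk in zip(shape, chunks))


shape = (1000, 2000)        # hypothetical array shape
stored_chunks = (100, 250)  # hypothetical chunking recorded in the Zarr store
single_chunk = shape        # one chunk as large as the data, as reported above

# Total number of chunks in each case.
n_stored = math.prod(chunk_grid(shape, stored_chunks))
n_single = math.prod(chunk_grid(shape, single_chunk))
print(n_stored, n_single)
```

With a chunk as large as the array, the grid collapses to a single block, so dask has nothing to parallelise or load piecewise.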