/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name)
    148             for dchunk in dchunks[:-1]:
    149                 if dchunk % zchunk:
--> 150                     raise NotImplementedError(
    151                         f"Specified zarr chunks encoding['chunks']={enc_chunks_tuple!r} for "
    152                         f"variable named {name!r} would overlap multiple dask chunks {var_chunks!r}. "
NotImplementedError: Specified zarr chunks encoding['chunks']=(3,) for variable named 'foo' would overlap multiple dask chunks ((1, 1, 1),). This is not implemented in xarray yet. Consider either rechunking using `chunk()` or instead deleting or modifying `encoding['chunks']`.
In this case, the error is particularly frustrating because I'm not even writing any data yet. (Also related to #2300, #4046, #4380).
There are at least two scenarios in which we might want to have more flexibility.
1. The case above, when we want to lazily initialize a Zarr array based on a Dataset, without actually computing anything.
2. The more general case, where we actually write arrays with many-to-many dask-chunk <-> zarr-chunk relationships.
For 1, I propose we add a new option like `safe_chunks=True` to `to_zarr`. `safe_chunks=False` would permit just bypassing this check.
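A minimal pure-Python sketch of how such a flag might gate the check, modeled only on the loop visible in the traceback. The function name `check_chunk_alignment` and its signature are hypothetical and not xarray's API; only the `safe_chunks` name comes from the proposal.

```python
def check_chunk_alignment(dask_chunks, zarr_chunk, safe_chunks=True):
    """Report whether dask chunks along one dimension align with a Zarr chunk.

    dask_chunks: chunk lengths along one dimension, e.g. (1, 1, 1).
    zarr_chunk: the Zarr chunk length along the same dimension.
    safe_chunks: hypothetical flag; when True, misalignment raises.
    """
    # Every dask chunk except the last must be a whole multiple of the
    # Zarr chunk; otherwise one Zarr chunk spans several dask chunks.
    aligned = all(dchunk % zarr_chunk == 0 for dchunk in dask_chunks[:-1])
    if not aligned and safe_chunks:
        raise NotImplementedError(
            f"zarr chunk {zarr_chunk!r} would overlap multiple "
            f"dask chunks {dask_chunks!r}"
        )
    return aligned


# Aligned layout: passes with the default safe_chunks=True.
print(check_chunk_alignment((3, 3, 2), 3))
# Misaligned layout from the error above: only reported, not raised.
print(check_chunk_alignment((1, 1, 1), 3, safe_chunks=False))
```

With `safe_chunks=False` the caller takes responsibility for the overlap, which is harmless in scenario 1 because no data is written.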
For 2, we could consider implementing locks. This probably has to be done at the Dask level, but it is actually not super hard to deterministically figure out which chunks need to share a lock.
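As a rough illustration of that last point, here is a pure-Python sketch for a single dimension. It maps each Zarr chunk to the dask chunks that would write into it and keeps only the Zarr chunks with more than one writer. The name `shared_zarr_chunks` is hypothetical, and nothing here reflects how Dask or xarray would actually implement locking.

```python
from collections import defaultdict
from itertools import accumulate


def shared_zarr_chunks(dask_chunks, zarr_chunk):
    """Map Zarr chunk index -> dask chunk indices, for contended chunks only.

    dask_chunks: chunk lengths along one dimension, e.g. (2, 2, 2).
    zarr_chunk: the Zarr chunk length along the same dimension.
    """
    writers = defaultdict(list)
    start = 0
    for i, stop in enumerate(accumulate(dask_chunks)):
        if stop > start:  # skip zero-length dask chunks
            first = start // zarr_chunk      # first Zarr chunk touched
            last = (stop - 1) // zarr_chunk  # last Zarr chunk touched
            for z in range(first, last + 1):
                writers[z].append(i)
        start = stop
    # Only Zarr chunks written by several dask chunks need a shared lock.
    return {z: ids for z, ids in writers.items() if len(ids) > 1}


print(shared_zarr_chunks((2, 2, 2), 3))
```

Dask chunks listed under the same key would have to share a lock for that Zarr chunk. For multiple dimensions, the same computation could be applied per dimension and combined, which this sketch does not attempt.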
So far, I've been happy working around (1) by constructing synthetic dask arrays with the desired final chunks. I suspect that's even pretty efficient on the dask side, as long as everything uses Dask's HighLevelGraph for representing the underlying tasks.
Currently, `Dataset.to_zarr` will only write Zarr datasets under certain chunking conditions. If I try to violate the one-to-many condition between dask chunks and Zarr chunks, I get the `NotImplementedError` shown above.