Description
In the future, zarr files/stores will be opened with xarray.open_dataset as follows:
ds = xarray.open_dataset(store, engine="zarr", ...)
Thus, intake-xarray will (eventually) need to change how it opens zarr stores. Naively, this can be done by calling open_dataset rather than open_zarr, that is, something like:
store = 'directoryA/subdirectory/zarr_store'
self._mapper = fsspec.get_mapper(store, ...)
_open = xr.open_dataset
self._ds = _open(self._mapper, engine="zarr", ...)
However, xarray.open_dataset does not recognize the output of fsspec.get_mapper. It works if store (as defined above) is passed instead.
In xarray.open_zarr, _mapper gets wrapped in a ZarrStore and later decoded. That is, given the _mapper, the following will open the zarr store:
from xarray.backends.zarr import ZarrStore
from xarray import conventions
zarr_store = ZarrStore.open_group(_mapper, ...)
ds = conventions.decode_cf(zarr_store,...)
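The two steps above could be wrapped in a small helper. This is only a minimal sketch: the name open_zarr_from_mapper and the open_group / decode hooks are hypothetical (not part of xarray or intake-xarray), and it assumes ZarrStore.open_group and conventions.decode_cf can be called as in the lines above. The hooks exist solely so the helper can be exercised with stand-ins when no real store is at hand.

```python
def open_zarr_from_mapper(mapper, open_group=None, decode=None, **decode_kwargs):
    """Open a zarr store from a mapper in two steps: wrap it, then decode it.

    `open_group` and `decode` are hypothetical injection points; when left as
    None they fall back to the xarray callables named in the text above.
    """
    if open_group is None:
        # Imported lazily so the helper can be tried out with stand-ins.
        from xarray.backends.zarr import ZarrStore
        open_group = ZarrStore.open_group
    if decode is None:
        from xarray import conventions
        decode = conventions.decode_cf

    # Step 1: turn the mapper into a store object.
    zarr_store = open_group(mapper)
    # Step 2: apply CF decoding to obtain the dataset.
    return decode(zarr_store, **decode_kwargs)
```

With a real store this would presumably be called as open_zarr_from_mapper(fsspec.get_mapper(store)), with any keyword arguments such as decode_times forwarded to decode_cf. Note that this sketch only decodes the store; any dask chunking that open_zarr applies on top would still have to be handled separately.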
This brings two options IMHO:
- Drop fsspec.get_mapper and just pass the URL/path as the argument to xarray.open_dataset (very unlikely), or
- Follow along the lines of the pseudocode above: rather than importing and using xarray.open_dataset directly, import ZarrStore.open_group and conventions.decode_cf to open zarr stores.
It is my understanding that zarr will potentially depend more on fsspec as it develops, and thus the second option seems more likely.
Or is there another secret option number 3 I fail to see?
pydata/xarray#4003
fsspec/filesystem_spec#286
zarr-developers/zarr-python#546