[WIP] Use zarr internal LRU caching #2814

Open: wants to merge 2 commits into base: main
17 changes: 13 additions & 4 deletions xarray/backends/zarr.py
@@ -8,6 +8,7 @@
from ..core.pycompat import integer_types
from ..core.utils import FrozenOrderedDict, HiddenKeyDict
from .common import AbstractWritableDataStore, BackendArray
+from .api import _protect_dataset_variables_inplace

# need some special secret attributes to tell us the dimensions
_DIMENSION_KEY = '_ARRAY_DIMENSIONS'
@@ -355,7 +356,8 @@ def close(self):
def open_zarr(store, group=None, synchronizer=None, auto_chunk=True,
decode_cf=True, mask_and_scale=True, decode_times=True,
concat_characters=True, decode_coords=True,
-drop_variables=None, consolidated=False):
+drop_variables=None, consolidated=False, cache=False,
+max_cache_size=None):
"""Load and decode a dataset from a Zarr store.

.. note:: Experimental
@@ -408,6 +410,12 @@ def open_zarr(store, group=None, synchronizer=None, auto_chunk=True,
consolidated : bool, optional
Whether to open the store using zarr's consolidated metadata
capability. Only works for stores that have already been consolidated.
+cache : bool, optional
+If True, the zarr store is wrapped with a
+``zarr.storage.LRUStoreCache``.
+max_cache_size : int, optional
+The maximum size that the cache may grow to, in bytes. If ``None``
+(the default), the cache size is unlimited. Only used if ``cache=True``.

Returns
-------
@@ -434,11 +442,12 @@ def maybe_decode_store(store, lock=False):
store, mask_and_scale=mask_and_scale, decode_times=decode_times,
concat_characters=concat_characters, decode_coords=decode_coords,
drop_variables=drop_variables)

-# TODO: this is where we would apply caching

return ds

+if cache:
+import zarr
+store = zarr.LRUStoreCache(store, max_size=max_cache_size)

# Zarr supports a wide range of access modes, but for now xarray either
# reads or writes from a store, never both. For open_zarr, we only read
mode = 'r'
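
For readers unfamiliar with what wrapping a store in an LRU cache implies, here is a minimal stand-alone sketch of the idea. It is not zarr's implementation: `TinyLRUStoreCache`, its `hits`/`misses` counters, `cached_keys`, and the `backing` dict are hypothetical names invented for illustration, and only the standard library is used. It assumes the underlying store behaves like a read-only mapping from string keys to `bytes`, and that `max_size=None` means an unbounded cache, mirroring the `max_cache_size` docstring in this diff.

```python
from collections import OrderedDict
from collections.abc import Mapping


class TinyLRUStoreCache(Mapping):
    """Read-through LRU cache over a mapping of key -> bytes.

    Illustrative stand-in only; zarr's LRUStoreCache is more complete.
    """

    def __init__(self, store, max_size=None):
        self._store = store
        self._max_size = max_size      # None means the cache may grow without limit
        self._cache = OrderedDict()    # oldest entry first, newest entry last
        self._current_size = 0         # total bytes currently cached
        self.hits = 0
        self.misses = 0

    def __getitem__(self, key):
        if key in self._cache:
            # Cache hit: mark the key as most recently used.
            self._cache.move_to_end(key)
            self.hits += 1
            return self._cache[key]
        # Cache miss: read from the underlying store (KeyError propagates).
        value = self._store[key]
        self.misses += 1
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        size = len(value)
        if self._max_size is not None and size > self._max_size:
            return  # a value larger than the whole cache is never cached
        self._cache[key] = value
        self._current_size += size
        # Evict least recently used entries until we fit within max_size.
        while self._max_size is not None and self._current_size > self._max_size:
            _, evicted = self._cache.popitem(last=False)
            self._current_size -= len(evicted)

    def cached_keys(self):
        """Keys currently held in the cache, least recently used first."""
        return list(self._cache)

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


# Three 4-byte "chunks" behind a cache that can hold only 8 bytes.
backing = {"a/0": b"x" * 4, "a/1": b"y" * 4, "a/2": b"z" * 4}
cached = TinyLRUStoreCache(backing, max_size=8)

cached["a/0"]  # miss, now cached
cached["a/1"]  # miss, cache is full at 8 bytes
cached["a/0"]  # hit, "a/0" becomes most recently used
cached["a/2"]  # miss, evicts "a/1" (the least recently used key)
```

The point for this PR is that repeated reads of the same chunk keys are served from memory, while `max_cache_size` bounds how many bytes are retained before the least recently used entries are dropped.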