This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Try partial_decompress=True #21

Closed
JackKelly opened this issue Jun 19, 2021 · 1 comment

@JackKelly
Member

Especially for NWPs (numerical weather predictions), where we often want only a single value per chunk.

Also try it in combination with uncompressed chunks (see the sketch below).

See zarr-developers/zarr-python#667
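
For the uncompressed-chunks idea, here is a minimal sketch (not from this issue; the shape, chunking, and store path are illustrative assumptions) of writing a Zarr array with compression disabled via compressor=None:

# Sketch: write a chunked Zarr array with compression disabled, so reads
# skip decompression entirely. Shape, chunks, and the store path are
# illustrative assumptions, not values from this repo.
import numpy as np
import zarr

store = zarr.DirectoryStore('example_uncompressed.zarr')  # hypothetical path
array = zarr.open_array(
    store, mode='w', shape=(144, 1024, 1024, 12),
    chunks=(12, 128, 128, 1), dtype='float32',
    compressor=None)  # compressor=None -> uncompressed chunks
array[:12, :128, :128, 0] = np.random.rand(12, 128, 128).astype('float32')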

JackKelly self-assigned this Jun 19, 2021
@JackKelly
Member Author

JackKelly commented Jun 21, 2021

Tried it. It doesn't seem to speed things up much (if at all), but I'll leave it in the code just in case!

Using consolidated metadata does seem to improve the read speed significantly (doubling it?), though.
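
For reference, the consolidation step itself is a one-off; a sketch, assuming `store` points at the dataset's store (zarr.consolidate_metadata and zarr.open_consolidated are the relevant calls):

import zarr

# Writes all array/group metadata into a single '.zmetadata' key:
zarr.consolidate_metadata(store)
# Opening via the consolidated view needs one metadata read instead of many:
root = zarr.open_consolidated(store, mode='r')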

Code:

# Pretty sure this sets partial_decompress to True:
import zarr

array = zarr.open_array(
    FILENAME, path='stacked_eumetsat_data', partial_decompress=True, mode='r',
    storage_options=dict(consolidated=True))

%%time
data = array[:12, :128, :128, 0]  # takes about 35 ms with or without partial_decompress.

This code might set partial_decompress=True, but I'm not certain!

import xarray as xr

ds = xr.open_dataset(
    FILENAME,
    engine='zarr',
    consolidated=True,
    storage_options=dict(partial_decompress=True),
)
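
For a slightly more controlled comparison than %%time, here is a rough sketch (not from the issue) that times the same read with partial_decompress off and on, reusing FILENAME, the array path, and the storage_options from above:

import timeit
import zarr

for partial in (False, True):
    arr = zarr.open_array(
        FILENAME, path='stacked_eumetsat_data', mode='r',
        partial_decompress=partial,
        storage_options=dict(consolidated=True))
    # Average over 20 reads of the same small slice used above.
    # Note: repeated reads may be served from the OS page cache.
    seconds = timeit.timeit(lambda: arr[:12, :128, :128, 0], number=20) / 20
    print(f'partial_decompress={partial}: {seconds * 1e3:.1f} ms per read')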

JackKelly added a commit that referenced this issue Jun 21, 2021