Add asynchronous load method #10327
These failing tests from the CI do not fail when I run them locally, which is interesting.
FAILED xarray/tests/test_backends.py::TestH5NetCDFViaDaskData::test_outer_indexing_reversed - ValueError: dimensions ('t', 'y', 'x') must have the same length as the number of data dimensions, ndim=4
FAILED xarray/tests/test_backends.py::TestNetCDF4ViaDaskData::test_outer_indexing_reversed - ValueError: dimensions ('t', 'y', 'x') must have the same length as the number of data dimensions, ndim=4
FAILED xarray/tests/test_backends.py::TestDask::test_outer_indexing_reversed - ValueError: dimensions ('t', 'y', 'x') must have the same length as the number of data dimensions, ndim=4
= 3 failed, 18235 passed, 1269 skipped, 77 xfailed, 15 xpassed, 2555 warnings in 487.15s (0:08:07) =
Error: Process completed with exit code 1.
@@ -267,13 +268,23 @@ def robust_getitem(array, key, catch=Exception, max_retries=6, initial_delay=500
             time.sleep(1e-3 * next_delay)


-class BackendArray(NdimSizeLenMixin, indexing.ExplicitlyIndexed):
+class BackendArray(ABC, NdimSizeLenMixin, indexing.ExplicitlyIndexed):
As __getitem__ is required, I feel like BackendArray should always have been an ABC.
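To illustrate the point about abstract base classes, here is a minimal sketch using only the standard library. `ToyBackendArray` and `ListBackedArray` are hypothetical stand-ins, not xarray classes. One thing worth noting: inheriting from `ABC` does not enforce anything by itself; the required method also has to be marked with `@abstractmethod`, which this sketch assumes.

```python
from abc import ABC, abstractmethod


class ToyBackendArray(ABC):
    """Hypothetical stand-in for a backend array base class."""

    @abstractmethod
    def __getitem__(self, key):
        """Subclasses must implement indexing."""


class ListBackedArray(ToyBackendArray):
    """Concrete subclass that satisfies the abstract interface."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, key):
        return self._values[key]


def try_instantiate_base():
    """Return the error message raised when instantiating the abstract base."""
    try:
        ToyBackendArray()
    except TypeError as err:
        return str(err)
    return None


base_error = try_instantiate_base()  # a TypeError message, not None
arr = ListBackedArray([10, 20, 30])  # works: __getitem__ is implemented
```

With this pattern, a backend that forgets to implement `__getitem__` fails at instantiation time rather than at the first indexing call.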
+    async def async_getitem(self, key: indexing.ExplicitIndexer) -> np.typing.ArrayLike:
+        raise NotImplementedError("Backend does not support asynchronous loading")
I've implemented this for the ZarrArray class but in theory it could be supported by other backends too.
This might not be the desired behaviour though: currently, if you opened a dataset from netCDF and called ds.load_async, you would get a NotImplementedError. Would it be better to just quietly block instead?
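The two behaviours being weighed here can be sketched side by side. This is an illustration under stated assumptions, not the PR's code: `StrictArray` and `BlockingFallbackArray` are hypothetical toy classes, and plain lists stand in for array data.

```python
import asyncio


class StrictArray:
    """Option A (hypothetical): a backend without async support fails loudly."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, key):
        return self._values[key]

    async def async_getitem(self, key):
        raise NotImplementedError("Backend does not support asynchronous loading")


class BlockingFallbackArray(StrictArray):
    """Option B (hypothetical): quietly fall back to the synchronous path."""

    async def async_getitem(self, key):
        # This call is synchronous, so it blocks the event loop while it runs.
        return self[key]


async def fetch(array, key):
    """Await a single item through the async interface."""
    return await array.async_getitem(key)
```

Option A makes the missing support visible to the caller immediately. Option B keeps `await` calls working for every backend, at the cost that the caller may not notice the load is not actually concurrent.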
Adds an .async_load() method to Variable, which works by plumbing async get_duck_array all the way down until it finally gets to the async methods zarr v3 exposes.
Needs a lot of refactoring before it could be merged, but it works.
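The plumbing idea described above can be sketched with toy classes. Everything here is hypothetical and only mirrors the shape of the design: `ToyAsyncStore` stands in for a store with async reads, `ToyLazyArray.async_get_duck_array` for the async counterpart of `get_duck_array`, and `ToyVariable.load_async` for the new method. None of these are xarray or zarr APIs.

```python
import asyncio


class ToyAsyncStore:
    """Hypothetical store exposing an async read, with simulated latency."""

    def __init__(self, data, delay=0.01):
        self._data = data
        self._delay = delay

    async def get(self, name):
        await asyncio.sleep(self._delay)  # simulate I/O latency
        return self._data[name]


class ToyLazyArray:
    """Hypothetical lazy wrapper: the middle layer that forwards the await."""

    def __init__(self, store, name):
        self._store = store
        self._name = name

    async def async_get_duck_array(self):
        return await self._store.get(self._name)


class ToyVariable:
    """Hypothetical variable whose data is replaced in place once loaded."""

    def __init__(self, data):
        self._data = data

    async def load_async(self):
        if isinstance(self._data, ToyLazyArray):
            self._data = await self._data.async_get_duck_array()
        return self

    @property
    def values(self):
        return self._data


async def load_all(variables):
    """Load every variable concurrently rather than one after another."""
    await asyncio.gather(*(v.load_async() for v in variables.values()))
    return variables


def demo():
    store = ToyAsyncStore({"a": [1, 2], "b": [3, 4]})
    variables = {n: ToyVariable(ToyLazyArray(store, n)) for n in ("a", "b")}
    asyncio.run(load_all(variables))
    return {n: v.values for n, v in variables.items()}
```

Because each layer awaits the one below it, the event loop can overlap the simulated reads for different variables, which is the benefit an async load path is meant to provide.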
whats-new.rst
api.rst
API:
Variable.load_async
DataArray.load_async
Dataset.load_async
DataTree.load_async
load_dataset?
load_dataarray?