TypeError: Filesystem needs to support async operations #2554

Closed
norlandrhagen opened this issue Dec 12, 2024 · 7 comments
Labels
bug Potential issues with the zarr-python library

Comments

@norlandrhagen

Zarr version

3.0.0b3

Numcodecs version

0.14.1

Python Version

3.12

Operating System

Mac

Installation

pip

Description

I'm having issues using zarr.open_group with a local Zarr store.

Calling zarr.open_group with a full URI:

zg = zarr.open_group('file:///air.zarr')

gives the error: TypeError: Filesystem needs to support async operations.

Calling open_group on a relative local path or an S3 path seems to work fine:

zg = zarr.open_group('air.zarr')
zg = zarr.open_group('s3://carbonplan-share/air_temp.zarr')

Comments from @d-v-b in the Zarr Zulip chat:

I'm guessing this is a bug -- RemoteStore is not accurately named, it should be called FSSpecStore, because it basically just wraps fsspec. We require that the fsspec file system wrapped by RemoteStore support async operations, but it seems like the local file-flavored file system does not. Could you open an issue with this reproducer in it?

There are two possible solutions to this, and we should do both:

1. Associate the file:// protocol with LocalStore instead of RemoteStore
2. Ensure that invoking RemoteStore('file:///path') works properly
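
Until either change lands, a caller-side workaround can be sketched, assuming (as the description reports) that plain local paths open correctly. The helper name `local_path_from_uri` is hypothetical and not part of zarr; it uses only the standard library to turn a file:// URI back into a path and leaves every other input untouched.

```python
from pathlib import Path
from typing import Union
from urllib.parse import urlparse
from urllib.request import url2pathname


def local_path_from_uri(store_like: str) -> Union[str, Path]:
    """Hypothetical helper: map file:// URIs to a Path, pass others through."""
    parsed = urlparse(store_like)
    if parsed.scheme == "file":
        # url2pathname undoes percent-encoding and handles platform separators.
        return Path(url2pathname(parsed.path))
    # Relative paths, s3:// URLs, etc. are returned unchanged.
    return store_like


# Round trip: a path converted with as_uri() comes back as the same path.
example = Path.cwd() / "air.zarr"
print(local_path_from_uri(example.as_uri()))
```

The result could then be passed to zarr.open_group in place of the raw URI. This only approximates the first suggestion from the caller's side and is not the fix adopted in zarr-python.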

Steps to reproduce

# Requires: zarr, pooch, xarray

from pathlib import Path

import xarray as xr
import zarr

# Write the tutorial dataset to a local Zarr store.
ds = xr.tutorial.open_dataset('air_temperature')
ds.to_zarr('air.zarr', mode='w')

# ex: file:///Users/../../../air.zarr
filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
zg = zarr.open_group(filepath_uri)  # raises TypeError

Error: TypeError: Filesystem needs to support async operations.

Traceback:

{
	"name": "TypeError",
	"message": "Filesystem needs to support async operations.",
	"stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[95], line 12
      9 # ex: file:///Users/../../../air.zarr
     11 filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
---> 12 zg = zarr.open_group(filepath_uri)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f\"{name}={arg}\"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/api/synchronous.py:216, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    199 @_deprecate_positional_args
    200 def open_group(
    201     store: StoreLike | None = None,
   (...)
    213     use_consolidated: bool | str | None = None,
    214 ) -> Group:
    215     return Group(
--> 216         sync(
    217             async_api.open_group(
    218                 store=store,
    219                 mode=mode,
    220                 cache_attrs=cache_attrs,
    221                 synchronizer=synchronizer,
    222                 path=path,
    223                 chunk_store=chunk_store,
    224                 storage_options=storage_options,
    225                 zarr_version=zarr_version,
    226                 zarr_format=zarr_format,
    227                 meta_array=meta_array,
    228                 attributes=attributes,
    229                 use_consolidated=use_consolidated,
    230             )
    231         )
    232     )

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/core/sync.py:100, in _runner(coro)
     95 \"\"\"
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 \"\"\"
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/api/asynchronous.py:721, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    718 if chunk_store is not None:
    719     warnings.warn(\"chunk_store is not yet implemented\", RuntimeWarning, stacklevel=2)
--> 721 store_path = await make_store_path(store, mode=mode, storage_options=storage_options, path=path)
    723 if attributes is None:
    724     attributes = {}

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/common.py:305, in make_store_path(store_like, path, mode, storage_options)
    303 if _is_fsspec_uri(store_like):
    304     used_storage_options = True
--> 305     store = RemoteStore.from_url(
    306         store_like, storage_options=storage_options, read_only=_read_only
    307     )
    308 else:
    309     store = await LocalStore.open(root=Path(store_like), read_only=_read_only)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/remote.py:176, in RemoteStore.from_url(cls, url, storage_options, read_only, allowed_exceptions)
    172 if \"://\" in path and not path.startswith(\"http\"):
    173     # `not path.startswith(\"http\")` is a special case for the http filesystem (¯\\_(ツ)_/¯)
    174     path = fs._strip_protocol(path)
--> 176 return cls(fs=fs, path=path, read_only=read_only, allowed_exceptions=allowed_exceptions)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/remote.py:90, in RemoteStore.__init__(self, fs, read_only, path, allowed_exceptions)
     87 self.allowed_exceptions = allowed_exceptions
     89 if not self.fs.async_impl:
---> 90     raise TypeError(\"Filesystem needs to support async operations.\")
     91 if not self.fs.asynchronous:
     92     warnings.warn(
     93         f\"fs ({fs}) was not created with `asynchronous=True`, this may lead to surprising behavior\",
     94         stacklevel=2,
     95     )

TypeError: Filesystem needs to support async operations."
}

Additional output

No response

@norlandrhagen norlandrhagen added the bug Potential issues with the zarr-python library label Dec 12, 2024
@jhamman
Member

jhamman commented Dec 13, 2024

I think this has been fixed upstream -- fsspec/filesystem_spec#1755

@norlandrhagen - would you mind trying with fsspec@main and reporting back?

@norlandrhagen
Author

Shall do! I'll give it a spin.

@norlandrhagen
Author

I installed the latest from fsspec:
pip install git+https://github.com/fsspec/filesystem_spec

xr.__version__
'2024.11.0'

zarr.__version__
'3.0.0b3'


fsspec.__version__
'2024.10.0.post24+gc36066c'

and getting the same error: TypeError: Filesystem needs to support async operations. 😕

TypeError                                 Traceback (most recent call last)
Cell In[8], line 3
      1 # ex: file:///Users/../../../air.zarr
      2 filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
----> 3 zg = zarr.open_group(filepath_uri)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/api/synchronous.py:216, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    199 @_deprecate_positional_args
    200 def open_group(
    201     store: StoreLike | None = None,
   (...)
    213     use_consolidated: bool | str | None = None,
    214 ) -> Group:
    215     return Group(
--> 216         sync(
    217             async_api.open_group(
    218                 store=store,
    219                 mode=mode,
    220                 cache_attrs=cache_attrs,
    221                 synchronizer=synchronizer,
    222                 path=path,
    223                 chunk_store=chunk_store,
    224                 storage_options=storage_options,
    225                 zarr_version=zarr_version,
    226                 zarr_format=zarr_format,
    227                 meta_array=meta_array,
    228                 attributes=attributes,
    229                 use_consolidated=use_consolidated,
    230             )
    231         )
    232     )

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/core/sync.py:100, in _runner(coro)
     95 """
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 """
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/api/asynchronous.py:721, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    718 if chunk_store is not None:
    719     warnings.warn("chunk_store is not yet implemented", RuntimeWarning, stacklevel=2)
--> 721 store_path = await make_store_path(store, mode=mode, storage_options=storage_options, path=path)
    723 if attributes is None:
    724     attributes = {}

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/common.py:305, in make_store_path(store_like, path, mode, storage_options)
    303 if _is_fsspec_uri(store_like):
    304     used_storage_options = True
--> 305     store = RemoteStore.from_url(
    306         store_like, storage_options=storage_options, read_only=_read_only
    307     )
    308 else:
    309     store = await LocalStore.open(root=Path(store_like), read_only=_read_only)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/remote.py:176, in RemoteStore.from_url(cls, url, storage_options, read_only, allowed_exceptions)
    172 if "://" in path and not path.startswith("http"):
    173     # `not path.startswith("http")` is a special case for the http filesystem (¯\_(ツ)_/¯)
    174     path = fs._strip_protocol(path)
--> 176 return cls(fs=fs, path=path, read_only=read_only, allowed_exceptions=allowed_exceptions)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/remote.py:90, in RemoteStore.__init__(self, fs, read_only, path, allowed_exceptions)
     87 self.allowed_exceptions = allowed_exceptions
     89 if not self.fs.async_impl:
---> 90     raise TypeError("Filesystem needs to support async operations.")
     91 if not self.fs.asynchronous:
     92     warnings.warn(
     93         f"fs ({fs}) was not created with `asynchronous=True`, this may lead to surprising behavior",
     94         stacklevel=2,
     95     )

TypeError: Filesystem needs to support async operations.

norlandrhagen added a commit to zarr-developers/VirtualiZarr that referenced this issue Dec 16, 2024
@jhamman
Member

jhamman commented Dec 18, 2024

@martindurant / @moradology - any idea why we're not getting the async wrapper here?

@moradology
Contributor

Seems familiar... I'm not 100% certain, but it looks to me like the issue for which this draft PR was cut: #2533

@martindurant
Member

Funny that this "needs to be async" happens during a sync() call :)

I think checking async_impl and auto-wrapping is the right thing to do here. Re-interpreting the URL isn't ideal unless we can guarantee that fsspec and objstore configs are identical, which for local might be OK but in general is not.
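
The check-and-wrap idea can be illustrated with a small, self-contained sketch. Everything here is a stand-in: `SyncOnlyFS`, `AsyncWrapper`, and `ensure_async` are hypothetical names, not fsspec or zarr classes, and the wrapper simply runs the blocking call in a worker thread. The only detail taken from the thread is the `async_impl` attribute that RemoteStore inspects.

```python
import asyncio


class SyncOnlyFS:
    """Stand-in for a blocking filesystem (hypothetical, not an fsspec class)."""

    async_impl = False

    def __init__(self, files):
        self._files = dict(files)

    def cat_file(self, path):
        return self._files[path]


class AsyncWrapper:
    """Hypothetical wrapper that exposes a blocking method as a coroutine."""

    async_impl = True

    def __init__(self, sync_fs):
        self.sync_fs = sync_fs

    async def _cat_file(self, path):
        # Run the blocking call in a thread so the event loop is not stalled.
        return await asyncio.to_thread(self.sync_fs.cat_file, path)


def ensure_async(fs):
    """Return fs unchanged if it is already async, otherwise wrap it."""
    if getattr(fs, "async_impl", False):
        return fs
    return AsyncWrapper(fs)


fs = ensure_async(SyncOnlyFS({"air.zarr/zarr.json": b"{}"}))
result = asyncio.run(fs._cat_file("air.zarr/zarr.json"))
print(fs.async_impl, result)
```

With a check like this in the store constructor, a synchronous filesystem would be wrapped instead of triggering the TypeError, and no URL would need to be re-interpreted.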

@norlandrhagen
Author

Closing this as this PR fixed this issue. Thanks @moradology!

norlandrhagen added a commit to zarr-developers/VirtualiZarr that referenced this issue Apr 24, 2025
* wip toward zarr v2 reader

* removed _ARRAY_DIMENSIONS and trimmed down attrs

* WIP for zarr reader

* adding in the key piece, the reader

* virtual dataset is returned! Now to deal with fill_value

* Update virtualizarr/readers/zarr.py

Co-authored-by: Tom Nicholas <[email protected]>

* replace fsspec ls with zarr.getsize

* lint

* wip test_zarr

* removed pdb

* zarr import in type checking

* moved get_chunk_paths & get_chunk_size async funcs outside of construct_chunk_key_mapping func

* added a few notes from PR review.

* removed array encoding

* v2 passing, v3 skipped for now

* added missed staged files

* missing return

* add network

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* conftest fix

* naming

* comment out integration test for now

* refactored test_dataset_from_zarr ZArray tests

* adds zarr v3 req opt

* zarr_v3 decorator

* add more tests

* wip

* adds missing await

* more tests

* wip

* wip on v3

* add note + xfail v3

* tmp run network

* revert

* update construct_virtual_array ordering

* updated ABC after merge

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* working for v2 and v3, but only local

* cleanup test_zarr reader test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleanup after zarr-python issue report

* temp disabled validate_and_normalize_path_to_uri due to issue in zarr-python v3: zarr-developers/zarr-python#2554

* marked zarr integration test skipped b/c of zarr-v3 and kerchunk incompatability

* fixes some async behavior, reading from s3 seems to work

* lint + uri_fmt

* adds to releases.rst

* nit

* cleanup, comments and nits

* progress on mypy

* make mypy happy

* adds option for AsyncArray to _is_zarr_array

* big async rewrite

* fixes merge conflict

* bit of restructure

* nit

* WIP on ChunkManifest.from_arrays

* v2/v3 c chunk fix + build ChunkManifest from numpy arrays

* removed method of creating ChunkManifests from dicts

* cleanup

* adds xfails to TestOpenVirtualDatasetZarr due to local filesystem zarr issue

* some nits after merging w/ main

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates zarr v3 req

* lint

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove build_chunk_manifest_from_dict_mapping function since manifest are build from np.ndarrays

* tmp ignore lint

* remove zarr fill_value skip

* fixes network req import in test_integration

* bump xarray to 2025.1.1 and icechunk to 0.1.0a10 in upstream

* move zarr import into type checking

* move zarr import in test_zarr

* adding back in missing nbytes property

* typing

* tmp testing & removing old xfail

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adds back in validate_and_normalize_path_to_uri after upstream zarr fix & vendors concurrent map from zarr-python

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing kerchunk from zarr integration test

* removed zarr manifest + lint

* wip on testing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* revert min-deps change

* merge

* revert environment.yaml

* removed zarr manifest writing

* cleanup and consolidation in zarr reader

* typing

* test_unsupported_zarr_python to zarr v3

* rel path issue?

* revert accidental icechunk commit

* wip on fixing codecs

* cleaup of tests + codecs

* renived test_zarr writer

* bumping icechunk for now

* typing lint

* remove zarr writer test

* adds Zarr V2 reader not supported exception

* updates usage and releases and lints upstream.yaml

* lint + clarified some todo/comments

* quick nit, removed duplicated entry in ci

* removed some comments and reverted pyproject

* pyproj de-dup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* util fpaht

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding test to check zarr key format in manifest

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched Manifest creation back to dict

* cleaned up zarr reader ArrayV3Metadata reading

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* vendor cleanup

* merge w/ develop and update construct_virtual_dataset

* added _zstd_codec check in get_codec_config to fix numcodecs complaint

* mypy lint

* mypy lint 2

* lint

* typing

* adds check for filepath

* spelling nit + revert hdf int

* removed virtualizarr.zarr + cleanup nits

* cleanup + note

* updates docs/faq.md data table

* revert leading slash

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix bad merge commit

* Use ManifestStore in Zarr reader (#554)

* Use ManifestStore in Zarr reader

* Update virtualizarr/readers/zarr.py

Co-authored-by: Raphael Hagen <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Raphael Hagen <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* filepath slash nit

* Update docs/faq.md

Co-authored-by: Tom Nicholas <[email protected]>

* Update virtualizarr/readers/zarr.py

Co-authored-by: Tom Nicholas <[email protected]>

* Update virtualizarr/readers/zarr.py

Co-authored-by: Tom Nicholas <[email protected]>

* Update virtualizarr/readers/zarr.py

Co-authored-by: Tom Nicholas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adds back in todo

* adds wip test for scalar chunk testing

* adds test for scalar zarr + modifies get_chunk_mapping_prefix to accomdate

* update localstore to memorystore

---------

Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Max Jones <[email protected]>
Co-authored-by: Tom Nicholas <[email protected]>
4 participants