Description
Zarr version
3.1.2
Numcodecs version
0.16.1
Python Version
3.11
Operating System
Linux
Installation
pixi / conda-forge
Description
Attempting to load legacy OME-Zarr datasets (e.g., https://s3.embl.de/i2k-2020/platy-raw.ome.zarr) with zarr-python v3.x using zarr.open_array results in a TypeError if the fill_value is encoded as 0.0 (float) while the dtype is uint8. This previously worked in zarr-python v2.x but fails in the latest versions, breaking compatibility with many published datasets.
Error message
TypeError: Invalid type: 0.0. Expected an integer.
Steps to reproduce
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "requests",
# "aiohttp",
# "fsspec",
# "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues
import zarr
url = "https://s3.embl.de/i2k-2020/platy-raw.ome.zarr"
component = "s0" # Try s0, but other resolutions likely affected
# Attempt to open remote array
z = zarr.open_array(store=url, path=component)
print(z)
zarr.print_debug_info()
Additional output
In the metadata for affected arrays (e.g., https://s3.embl.de/i2k-2020/platy-raw.ome.zarr/s0/.zarray), the "fill_value" field is a float (0.0), but the "dtype" is "uint8". This is arguably non-standard, but many OME-Zarr arrays in the wild use this encoding.
Older zarr-python releases (e.g., v2.x) accepted this value. The latest (v3.x) fails, making many public datasets unreadable without editing their metadata.
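As a stopgap for locally copied datasets, the mismatch described above can be detected and repaired in the metadata itself. The following is a minimal sketch using only the standard library, not part of zarr-python. The helper names `is_integer_typestr` and `patch_zarray` are hypothetical, and the example metadata is an illustrative stand-in rather than the actual contents of the remote `.zarray` file. It assumes Zarr v2 simple dtype strings such as `"|u1"` or `"<i4"`.

```python
import json


def is_integer_typestr(typestr):
    """Return True for simple Zarr v2 integer dtype strings like "|u1" or "<i4".

    Assumption: the dtype is a plain typestring (byte-order char, kind char,
    byte count), not a structured dtype.
    """
    return isinstance(typestr, str) and len(typestr) >= 3 and typestr[1] in "iu"


def patch_zarray(text):
    """Parse .zarray JSON and convert an integral float fill_value to int.

    Only lossless cases are touched: 0.0 becomes 0, while 0.5, NaN and
    non-integer dtypes are left unchanged.
    """
    meta = json.loads(text)
    fill_value = meta.get("fill_value")
    if (
        is_integer_typestr(meta.get("dtype"))
        and isinstance(fill_value, float)
        and fill_value.is_integer()
    ):
        meta["fill_value"] = int(fill_value)
    return meta


# Illustrative metadata only; the real file has more fields.
example = '{"zarr_format": 2, "dtype": "|u1", "fill_value": 0.0}'
patched = patch_zarray(example)
print(json.dumps(patched))
```

This only helps when the metadata can be rewritten, which is not the case for read-only public buckets such as the one in this report.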
Questions and Suggestions
- Should zarr-python accept 0.0 as a compatible fill_value for integer arrays and transparently cast it?
- Would it be possible to warn (not error) and interpret common cases (e.g., float-encoded zero for integer types)?
- If strict typing is desired, could an opt-in backward compatibility flag be provided?
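To make the suggestions above concrete, here is a rough sketch of the behavior I have in mind. It is not zarr-python code: the function `coerce_int_fill_value` and its `strict` parameter are hypothetical names, and the real implementation would presumably live wherever integer fill values are validated today.

```python
import warnings


def coerce_int_fill_value(value, strict=False):
    """Hypothetical lenient parser for the fill_value of an integer array.

    - ints pass through unchanged
    - floats with an exact integer value (e.g. 0.0) are cast with a warning
    - with strict=True, or for anything else (0.5, NaN, strings), raise TypeError
    """
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer() and not strict:
        warnings.warn(
            f"fill_value {value!r} is a float but the dtype is an integer type; "
            f"interpreting it as {int(value)}.",
            UserWarning,
            stacklevel=2,
        )
        return int(value)
    raise TypeError(f"Invalid type: {value!r}. Expected an integer.")


# Lenient default: float-encoded zero is accepted with a warning.
print(coerce_int_fill_value(0.0))

# Strict mode keeps the current behavior.
try:
    coerce_int_fill_value(0.0, strict=True)
except TypeError as exc:
    print(exc)
```

Restricting the cast to floats where `is_integer()` is true keeps it lossless, so values like 0.5 or NaN would still be rejected for integer dtypes.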