Header of zarr3 array with bytes codec without configuration cannot be parsed  #8233

@amotta

I used TensorStore and its zarr3 driver to create and write a sharded, three-dimensional array of unsigned 8-bit integers. This produced the following zarr.json metadata file (shown as the output of cat zarr.json | python -m json.tool):

{
    "chunk_grid": {
        "configuration": {
            "chunk_shape": [
                1024,
                1024,
                1024
            ]
        },
        "name": "regular"
    },
    "chunk_key_encoding": {
        "name": "default"
    },
    "codecs": [
        {
            "configuration": {
                "chunk_shape": [
                    32,
                    32,
                    32
                ],
                "codecs": [
                    {
                        "name": "bytes"
                    }
                ],
                "index_codecs": [
                    {
                        "configuration": {
                            "endian": "little"
                        },
                        "name": "bytes"
                    },
                    {
                        "name": "crc32c"
                    }
                ]
            },
            "name": "sharding_indexed"
        }
    ],
    "data_type": "uint8",
    "fill_value": 0,
    "node_type": "array",
    "shape": [
        5400,
        2000,
        4000
    ],
    "zarr_format": 3
}

Note that the inner "codecs" list contains a "bytes" codec without a "configuration" object. webKnossos does not currently handle this: the dataset with the problematic layer can be imported, but no data can be loaded and errors are reported in the console.

According to the Zarr v3 specification, the "endian" configuration value is optional for byte-sized values. This is already handled by webKnossos. However, it's unclear to me whether this means that the entire "configuration" may be omitted. The specification does seem to imply that the "configuration" is optional:

The codec object may also contain a configuration object which consists of the parameter names and values as defined by the corresponding codec specification.
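
To illustrate the lenient reading of the specification, here is a minimal Python sketch of how a reader could treat a missing "configuration" the same as an empty one. It is not the webKnossos parser; the name resolve_endian and the set SINGLE_BYTE_TYPES are hypothetical, and rejecting multi-byte data types that lack "endian" is an assumption of this sketch.

```python
from typing import Optional

# Assumption: these are the data types that occupy a single byte,
# so byte order does not matter for them.
SINGLE_BYTE_TYPES = {"bool", "int8", "uint8"}


def resolve_endian(codec: dict, data_type: str) -> Optional[str]:
    """Return the byte order for a "bytes" codec, or None if it is irrelevant.

    A missing "configuration" object is treated like an empty one.
    """
    if codec.get("name") != "bytes":
        raise ValueError(f"expected a bytes codec, got {codec.get('name')!r}")

    # Tolerate both an absent and a null "configuration".
    config = codec.get("configuration") or {}
    endian = config.get("endian")

    if endian is None:
        if data_type in SINGLE_BYTE_TYPES:
            return None
        # Assumption of this sketch: multi-byte types must state their byte order.
        raise ValueError(f"'endian' is required for data type {data_type!r}")

    if endian not in ("little", "big"):
        raise ValueError(f"invalid endian value: {endian!r}")
    return endian
```

With this reading, the inner codec from the metadata above, {"name": "bytes"} with data type "uint8", resolves without an error, while the same codec paired with a multi-byte data type is still rejected.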

As a workaround, I have added an empty "configuration" to the problematic bytes codec. This way, reading from the Zarr3 layer works.
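
For anyone who wants to apply the same workaround to several arrays, the manual edit can be scripted. The following is a rough sketch only: the function names patch_bytes_codecs and patch_metadata_file are hypothetical, and it assumes that nested codec lists appear only under the "codecs" and "index_codecs" keys of a codec's configuration, as in the metadata shown above.

```python
import json


def patch_bytes_codecs(codecs: list) -> int:
    """Add an empty "configuration" to every "bytes" codec that lacks one.

    Modifies the codec objects in place, descends into nested codec lists
    (e.g. inside "sharding_indexed"), and returns the number of patched codecs.
    """
    patched = 0
    for codec in codecs:
        if codec.get("name") == "bytes" and "configuration" not in codec:
            codec["configuration"] = {}
            patched += 1
        # Other codecs without a configuration (e.g. "crc32c") are left as-is.
        config = codec.get("configuration", {})
        for key in ("codecs", "index_codecs"):
            nested = config.get(key)
            if isinstance(nested, list):
                patched += patch_bytes_codecs(nested)
    return patched


def patch_metadata_file(path: str) -> int:
    """Patch a zarr.json file in place; rewrite it only if something changed."""
    with open(path) as f:
        metadata = json.load(f)
    patched = patch_bytes_codecs(metadata.get("codecs", []))
    if patched:
        with open(path, "w") as f:
            json.dump(metadata, f, indent=4)
    return patched
```

Calling patch_metadata_file("zarr.json") on the file above would patch exactly one codec, the inner "bytes" codec, and leave the rest of the metadata unchanged. Back up the file first, since it is rewritten in place.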
