Skip to content

Conversation

MichaelBuessemeyer
Copy link
Contributor

@MichaelBuessemeyer MichaelBuessemeyer commented Aug 14, 2024

Revision of Zarr 3 streaming.
The error was caused due to expecting a prefixed "c." when addressing an array. I now changed the chunk key encoding of zarr 3 to "v2" to omit this prepended "c." just as it was done in zarr 2 streaming. Now the coordinate parsing code is slightly easier and works with v2 and v3.

Please just review the following commits. They are the only changes compared to the first version of this pr:

URL of deployed dev instance (used for testing):

Steps to test:

TODOs:

  • Deploy dev instance and create zarr2 & zarr3 streaming test datasets for view only and annotations

Issues:


(Please delete unneeded items, merge only when none are left open)

  • Updated changelog
  • Needs datastore update after deployment

MichaelBuessemeyer and others added 30 commits July 22, 2024 19:00
- Added mag/zarr.json & mag/coords route for datasets (not annotations)
- wk seems to send c.0.x.y.z requests and the c at the start does not match with the regex resulting in invalid coordinate parsing
Comment on lines -4 to -13
def parseDotCoordinates(
cxyz: String,
): Option[(Int, Int, Int, Int)] = {
val singleRx = "\\s*([0-9]+).([0-9]+).([0-9]+).([0-9]+)\\s*".r

cxyz match {
case singleRx(c, x, y, z) =>
Some(Integer.parseInt(c), Integer.parseInt(x), Integer.parseInt(y), Integer.parseInt(z))
case _ => None
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed as it was fully replaced by parseNDimensionalDotCoordinates. This was already the case in the first version of this pr

@MichaelBuessemeyer
Copy link
Contributor Author

@valentin-pinkau You can test the streaming here to check whether zarr v2 streaming works again in this version of the pr adding zarr 3 streaming :)

@valentin-pinkau
Copy link
Member

I tested it with the dataset that did not work before and now it works.

@MichaelBuessemeyer MichaelBuessemeyer merged commit 9ad330e into master Aug 19, 2024
@MichaelBuessemeyer MichaelBuessemeyer deleted the add-zarr3-streaming-v0 branch August 19, 2024 08:02
knollengewaechs pushed a commit that referenced this pull request Sep 2, 2024
* WIP add zarr3 streaming
- Added mag/zarr.json & mag/coords route for datasets (not annotations)

* manage minimal zarr3 support for viewing a dataset and viewing an annotation

* add basic zarr 3 ngff v2 group header route

* add zarr 3 ngff v2 group header route for annotations

* merge Zarr3StreamController into ZarrStreamingController

* ensure full match when parsing zarr coordinates to avoid parsing any non-numerical characters

* make NgffMetadataV2 nd compatible

* refactor code

* remove Zarr3StreamingController.scala

* minor code fixes; mostly remove unused imports

* remove fixed full length match of zarr coordinate regex parsing;
- wk seems to send c.0.x.y.z requests and the c at the start does not match with the regex resulting in invalid coordinate parsing

* add c. as a prefix to the coordinate parsing regex as defined by the zarr spec

* exclude leading c from the zarr3 coordinate parsing

* apply pr feedback

* remove unused imports and format code

* add datasource-properties.json and dir listing routes to zarr 3 streaming

* format code

* add changelog entry

* fix Zarr3GroupHeader json serialization format

* fix backend format

* Update webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/zarr/NgffMetadataV0_5.scala

* add comments about why manual json serializer is needed

* use chunk key encoding version 2 as zarr 2 stream also has no prefix of "c." when addressing an array

* do not include a prefixed "c." when parsing coordinates -> works with zarr 2 and 3 streaming

* remove outdated comment

* update error message (remove hint to include a prepending "c."

---------

Co-authored-by: Norman Rzepka <[email protected]>
fm3 added a commit that referenced this pull request Jul 29, 2025
Fixes a regression introduced in #7995 – the volume buckets are blosc
compressed, but that was no longer mentioned in the zarr header.

Also, the chunks had the `c.` name prefix, but `chunk_key_encoding` was
already `v2` in the zarr header. Now I actually remove the `c.` name
prefix so that it is actually `v2` compliant.

Also, fixes passing the voxel size to the volume zip download so that it
is correct in the included datasource-properties.json

### Steps to test:
- Create volume annotation, download it in zarr format
- Unpack the zip and the contained data_Volume.zip
- place the result in the binaryData folder, should be readable as
standalone dataset
- reupload the downloaded annotation, should still be uploadable as an
annotation
- Also, upload volume annotation downloaded on the *master* code, should
also still be uploadable as annotation

### Issues:
- fixes #7961 

------
- [x] Added changelog entry (create a `$PR_NUMBER.md` file in
`unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [x] Considered [common edge
cases](../blob/master/.github/common_edge_cases.md)

---------

Co-authored-by: MichaelBuessemeyer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add zarr3 streaming

3 participants