split up metadata #7

Open
@d70-t

Currently, the entire metadata tree is stored in a single output object. Because of the 2MB size limit, this will quickly cause trouble for larger datasets. A natural approach would be to split the metadata object into multiple objects along the hierarchy inherent to zarr (e.g. one DAG-CBOR block per variable and one per dataset, instead of a single object per dataset).
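To make the idea concrete, here is a rough sketch of what a per-variable split could look like. It is not the project's actual code: canonical JSON plus SHA-256 stands in for DAG-CBOR plus CIDs, a plain dict stands in for the block store, and the `dataset` layout (with a `"variables"` key) as well as the names `put_block` and `split_dataset` are made up for illustration.

```python
import hashlib
import json

MAX_BLOCK_SIZE = 2 * 1024 * 1024  # the 2MB limit mentioned above


def put_block(store, obj):
    """Serialize obj, store it under its hash, and return a link to it.

    Assumption: JSON + SHA-256 is only a stand-in for DAG-CBOR + CIDs.
    """
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    if len(data) > MAX_BLOCK_SIZE:
        raise ValueError(f"block of {len(data)} bytes exceeds the size limit")
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return {"/": key}  # IPLD-style link object


def split_dataset(store, dataset):
    """Write one block per variable and one root block that links to them."""
    root = {".zgroup": dataset[".zgroup"], ".zattrs": dataset.get(".zattrs", {})}
    for name, variable in dataset["variables"].items():
        # Each variable (its .zarray, attributes and chunk links) gets its own block.
        root[name] = put_block(store, variable)
    return put_block(store, root)


# Hypothetical example dataset with two small variables.
store = {}
dataset = {
    ".zgroup": {"zarr_format": 2},
    ".zattrs": {"title": "example"},
    "variables": {
        "temperature": {".zarray": {"shape": [4, 4], "chunks": [2, 2]}, "0.0": "chunk-link-a"},
        "pressure": {".zarray": {"shape": [4], "chunks": [2]}, "0": "chunk-link-b"},
    },
}
root_link = split_dataset(store, dataset)
print(len(store), "blocks written")
```

With this layout the size limit applies to each variable separately, and the root block only grows with the number of variables rather than with the number of chunks.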

We might still need to resort to a HAMT if the dictionaries on a single hierarchical level become too large, but that point is probably still far off. We might also want to introduce further hierarchical levels within the chunk keys of a single zarr variable, as proposed here and demonstrated here. This would reduce the number of items per dictionary while aligning IPLD objects with (to-be-introduced) zarr shards, which in turn may lead to better locality within block requests.
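As a minimal sketch of the extra hierarchy levels, the snippet below turns a flat chunk-key dictionary into a nested one by splitting on the leading chunk indices. The function name `nest_chunk_keys`, the `levels` parameter, and the `"."` separator are assumptions for illustration. Grouping by leading index is only one possible scheme and is not necessarily aligned with zarr shards.

```python
def nest_chunk_keys(chunks, levels=1, sep="."):
    """Group flat chunk keys such as "1.2.3" into nested dictionaries.

    The first `levels` index components become dictionary levels;
    the remainder of the key stays flat at the innermost level.
    """
    nested = {}
    for key, link in chunks.items():
        parts = key.split(sep, levels)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = link
    return nested


flat = {"0.0": "a", "0.1": "b", "1.0": "c", "1.1": "d"}
nested = nest_chunk_keys(flat, levels=1)
print(nested)
# {'0': {'0': 'a', '1': 'b'}, '1': {'0': 'c', '1': 'd'}}
```

Each inner dictionary could then be stored as its own block, so that no single dictionary has to hold every chunk of a variable.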
