GeoZarr as a Serverless Alternative to GeoDataCube #18

Closed
christophenoel opened this issue Apr 12, 2023 · 1 comment

@christophenoel

christophenoel commented Apr 12, 2023

Dear GeoZarr enthusiasts,

Beyond the required conventions for encoding data, I would like to delve deeper into the rationale behind developing GeoZarr as a serverless alternative to GeoDataCube and geospatial services (XYZ tiling, OGC WMS, OGC API Coverages, etc.).

Indeed, our decision (in the HDSA project) to create a new GeoZarr convention (instead of using NCZarr or alternate format) was motivated by our desire to design a cloud-native (i.e. serverless) format that provides the usual capabilities of GeoDataCube, tiling services, and OGC APIs, just as STAC serves as a serverless alternative to traditional data catalogs.

The development of GeoZarr aimed to address several challenges that are not met by conventional data formats, including:

  • Holding multiple projections of the data: for geospatial access services (e.g. OGC API Coverages), it is crucial to store data in various projections, similar to how data is cached in a web map service. GeoZarr is designed to accommodate different projections, enabling access to and processing of geospatial data without the need for server-side reprojection.
  • Holding multiple scales of the data: GeoZarr allows for storage of data at various resolutions, which is essential for tiling services, as it enables clients to request data at their desired scale without relying on server-side resampling. This capability improves the efficiency of data retrieval and reduces the latency in accessing data at different scales.
  • Holding multiple optimizations: GeoZarr supports multiple optimizations tailored to specific use cases or access patterns, such as temporal re-chunking of the data. This feature enables users to store data in a layout optimized for their particular needs, reducing the time to generate time series or hyperspectral overviews.
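
To make the re-chunking point concrete, here is a minimal sketch that counts how many chunks a request touches under two chunk layouts of the same cube. The cube dimensions, chunk shapes, and the `chunks_touched` helper are illustrative assumptions, not part of any GeoZarr specification or library.

```python
def chunks_touched(request, chunks):
    """Count the chunks intersected by a request.

    request: per-dimension (start, stop) index ranges, stop exclusive.
    chunks:  per-dimension chunk lengths.
    """
    count = 1
    for (start, stop), size in zip(request, chunks):
        first = start // size        # index of the first chunk hit
        last = (stop - 1) // size    # index of the last chunk hit
        count *= last - first + 1
    return count


# Hypothetical cube with dimensions (time, y, x) = (365, 4096, 4096).
# Two hypothetical chunk layouts of the same data:
map_chunks = (1, 512, 512)      # one time step per chunk, suited to maps
series_chunks = (365, 32, 32)   # full time axis per chunk, suited to time series

# Request A: the full time series at a single pixel.
pixel_series = ((0, 365), (1000, 1001), (2000, 2001))
# Request B: one 512 x 512 tile at a single time step.
single_tile = ((10, 11), (512, 1024), (1024, 1536))

series_from_map = chunks_touched(pixel_series, map_chunks)
series_from_series = chunks_touched(pixel_series, series_chunks)
tile_from_map = chunks_touched(single_tile, map_chunks)
tile_from_series = chunks_touched(single_tile, series_chunks)

print("time series:", series_from_map, "vs", series_from_series, "chunks")
print("single tile:", tile_from_map, "vs", tile_from_series, "chunks")
```

With these assumed numbers, the pixel time series needs 365 chunk reads from the map-oriented layout but only 1 from the time-oriented layout, while the single tile needs 1 and 256 respectively. Neither layout serves both access patterns well, which is the motivation for holding more than one.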

Note that holding multiple scales of the data is a requirement for supporting Web browser visualisation equivalent to COG capabilities. Furthermore, additional recommendations might be helpful to provide guidelines, for example on how to encode multiple band values in a single array, or how to compose time series when the

The question is how such aspects might be addressed:

  • In an Implementation Standard which would include both sections providing conventions and other sections describing the recommendations or profiles.
  • In a set of best practices documents addressing a particular "profile"
  • In a separate process from the core data & metadata encoding specification for Zarr
@dblodgett-usgs

@christophenoel -- maybe we should open a new issue to explore this, but you state:

"instead of using NCZarr or alternate format"

What was it about NCZarr that makes it incompatible with the goal of creating a serverless alternative to GeoDataCube? Is it possible that GeoZarr could extend the NCZarr convention or are there things about it that make that impossible?
