
Conversation

frcroth
Contributor

@frcroth frcroth commented May 12, 2025

URL of deployed dev instance (used for testing):

  • https://___.webknossos.xyz

This PR aims to have all properties that may occur in a dataset's datasource-properties.json file mirrored in the DB. The following properties were previously missing from the DB:

  • layer/numChannels
  • layer/dataFormat
  • wkwResolution/cubeLength -> stored in mag table
  • mag/axisOrder
  • mag/channelIndex
  • mag/credentialId (mags also support legacy credentials, which I did not move to the DB here; we already have credentials in the DB that are referenced via credentialId, so it would not make sense to create another credential type)

AxisOrder is serialized as a string of the form "x:4,y:3,z:2"; it would also be possible to store this in a new table and avoid the string serialization.
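
For illustration, here is a minimal sketch of that round-trip, assuming a simplified class with only x, y, and z indices (the actual AxisOrder additionally handles optional axes):

case class SimpleAxisOrder(x: Int, y: Int, z: Int) {
  // Serialize to the "x:4,y:3,z:2" form used for the DB column
  override def toString: String = s"x:$x,y:$y,z:$z"
}

object SimpleAxisOrder {
  // Parse "x:4,y:3,z:2" back into an axis order
  def fromString(serialized: String): SimpleAxisOrder = {
    val indices = serialized.split(',').toList.map { entry =>
      val Array(axis, index) = entry.split(':')
      axis -> index.toInt
    }.toMap
    SimpleAxisOrder(indices("x"), indices("y"), indices("z"))
  }
}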

Steps to test:

  • Check out these changes with existing datasets -> no errors, and the new properties show up in the DB


Contributor

coderabbitai bot commented May 12, 2025

📝 Walkthrough


This update introduces new metadata fields to dataset layers and magnifications, reflected in both the database schema and the application logic. The migration scripts add and remove these fields as needed. The Scala models and DAOs are extended to support and persist the new properties. A duplicate route is removed from the routing configuration, and migration documentation is updated accordingly.

Changes

  • conf/evolutions/133-datasource-properties-in-db.sql, conf/evolutions/reversions/133-datasource-properties-in-db.sql, tools/postgres/schema.sql: Database schema updated to version 133: new enum type DATASET_LAYER_DATAFORMAT added; new columns added to dataset_layers (numChannels, dataFormat) and dataset_mags (credentialId, axisOrder with check constraint, channelIndex, cubeLength). Migration and reversion scripts handle these schema changes and schema versioning.
  • app/models/dataset/Dataset.scala: DAO methods updated to handle new metadata fields for dataset magnifications and layers. Insert/update queries now support dataFormat, numChannels, and additional mag attributes (credentialId, axisOrder, channelIndex, cubeLength). Logic extended to conditionally insert richer metadata depending on presence of magsOpt or wkwResolutionsOpt.
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala: Data layer traits and case classes extended with optional fields: mags, dataFormat, numChannels, and wkwResolutions. New accessor methods added to DataLayerLike and DataLayerWithMagLocators traits. Companion objects updated to populate new fields from input layers.
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala: Removed trailing blank line inside AxisOrder case class; no functional changes.
  • conf/webknossos.latest.routes: Duplicate POST route for /maintenances removed from routing configuration.
  • MIGRATIONS.unreleased.md: Migration documentation updated to reference new Postgres evolution script for datasource properties in the database.
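
For orientation, a rough sketch of what the forward evolution does, pieced together from the summary above and the schema lines quoted later in the thread (column types and the version-bookkeeping statement are assumptions; conf/evolutions/133-datasource-properties-in-db.sql is authoritative):

START TRANSACTION;

-- New enum for the layer data format
CREATE TYPE webknossos.DATASET_LAYER_DATAFORMAT AS ENUM ('wkw','zarr','zarr3','n5','neuroglancerPrecomputed','tracing');

-- Layer-level metadata
ALTER TABLE webknossos.dataset_layers
  ADD COLUMN numChannels INTEGER,
  ADD COLUMN dataFormat webknossos.DATASET_LAYER_DATAFORMAT;

-- Mag-level metadata, including the axisOrder format check
ALTER TABLE webknossos.dataset_mags
  ADD COLUMN credentialId TEXT,  -- actual type/reference may differ
  ADD COLUMN axisOrder TEXT CONSTRAINT axisOrder_format CHECK (axisOrder ~ '^[xyzc]:[0-9]+(,[xyzc]:[0-9]+)+$'),
  ADD COLUMN channelIndex INTEGER,
  ADD COLUMN cubeLength INTEGER;

-- Assumed bookkeeping: bump the schema version to 133
UPDATE webknossos.releaseInformation SET schemaVersion = 133;

COMMIT TRANSACTION;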

Suggested labels

refactoring

Suggested reviewers

  • fm3

Poem

🥕
New fields sprout in tables, like carrots in spring,
Axis orders now sing with a stringy new ring.
Mags and formats, channels galore—
Our data’s more detailed than ever before!
With routes trimmed neat and docs up to date,
This bunny hops on—database looking great!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef841ea and 636aa8a.

📒 Files selected for processing (5)
  • app/models/dataset/Dataset.scala (2 hunks)
  • conf/evolutions/133-datasource-properties-in-db.sql (1 hunks)
  • conf/webknossos.latest.routes (0 hunks)
  • tools/postgres/schema.sql (4 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala (0 hunks)
💤 Files with no reviewable changes (2)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala
  • conf/webknossos.latest.routes
🚧 Files skipped from review as they are similar to previous changes (3)
  • tools/postgres/schema.sql
  • conf/evolutions/133-datasource-properties-in-db.sql
  • app/models/dataset/Dataset.scala
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: backend-tests
  • GitHub Check: build-smoketest-push


@frcroth frcroth force-pushed the dataset-properties-in-db branch from 3344f87 to a86896b on May 12, 2025 12:26
@frcroth frcroth force-pushed the dataset-properties-in-db branch from a86896b to d6231e4 on May 12, 2025 12:39
@frcroth frcroth marked this pull request as ready for review May 12, 2025 12:49
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
app/models/dataset/Dataset.scala (1)

936-950: Keep INSERT / UPDATE column ordering consistent

The two branches diverge in column order (dataFormat, numChannels), which invites copy-paste errors and makes diffs harder to read.
Align both statements (or extract a helper) so future changes touch only one place.
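
One library-agnostic way to express the "single place" idea, as a hedged sketch (names are illustrative; this is not the DAO's actual query-building code):

// Illustrative only: keep the shared column order in one value so the
// INSERT and UPDATE statements cannot silently drift apart.
object LayerMetadataColumns {
  val ordered: Seq[String] = Seq("dataFormat", "numChannels")

  // "dataFormat, numChannels" – for the INSERT column list
  def columnList: String = ordered.mkString(", ")

  // "dataFormat = ?, numChannels = ?" – for the UPDATE SET clause
  def setClause: String = ordered.map(column => s"$column = ?").mkString(", ")
}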

webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (2)

453-464: getMags duplicates logic already expressed in magsOpt

getMags hard-codes the same exhaustive match list and throws at runtime. Prefer:

def getMags: List[MagLocator] =
  magsOpt.getOrElse(
    throw new IllegalStateException(s"Layer $name does not expose mags")
  )

This keeps the enumeration in a single location.


482-487: Case-class field proliferation risks hitting the 22-field limit

AbstractDataLayer now sits at 13 fields, AbstractSegmentationLayer at 15.
Both are still below the 22-field product limit, but planned future extensions (e.g. compression, tiling strategy, provenance) could eventually push them past it and break compilation.

Start thinking about grouping related settings into small value objects (e.g. StorageInfo, DisplayInfo) instead of adding more primitive fields.
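
A minimal sketch of that grouping, assuming the existing DataFormat, MagLocator, and WKWResolution types (the StorageInfo name and field selection are hypothetical, not part of this PR):

// Hypothetical value object bundling storage-related layer settings
case class StorageInfo(
    dataFormat: Option[DataFormat.Value] = None,
    numChannels: Option[Int] = None,
    mags: Option[List[MagLocator]] = None,
    wkwResolutions: Option[List[WKWResolution]] = None
)

// AbstractDataLayer could then carry a single `storage: StorageInfo` field
// instead of four separate optional fields, keeping the case class well
// below the 22-field product limit.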

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd43621 and 28144e2.

⛔ Files ignored due to path filters (2)
  • test/db/dataSet_layers.csv is excluded by !**/*.csv
  • test/db/dataSet_mags.csv is excluded by !**/*.csv
📒 Files selected for processing (8)
  • MIGRATIONS.unreleased.md (1 hunks)
  • app/models/dataset/Dataset.scala (2 hunks)
  • conf/evolutions/133-datasource-properties-in-db.sql (1 hunks)
  • conf/evolutions/reversions/133-datasource-properties-in-db.sql (1 hunks)
  • conf/webknossos.latest.routes (0 hunks)
  • tools/postgres/schema.sql (3 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala (2 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (7 hunks)
💤 Files with no reviewable changes (1)
  • conf/webknossos.latest.routes
🧰 Additional context used
🧬 Code Graph Analysis (1)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (6)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/WKWDataLayers.scala (3)
  • WKWResolution (12-12)
  • WKWResolution (14-16)
  • mags (29-29)
webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/tracings/editablemapping/EditableMappingLayer.scala (2)
  • dataFormat (85-85)
  • additionalAxes (105-105)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/N5DataLayers.scala (1)
  • numChannels (25-25)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/PrecomputedDataLayers.scala (1)
  • numChannels (25-25)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/Zarr3DataLayers.scala (1)
  • numChannels (25-25)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/ZarrDataLayers.scala (1)
  • numChannels (23-23)
🔇 Additional comments (7)
MIGRATIONS.unreleased.md (1)

15-15: LGTM: Added new migration script to the unreleased list.

The newly added entry for the datasource properties migration script follows the proper format and is correctly placed in the list.

tools/postgres/schema.sql (3)

24-24: Correct schema version increment.

Schema version is properly incremented to 133 to match the new migration script.


139-140: LGTM: Added new columns to dataset_layers table.

The added columns match the PR objective, adding numChannels and dataFormat to the dataset layers table.


177-180: LGTM: Added new columns to dataset_mags table.

The added columns match the PR objective, adding axisOrder, channelIndex, cubeLength, and credentialId to store additional metadata in the database.

conf/evolutions/reversions/133-datasource-properties-in-db.sql (1)

1-17: LGTM: Proper rollback script for the migration.

The rollback script correctly:

  1. Verifies the current schema version
  2. Drops the newly added columns from both tables
  3. Downgrades the schema version
  4. Wraps operations in a transaction

Using DROP COLUMN IF EXISTS is good practice for maintaining idempotence.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala (2)

24-37: Well-implemented toString method for serialization.

The toString implementation appropriately handles optional fields and creates a clean, consistent string representation of the axis order.


63-70: LGTM: Added proper deserialization logic.

The fromString method correctly parses the string representation back into an AxisOrder instance, handling optional fields appropriately.

Comment on lines +231 to +275
// Datasets that are not in the WKW format use mags
def magsOpt: Option[List[MagLocator]] = this match {
  case layer: AbstractDataLayer => layer.mags
  case layer: AbstractSegmentationLayer => layer.mags
  case layer: DataLayerWithMagLocators => Some(layer.getMags)
  case _ => None
}

def dataFormatOpt: Option[DataFormat.Value] = this match {
  case layer: WKWDataLayer => Some(layer.dataFormat)
  case layer: WKWSegmentationLayer => Some(layer.dataFormat)
  case layer: ZarrDataLayer => Some(layer.dataFormat)
  case layer: ZarrSegmentationLayer => Some(layer.dataFormat)
  case layer: N5DataLayer => Some(layer.dataFormat)
  case layer: N5SegmentationLayer => Some(layer.dataFormat)
  case layer: PrecomputedDataLayer => Some(layer.dataFormat)
  case layer: PrecomputedSegmentationLayer => Some(layer.dataFormat)
  case layer: Zarr3DataLayer => Some(layer.dataFormat)
  case layer: Zarr3SegmentationLayer => Some(layer.dataFormat)
  // Abstract layers
  case _ => None
}

def numChannelsOpt: Option[Int] = this match {
  case layer: AbstractDataLayer => layer.numChannels
  case layer: AbstractSegmentationLayer => layer.numChannels
  case layer: ZarrDataLayer => layer.numChannels
  case layer: ZarrSegmentationLayer => layer.numChannels
  case layer: N5DataLayer => layer.numChannels
  case layer: N5SegmentationLayer => layer.numChannels
  case layer: PrecomputedDataLayer => layer.numChannels
  case layer: PrecomputedSegmentationLayer => layer.numChannels
  case layer: Zarr3DataLayer => layer.numChannels
  case layer: Zarr3SegmentationLayer => layer.numChannels
  case _ => None
}

def wkwResolutionsOpt: Option[List[WKWResolution]] = this match {
  case layer: AbstractDataLayer => layer.wkwResolutions
  case layer: AbstractSegmentationLayer => layer.wkwResolutions
  case layer: WKWDataLayer => Some(layer.wkwResolutions)
  case layer: WKWSegmentationLayer => Some(layer.wkwResolutions)
  case _ => None
}

Contributor

🛠️ Refactor suggestion

Pattern-matching helper grows maintenance overhead

magsOpt, dataFormatOpt, numChannelsOpt, and wkwResolutionsOpt individually enumerate every concrete layer type.
Each new layer implementation will require four additions here – easy to forget, and an omission will still compile but silently return None, causing subtle bugs.

Two alternatives:

  1. Push the responsibility down: add abstract def magsOpt: Option[List[MagLocator]] etc. to DataLayerLike with sensible default None, and override in the relevant sub-types.
  2. Keep the helper but replace exhaustive matching with a structural test, e.g.
this match {
  case l: { def mags: List[MagLocator] } => Some(l.mags)
  case _                                 => None
}

(uses structural types / asInstanceOf – trade-offs apply.)

Reducing the repetition makes the codebase safer and easier to extend.
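
A minimal sketch of the first alternative, reusing the trait names from this PR (the exact override placement is an assumption, not the actual implementation):

trait DataLayerLike {
  // Default for layer types that do not expose mags
  def magsOpt: Option[List[MagLocator]] = None
}

trait DataLayerWithMagLocators extends DataLayerLike {
  def mags: List[MagLocator]
  // Layers that do expose mags override the default exactly once, here
  override def magsOpt: Option[List[MagLocator]] = Some(mags)
}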

Contributor

@MichaelBuessemeyer MichaelBuessemeyer left a comment

Well, couldn't find any issue here. Well done 🎉 (testing also worked out well 👍)

Note: I did not double check whether your list regarding the missing properties in the DB that needed to be added is complete.

IMO this should be mergeable. @fm3 What do you think?

mappings = $mappings,
defaultViewConfiguration = ${s.defaultViewConfiguration.map(Json.toJson(_))}""".asUpdate
defaultViewConfiguration = ${s.defaultViewConfiguration.map(Json.toJson(_))},
adminViewConfiguration = ${s.adminViewConfiguration.map(Json.toJson(_))},
Contributor

thanks for also adding the missing adminViewConfiguration update

Member

@fm3 fm3 left a comment

Looking pretty good! I added a few small comments. If you had a specific reason against using json for the AxisOrder, let me know :)


CREATE TYPE webknossos.DATASET_LAYER_CATEGORY AS ENUM ('color', 'mask', 'segmentation');
CREATE TYPE webknossos.DATASET_LAYER_ELEMENT_CLASS AS ENUM ('uint8', 'uint16', 'uint24', 'uint32', 'uint64', 'float', 'double', 'int8', 'int16', 'int32', 'int64');
CREATE TYPE webknossos.DATASET_LAYER_DATAFORMAT AS ENUM ('wkw','zarr','zarr3','n5','neuroglancerPrecomputed','tracing');
Member

I think tracing should not happen for dataset layers, can be removed here

path TEXT,
realPath TEXT,
hasLocalData BOOLEAN NOT NULL DEFAULT FALSE,
axisOrder TEXT CONSTRAINT axisOrder_format CHECK (axisOrder ~ '^[xyzc]:[0-9]+(,[xyzc]:[0-9]+)+$'),
Member

TBH I’m not a huge fan of the custom axisOrder literal. How about using a jsonb column?
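
For comparison, the jsonb variant might look roughly like this (hypothetical sketch, not part of this PR):

-- Hypothetical alternative: store the axis order as a JSON object instead of a custom literal
ALTER TABLE webknossos.dataset_mags ADD COLUMN axisOrder JSONB;
-- Example stored value: {"x": 4, "y": 3, "z": 2, "c": 0}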

case layer: WKWDataLayer => Some(layer.wkwResolutions)
case layer: WKWSegmentationLayer => Some(layer.wkwResolutions)
case _ => None
}
Member

The woes of case classes… Unfortunately I don’t know how to compact this further.

Contributor

This could be compacted further by having a common trait to match on, but I'm not sure whether introducing another trait into the whole data layer hierarchy would be ideal 🤔

Contributor

Side note: that would be one advantage of Scala 3 (still not voting for migrating to Scala 3, though).

@frcroth frcroth requested a review from fm3 May 19, 2025 07:59
@frcroth frcroth merged commit 3d041ae into master May 19, 2025
5 checks passed
@frcroth frcroth deleted the dataset-properties-in-db branch May 19, 2025 08:15
@coderabbitai coderabbitai bot mentioned this pull request May 26, 2025
@coderabbitai coderabbitai bot mentioned this pull request Jun 23, 2025
@coderabbitai coderabbitai bot mentioned this pull request Jul 21, 2025