Store dataset properties in DB #8609
Conversation
📝 Walkthrough

This update introduces new metadata fields to dataset layers and magnifications, reflected in both the database schema and the application logic. The migration scripts add and remove these fields as needed. The Scala models and DAOs are extended to support and persist the new properties. A duplicate route is removed from the routing configuration, and the migration documentation is updated accordingly.
(Force-pushed from 3344f87 to a86896b, then from a86896b to d6231e4.)
Actionable comments posted: 4
🧹 Nitpick comments (3)
app/models/dataset/Dataset.scala (1)

936-950: Keep INSERT / UPDATE column ordering consistent

The two branches diverge in column order (`dataFormat`, `numChannels`), which invites copy-paste errors and makes diffing harder. Align both statements (or extract a helper) so future changes touch only one place.
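For illustration, a sketch of the extract-a-helper option; the object name and the string-building approach are hypothetical, not the DAO's actual API:

```scala
// Illustrative only: derive both SQL fragments from one shared column list,
// so the INSERT and UPDATE column ordering can never drift apart.
object ColumnOrderSketch {
  val orderedColumns: List[String] = List("dataFormat", "numChannels") // plus the rest

  // "dataFormat, numChannels"
  def insertColumnList: String = orderedColumns.mkString(", ")

  // "dataFormat = ?, numChannels = ?"
  def updateAssignments: String = orderedColumns.map(c => s"$c = ?").mkString(", ")
}
```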
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (2)

453-464: `getMags` duplicates logic already expressed in `magsOpt`

`getMags` hard-codes the same exhaustive match list and throws at runtime. Prefer:

```scala
def getMags: List[MagLocator] =
  magsOpt.getOrElse(throw new IllegalStateException(s"Layer $name does not expose mags"))
```

This keeps the enumeration in a single location.
482-487: Case-class field proliferation risks hitting the 22-field limit

`AbstractDataLayer` now sits at 13 fields, `AbstractSegmentationLayer` at 15. While still below the 22-field product limit, planned future extensions (e.g. compression, tiling strategy, provenance) will break compilation. Start thinking about grouping related settings into small value objects (e.g. `StorageInfo`, `DisplayInfo`) instead of adding more primitive fields.
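A minimal sketch of the grouping idea (all names here are illustrative, and `Option[String]` stands in for the real config types):

```scala
// Hypothetical value objects bundling related layer settings.
final case class StorageInfo(dataFormat: Option[String], numChannels: Option[Int])
final case class DisplayInfo(defaultViewConfiguration: Option[String],
                             adminViewConfiguration: Option[String])

// One grouped field replaces several primitives, leaving headroom
// under the 22-field product limit.
final case class LayerSketch(name: String, storage: StorageInfo, display: DisplayInfo)
```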
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

⛔ Files ignored due to path filters (2)
- test/db/dataSet_layers.csv is excluded by `!**/*.csv`
- test/db/dataSet_mags.csv is excluded by `!**/*.csv`

📒 Files selected for processing (8)
- MIGRATIONS.unreleased.md (1 hunks)
- app/models/dataset/Dataset.scala (2 hunks)
- conf/evolutions/133-datasource-properties-in-db.sql (1 hunks)
- conf/evolutions/reversions/133-datasource-properties-in-db.sql (1 hunks)
- conf/webknossos.latest.routes (0 hunks)
- tools/postgres/schema.sql (3 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala (2 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (7 hunks)
💤 Files with no reviewable changes (1)
- conf/webknossos.latest.routes
🧰 Additional context used

🧬 Code Graph Analysis (1)

webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataLayer.scala (6)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/WKWDataLayers.scala (3): `WKWResolution` (12-12), `WKWResolution` (14-16), `mags` (29-29)
- webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/tracings/editablemapping/EditableMappingLayer.scala (2): `dataFormat` (85-85), `additionalAxes` (105-105)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/N5DataLayers.scala (1): `numChannels` (25-25)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/PrecomputedDataLayers.scala (1): `numChannels` (25-25)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/Zarr3DataLayers.scala (1): `numChannels` (25-25)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/dataformats/layers/ZarrDataLayers.scala (1): `numChannels` (23-23)
🔇 Additional comments (7)

MIGRATIONS.unreleased.md (1)

15-15: LGTM: Added new migration script to the unreleased list.
The newly added entry for the datasource properties migration script follows the proper format and is correctly placed in the list.

tools/postgres/schema.sql (3)

24-24: Correct schema version increment.
Schema version is properly incremented to 133 to match the new migration script.

139-140: LGTM: Added new columns to dataset_layers table.
The added columns match the PR objective, adding `numChannels` and `dataFormat` to the dataset layers table.

177-180: LGTM: Added new columns to dataset_mags table.
The added columns match the PR objective, adding `axisOrder`, `channelIndex`, `cubeLength`, and `credentialId` to store additional metadata in the database.

conf/evolutions/reversions/133-datasource-properties-in-db.sql (1)

1-17: LGTM: Proper rollback script for the migration.
The rollback script correctly:
- Verifies the current schema version
- Drops the newly added columns from both tables
- Downgrades the schema version
- Wraps operations in a transaction

Using `DROP COLUMN IF EXISTS` is good practice for maintaining idempotence.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/datareaders/AxisOrder.scala (2)

24-37: Well-implemented toString method for serialization.
The toString implementation appropriately handles optional fields and creates a clean, consistent string representation of the axis order.

63-70: LGTM: Added proper deserialization logic.
The `fromString` method correctly parses the string representation back into an `AxisOrder` instance, handling optional fields appropriately.
Quoted hunk from DataLayer.scala:

```scala
// Datasets that are not in the WKW format use mags
def magsOpt: Option[List[MagLocator]] = this match {
  case layer: AbstractDataLayer         => layer.mags
  case layer: AbstractSegmentationLayer => layer.mags
  case layer: DataLayerWithMagLocators  => Some(layer.getMags)
  case _                                => None
}

def dataFormatOpt: Option[DataFormat.Value] = this match {
  case layer: WKWDataLayer                 => Some(layer.dataFormat)
  case layer: WKWSegmentationLayer         => Some(layer.dataFormat)
  case layer: ZarrDataLayer                => Some(layer.dataFormat)
  case layer: ZarrSegmentationLayer        => Some(layer.dataFormat)
  case layer: N5DataLayer                  => Some(layer.dataFormat)
  case layer: N5SegmentationLayer          => Some(layer.dataFormat)
  case layer: PrecomputedDataLayer         => Some(layer.dataFormat)
  case layer: PrecomputedSegmentationLayer => Some(layer.dataFormat)
  case layer: Zarr3DataLayer               => Some(layer.dataFormat)
  case layer: Zarr3SegmentationLayer       => Some(layer.dataFormat)
  // Abstract layers
  case _ => None
}

def numChannelsOpt: Option[Int] = this match {
  case layer: AbstractDataLayer            => layer.numChannels
  case layer: AbstractSegmentationLayer    => layer.numChannels
  case layer: ZarrDataLayer                => layer.numChannels
  case layer: ZarrSegmentationLayer        => layer.numChannels
  case layer: N5DataLayer                  => layer.numChannels
  case layer: N5SegmentationLayer          => layer.numChannels
  case layer: PrecomputedDataLayer         => layer.numChannels
  case layer: PrecomputedSegmentationLayer => layer.numChannels
  case layer: Zarr3DataLayer               => layer.numChannels
  case layer: Zarr3SegmentationLayer       => layer.numChannels
  case _ => None
}

def wkwResolutionsOpt: Option[List[WKWResolution]] = this match {
  case layer: AbstractDataLayer         => layer.wkwResolutions
  case layer: AbstractSegmentationLayer => layer.wkwResolutions
  case layer: WKWDataLayer              => Some(layer.wkwResolutions)
  case layer: WKWSegmentationLayer      => Some(layer.wkwResolutions)
  case _ => None
}
```
🛠️ Refactor suggestion

Pattern-matching helper grows maintenance overhead

`magsOpt`, `dataFormatOpt`, `numChannelsOpt`, and `wkwResolutionsOpt` individually enumerate every concrete layer type. Each new layer implementation will require four additions here – easy to forget, and an omission will still compile but return `None`, causing subtle bugs.

Two alternatives (a sketch of the first follows below):

- Push the responsibility down: add an abstract `def magsOpt: Option[List[MagLocator]]` etc. to `DataLayerLike` with a sensible default of `None`, and override it in the relevant sub-types.
- Keep the helper but replace the exhaustive matching with a structural test, e.g.

```scala
this match {
  case l: { def mags: List[MagLocator] } => Some(l.mags)
  case _                                 => None
}
```

(uses structural types / `asInstanceOf` – trade-offs apply).

Reducing the repetition makes the codebase safer and easier to extend.
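For illustration, a minimal sketch of the first alternative, with placeholder types standing in for `MagLocator` and the real trait hierarchy:

```scala
// Placeholder standing in for MagLocator.
case class MagLocatorSketch(mag: String)

trait DataLayerLikeSketch {
  // Declared once with a safe default; only layers that actually
  // expose mags need to override it.
  def magsOpt: Option[List[MagLocatorSketch]] = None
}

final class ZarrLayerSketch(mags: List[MagLocatorSketch]) extends DataLayerLikeSketch {
  override def magsOpt: Option[List[MagLocatorSketch]] = Some(mags)
}
```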
Well, couldn't find any issue here. Well done 🎉 (testing also worked out well 👍)
Note: I did not double check whether your list regarding the missing properties in the DB that needed to be added is complete.
IMO this should be mergeable. @fm3 What do you think?
Quoted hunk from app/models/dataset/Dataset.scala:

```diff
   mappings = $mappings,
-  defaultViewConfiguration = ${s.defaultViewConfiguration.map(Json.toJson(_))}""".asUpdate
+  defaultViewConfiguration = ${s.defaultViewConfiguration.map(Json.toJson(_))},
+  adminViewConfiguration = ${s.adminViewConfiguration.map(Json.toJson(_))},
```
Thanks for also adding the missing adminViewConfiguration update.
Looking pretty good! I added a few small comments. If you had a specific reason against using json for the AxisOrder, let me know :)
tools/postgres/schema.sql (Outdated)

```sql
CREATE TYPE webknossos.DATASET_LAYER_CATEGORY AS ENUM ('color', 'mask', 'segmentation');
CREATE TYPE webknossos.DATASET_LAYER_ELEMENT_CLASS AS ENUM ('uint8', 'uint16', 'uint24', 'uint32', 'uint64', 'float', 'double', 'int8', 'int16', 'int32', 'int64');
CREATE TYPE webknossos.DATASET_LAYER_DATAFORMAT AS ENUM ('wkw','zarr','zarr3','n5','neuroglancerPrecomputed','tracing');
```
I think tracing should not happen for dataset layers, can be removed here
tools/postgres/schema.sql (Outdated)

```sql
path TEXT,
realPath TEXT,
hasLocalData BOOLEAN NOT NULL DEFAULT FALSE,
axisOrder TEXT CONSTRAINT axisOrder_format CHECK (axisOrder ~ '^[xyzc]:[0-9]+(,[xyzc]:[0-9]+)+$'),
```
TBH I’m not a huge fan of the custom axisOrder literal. How about using a jsonb column?
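For illustration, a sketch of what the JSON route could look like on the Scala side, assuming Play JSON (already used elsewhere in this PR); the case-class shape is an assumption, not the actual AxisOrder definition:

```scala
import play.api.libs.json.{Json, OFormat}

// Hypothetical shape; the real AxisOrder may differ.
case class AxisOrderSketch(x: Int, y: Int, z: Option[Int], c: Option[Int])

object AxisOrderSketch {
  // Macro-derived format; None fields are omitted from the JSON output.
  implicit val format: OFormat[AxisOrderSketch] = Json.format[AxisOrderSketch]
}

// Json.toJson(AxisOrderSketch(4, 3, Some(2), None)) yields {"x":4,"y":3,"z":2},
// which a jsonb column could store and query without a custom CHECK regex.
```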
Quoted hunk from DataLayer.scala:

```scala
  case layer: WKWDataLayer         => Some(layer.wkwResolutions)
  case layer: WKWSegmentationLayer => Some(layer.wkwResolutions)
  case _                           => None
}
```
The woes of case classes… Unfortunately I don’t know how to compact this further.
This could be compacted further by having a common trait which can be matched to. But not sure whether introducing another trait to the whole data layer hierarchy would be ideal 🤔
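For illustration, a hedged sketch of that trait idea; all names are hypothetical, and `String` stands in for `WKWResolution`:

```scala
// Hypothetical marker trait; the WKW layer types would extend it.
trait HasWkwResolutionsSketch {
  def wkwResolutions: List[String] // List[WKWResolution] in the real code
}

object WkwResolutionsHelperSketch {
  // The multi-way match collapses to a single trait check.
  def wkwResolutionsOpt(layer: Any): Option[List[String]] = layer match {
    case l: HasWkwResolutionsSketch => Some(l.wkwResolutions)
    case _                          => None
  }
}
```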
Side note: that would be one advantage of Scala 3 (still not voting for migrating to Scala 3, though).
URL of deployed dev instance (used for testing):
This PR aims to have all properties that may occur in the datasource-properties.json file of a dataset mirrored in the DB. The things that were missing in the DB are:
AxisOrder is serialized as a string of the form "x:4,y:3,z:2"; it would also be possible to store this in a new table and avoid the string serialization.
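A minimal sketch of that round-trip, assuming fields x, y and optional z, c (the real AxisOrder definition may differ):

```scala
// Hypothetical reconstruction of the serialization described above.
case class AxisOrderSketch(x: Int, y: Int, z: Option[Int], c: Option[Int]) {
  // Produces e.g. "x:4,y:3,z:2", omitting absent axes.
  override def toString: String =
    Seq("x" -> Some(x), "y" -> Some(y), "z" -> z, "c" -> c)
      .collect { case (axis, Some(index)) => s"$axis:$index" }
      .mkString(",")
}

object AxisOrderSketch {
  // Parses "x:4,y:3,z:2" back; returns None if x or y is missing.
  def fromString(s: String): Option[AxisOrderSketch] = {
    val pairs = s.split(',').toSeq.flatMap { part =>
      part.split(':') match {
        case Array(axis, digits) if digits.nonEmpty && digits.forall(_.isDigit) =>
          Some(axis -> digits.toInt)
        case _ => None
      }
    }.toMap
    for (x <- pairs.get("x"); y <- pairs.get("y"))
      yield AxisOrderSketch(x, y, pairs.get("z"), pairs.get("c"))
  }
}
```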
Steps to test:
TODOs:
Issues:
(Please delete unneeded items, merge only when none are left open)