Fix SQL query issues introduced by #8708 #8821

Conversation
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
works for me, thanks!
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/models/dataset/Dataset.scala (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: app/models/dataset/Dataset.scala:753-775
Timestamp: 2025-05-12T13:07:29.637Z
Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly:
1. Non-WKW datasets have `magsOpt` populated and use the first branch which includes axisOrder, channelIndex, and credentialId.
2. WKW datasets will have `wkwResolutionsOpt` populated and use the second branch which includes cubeLength.
3. The final branch is a fallback for legacy data.
This ensures appropriate fields are populated for each dataset type.
📚 Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly (same learning as above).
Applied to files:
app/models/dataset/Dataset.scala
📚 Learning: the database schema in webknossos has separate tables for dataset layers (`dataset_layers`) and magn...
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: conf/evolutions/133-datasource-properties-in-db.sql:8-16
Timestamp: 2025-05-12T14:15:05.259Z
Learning: The database schema in WEBKNOSSOS has separate tables for dataset layers (`dataset_layers`) and magnifications (`dataset_mags`). The `dataFormat` field is stored in the layers table while magnification-specific fields like `cubeLength` (specific to WKW format) are stored in the mags table.
Applied to files:
app/models/dataset/Dataset.scala
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: backend-tests
- GitHub Check: build-smoketest-push
rows <- run(
  q"""SELECT _dataset, dataLayerName, mag, path, realPath, hasLocalData, axisOrder, channelIndex, credentialId
      FROM webknossos.dataset_mags WHERE _dataset = $datasetId""".as[DatasetMagsRow])
💡 Verification agent
🧩 Analysis chain
Fragile manual column list – consider dataset_mags.* or a dedicated GetResult

The manual projection fixes the SELECT * ambiguity, but it tightly couples this query to both
a) the exact column order expected by DatasetMagsRow's auto-generated GetResult, and
b) the current schema version.
Any new column added to webknossos.dataset_mags (or even a re-ordering in the DDL) will make the query return fewer or wrongly-ordered columns than DatasetMagsRow expects, resulting in a runtime SQLException: Invalid column index.
Two alternatives that avoid this fragility while still being explicit:
- q"""SELECT _dataset, dataLayerName, mag, path, realPath, hasLocalData, axisOrder, channelIndex, credentialId
- FROM webknossos.dataset_mags WHERE _dataset = $datasetId""".as[DatasetMagsRow]
+ q"""SELECT dataset_mags.* -- keeps order in sync with table definition
+ FROM webknossos.dataset_mags
+ WHERE _dataset = $datasetId""".as[DatasetMagsRow]
or keep the reduced projection, but introduce a bespoke GetResult/case class that lists only the needed columns (_dataset, dataLayerName, mag, path, axisOrder, channelIndex, credentialId).
Either choice removes the silent-break risk and clarifies the contract between SQL and Scala mapping.
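For illustration, a minimal sketch of that second alternative, assuming hypothetical names (MagSubsetRow and its field types are invented here, not taken from the codebase):

import slick.jdbc.GetResult

// Hypothetical subset row; the constructor order is the explicit
// contract with the SELECT list that feeds it.
case class MagSubsetRow(dataset: String,
                        dataLayerName: String,
                        mag: String,
                        path: Option[String],
                        axisOrder: Option[String],
                        channelIndex: Option[Int],
                        credentialId: Option[String])

// GetResult consumes result-set columns positionally: each r.<< reads
// the next column, with its type inferred from the constructor parameter.
implicit val getMagSubsetRow: GetResult[MagSubsetRow] =
  GetResult(r => MagSubsetRow(r.<<, r.<<, r.<<, r.<<, r.<<, r.<<, r.<<))

With such a mapper, the query would select exactly those seven columns and parse them with .as[MagSubsetRow], making the SQL-to-Scala contract explicit in one place.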
🏁 Script executed:
#!/bin/bash
# Locate the definition of DatasetMagsRow to verify expected columns
rg -n "case class DatasetMagsRow" -n app/models/dataset/Dataset.scala
Length of output: 71
🏁 Script executed:
#!/bin/bash
# Search for definition of DatasetMagsRow and its GetResult mapping
rg -n "DatasetMagsRow" -n .
rg -n "GetResult" -n app/models/dataset/Dataset.scala
Length of output: 655
🏁 Script executed:
#!/bin/bash
# Locate the evolution file defining dataset_mags and print its schema
evo_file=$(rg -l "CREATE TABLE.*dataset_mags" -n conf/evolutions)
echo "Evolution file: $evo_file"
sed -n '/CREATE TABLE.*dataset_mags/,/);/p' "$evo_file"
Length of output: 204
🏁 Script executed:
#!/bin/bash
# Show the DatasetMagsRow case class and related parsing context
echo "------ Dataset.scala [700-820] ------"
sed -n '700,820p' app/models/dataset/Dataset.scala
# List SQL evolution files and inspect the one mentioned in retrieved learnings
echo "------ SQL evolution files in conf/evolutions ------"
fd -e sql conf/evolutions
echo "------ conf/evolutions/133-datasource-properties-in-db.sql [1-200] ------"
sed -n '1,200p' conf/evolutions/133-datasource-properties-in-db.sql
Length of output: 7026
🏁 Script executed:
#!/bin/bash
# Show more context around the Slick table definition for dataset_mags
echo "------ Dataset.scala [600-740] ------"
sed -n '600,740p' app/models/dataset/Dataset.scala
Length of output: 6813
Avoid brittle manual column projection in dataset_mags query

The explicit column list here couples the SQL to the exact column order and schema version. Any future addition (e.g. cubeLength) or DDL reordering will silently break the GetResult[DatasetMagsRow] mapping and lead to runtime errors.
Two safer approaches:
- Switch to selecting all columns in table order:
  - q"""SELECT _dataset, dataLayerName, mag, path, realPath, hasLocalData, axisOrder, channelIndex, credentialId
  -     FROM webknossos.dataset_mags WHERE _dataset = $datasetId""".as[DatasetMagsRow]
  + q"""SELECT dataset_mags.*
  +     FROM webknossos.dataset_mags
  +     WHERE _dataset = $datasetId""".as[DatasetMagsRow]
- If you only need a subset of fields, introduce a bespoke case class + GetResult listing exactly those columns.
Either option removes the risk of broken column-order mappings and makes the SQL↔Scala contract explicit.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
rows <- run(
  q"""SELECT dataset_mags.*
      FROM webknossos.dataset_mags
      WHERE _dataset = $datasetId""".as[DatasetMagsRow])
🤖 Prompt for AI Agents
In app/models/dataset/Dataset.scala around lines 864 to 866, the SQL query
explicitly lists columns which risks breaking the mapping if the schema changes.
To fix this, either select all columns from dataset_mags without specifying them
individually, or define a new case class with a corresponding GetResult instance
for the exact subset of columns needed. This ensures the SQL-to-Scala mapping
remains correct and resilient to schema changes.
frcroth: You got it backwards. This is exactly what we want to avoid, since the production DB is not guaranteed to have the columns in the correct order. If the column order in the DB does not match the one in the schema, parsing the rows to the generated row objects fails. This was the case here.
@frcroth Thank you for the correction! You're absolutely right - I had it backwards.
The explicit column listing ensures that the query returns columns in the exact order expected by the DatasetMagsRow case class, regardless of the physical column order in the production database. Using SELECT * or dataset_mags.* would return columns in whatever order they exist physically in the table, which could break the auto-generated GetResult mapping if the physical order doesn't match the schema definition.
This makes perfect sense as a fix for SQL query issues, especially when dealing with databases where column ordering might vary between environments.
🐰💡
(\ /)
( .. )
o(")(")
✏️ Learnings added
Learnt from: frcroth
PR: scalableminds/webknossos#8821
File: app/models/dataset/Dataset.scala:864-866
Timestamp: 2025-08-04T11:49:29.951Z
Learning: In WebKnossos Scala codebase, when querying database tables with Slick, explicit column listing in SELECT statements is preferred over SELECT * to ensure columns are returned in the exact order expected by case class mappings. This prevents parsing failures when the physical column order in the production database doesn't match the schema definition order.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: frcroth, PR: scalableminds/webknossos#8609, File: app/models/dataset/Dataset.scala:753-775 (the `updateMags` dataset-type learning quoted above).
Learnt from: frcroth, PR: scalableminds/webknossos#8609, File: conf/evolutions/133-datasource-properties-in-db.sql:8-16 (the separate `dataset_layers`/`dataset_mags` tables learning quoted above).
Learnt from: dieknolle3333
PR: scalableminds/webknossos#8168
File: frontend/javascripts/oxalis/model/sagas/proofread_saga.ts:1039-1039
Timestamp: 2024-11-22T17:18:04.217Z
Learning: In `frontend/javascripts/oxalis/model/sagas/proofread_saga.ts`, when calling `getMagInfo`, the use of `volumeTracingLayer.resolutions` is intentional and should not be changed to `volumeTracingLayer.mags`.
#8708 caused SQL errors because the production database does not match the schema: the physical column order differed, so parsing the results of an SQL query that selected all columns failed.