Allow binaryData scan for single organization #8791
Conversation
📝 Walkthrough

This change refactors dataset scanning and reporting logic to support organization-specific operations. It introduces optional organization ID parameters throughout backend and frontend components, adjusts routes and controller signatures, and removes the in-memory DataSourceRepository in favor of remote client operations. Related tests and data models are updated for consistency.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
I think it makes sense to wait on #8708 and adapt the changes in here to that. So I’ll let this lie for a moment.
… allow searching for dataset id in dashboard
Actionable comments posted: 0
🧹 Nitpick comments (1)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (1)
445-445: Remote client usage is correct, but consider improving the error message. While the migration to `remoteWebknossosClient.reportDataSource` is proper, the error message on line 448 could be more accurate. Consider updating it to be more precise:

```diff
- Fox.failure(s"Dataset not found in DB or in directory: $status, cannot reload.") ~> NOT_FOUND
+ Fox.failure(s"Dataset in directory is not usable: $status, cannot reload.") ~> NOT_FOUND
```

Also applies to: 448-448
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
- app/controllers/WKRemoteDataStoreController.scala (1 hunks)
- app/models/dataset/Dataset.scala (3 hunks)
- app/models/dataset/DatasetService.scala (1 hunks)
- conf/webknossos.latest.routes (1 hunks)
- frontend/javascripts/admin/rest_api.ts (1 hunks)
- frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (3 hunks)
- frontend/javascripts/dashboard/dataset_view.tsx (2 hunks)
- test/backend/AdditionalCoordinateTestSuite.scala (3 hunks)
- test/backend/VolumeBucketKeyTestSuite.scala (1 hunks)
- unreleased_changes/8791.md (1 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreModule.scala (0 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (3 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (4 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala (2 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala (4 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala (0 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (4 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/DSFullMeshService.scala (1 hunks)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala (5 hunks)
- webknossos-datastore/conf/datastore.latest.routes (1 hunks)
💤 Files with no reviewable changes (2)
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreModule.scala
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: app/models/dataset/Dataset.scala:753-775
Timestamp: 2025-05-12T13:07:29.637Z
Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly:
1. Non-WKW datasets have `magsOpt` populated and use the first branch which includes axisOrder, channelIndex, and credentialId.
2. WKW datasets will have `wkwResolutionsOpt` populated and use the second branch which includes cubeLength.
3. The final branch is a fallback for legacy data.
This ensures appropriate fields are populated for each dataset type.
📚 Learning: `updateMags` in DatasetMagsDAO handles different dataset types distinctly
Applied to files:
test/backend/AdditionalCoordinateTestSuite.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala
app/controllers/WKRemoteDataStoreController.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
app/models/dataset/Dataset.scala
app/models/dataset/DatasetService.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: the `applyVoxelMap` parameter was renamed from `sliceCount` to `sliceOffset`
Learnt from: philippotto
PR: scalableminds/webknossos#8602
File: frontend/javascripts/oxalis/model/volumetracing/volume_annotation_sampling.ts:365-366
Timestamp: 2025-05-07T06:17:32.810Z
Learning: The parameter in applyVoxelMap was renamed from `sliceCount` to `sliceOffset` to better reflect its purpose, but this doesn't affect existing call sites since JavaScript/TypeScript function calls are position-based.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
📚 Learning: classes extending `FoxImplicits` get an implicit conversion from `Option[A]` to `Fox[A]`
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/MeshFileService.scala:170-173
Timestamp: 2025-04-23T08:51:57.756Z
Learning: In the webknossos codebase, classes extending `FoxImplicits` have access to an implicit conversion from `Option[A]` to `Fox[A]`, where `None` is converted to an empty Fox that fails gracefully in for-comprehensions.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: in Scala for-comprehensions with Fox, `<-` ensures sequential execution and short-circuits on failure
Learnt from: MichaelBuessemeyer
PR: scalableminds/webknossos#8352
File: app/models/organization/CreditTransactionService.scala:0-0
Timestamp: 2025-01-27T12:06:42.865Z
Learning: In Scala's for-comprehension with Fox (Future-like type), the `<-` operator ensures sequential execution. If any step fails, the entire chain short-circuits and returns early, preventing subsequent operations from executing. This makes it safe to perform validation checks before database operations.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
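Fox is webknossos-specific, but Scala's built-in `Either` short-circuits the same way in a for-comprehension, which makes the learning easy to demonstrate in isolation:

```scala
// Sketch using Either as a stand-in for Fox: `<-` sequences the steps,
// and the first Left short-circuits the rest of the chain.
def validatePositive(n: Int): Either[String, Int] =
  if (n > 0) Right(n) else Left(s"$n is not positive")

def chain(a: Int, b: Int): Either[String, Int] =
  for {
    x <- validatePositive(a)
    y <- validatePositive(b) // if this fails, the yield below never runs
  } yield x + y
```

This is why performing validation checks before database operations is safe in such a chain: a failed validation prevents the later steps from ever executing.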
📚 Learning: `Fox.fromBool()` expressions should use the `<-` binding operator, not `=`
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/NeuroglancerPrecomputedMeshFileService.scala:161-166
Timestamp: 2025-04-28T14:18:04.368Z
Learning: In Scala for-comprehensions with the Fox error handling monad, `Fox.fromBool()` expressions should use the `<-` binding operator instead of the `=` assignment operator to properly propagate error conditions. Using `=` will cause validation failures to be silently ignored.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
📚 Learning: prefer explicit column listing over SELECT * in Slick queries
Learnt from: frcroth
PR: scalableminds/webknossos#8821
File: app/models/dataset/Dataset.scala:864-866
Timestamp: 2025-08-04T11:49:30.012Z
Learning: In WebKnossos Scala codebase, when querying database tables with Slick, explicit column listing in SELECT statements is preferred over SELECT * to ensure columns are returned in the exact order expected by case class mappings. This prevents parsing failures when the physical column order in the production database doesn't match the schema definition order.
Applied to files:
app/models/dataset/Dataset.scala
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
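The preference recorded in this learning can be illustrated with a tiny stand-alone sketch (hypothetical table, columns, and case class — not the actual webknossos schema):

```scala
// Hypothetical sketch: why explicit column order matters when mapping rows.
// The SELECT lists _id, name, isUsable in exactly the order the mapper expects;
// with SELECT * the physical column order of the database could differ.
case class DatasetRow(id: String, name: String, isUsable: Boolean)

val selectStatement: String =
  "SELECT _id, name, isUsable FROM webknossos.datasets" // deliberately not SELECT *

// Maps one raw row (already in SELECT order) onto the case class.
def parseRow(columns: Seq[String]): DatasetRow =
  DatasetRow(id = columns(0), name = columns(1), isUsable = columns(2).toBoolean)
```

If the physical column order in production drifted from the schema definition order, `SELECT *` would silently feed the mapper misordered columns, while the explicit listing keeps parsing stable.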
📚 Learning: `getDatasetExtentAsProduct` in `dataset_accessor.ts` needs no validation for negative or zero dimensions
Learnt from: dieknolle3333
PR: scalableminds/webknossos#8229
File: frontend/javascripts/oxalis/model/accessors/dataset_accessor.ts:348-354
Timestamp: 2024-11-25T14:38:49.345Z
Learning: For the `getDatasetExtentAsProduct` function in `dataset_accessor.ts`, input validation for negative or zero dimensions is not necessary.
Applied to files:
frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx
📚 Learning: `NeuroglancerMesh.shardingSpecification` is a concrete value with an empty default, not an Option
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/NeuroglancerPrecomputedMeshService.scala:55-66
Timestamp: 2025-04-23T09:22:26.829Z
Learning: In the NeuroglancerMesh class, shardingSpecification is defined as a concrete ShardingSpecification value, not an Option. It uses meshInfo.sharding.getOrElse(ShardingSpecification.empty) to provide a default empty specification if none is present, ensuring that mesh.shardingSpecification is never null.
Applied to files:
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
🧬 Code Graph Analysis (2)
frontend/javascripts/admin/rest_api.ts (1)
frontend/javascripts/admin/api/token.ts (1)
doWithToken
(39-74)
frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (1)
frontend/javascripts/admin/rest_api.ts (1)
triggerDatasetCheck
(1307-1322)
🔇 Additional comments (36)
test/backend/VolumeBucketKeyTestSuite.scala (1)
47-47
: LGTM! Consistent with the Array to Seq refactoring. The change from
Array(0, 10)
to Seq(0, 10)
aligns with the broader refactoring to make AdditionalAxis
bounds deterministic, which fixes the hashCode implementation issue mentioned in the PR objectives.

unreleased_changes/8791.md (1)
1-6
: LGTM! Changelog accurately reflects the user-facing changes.The changelog entries clearly document the two main improvements: performance optimization for multi-organization dataset scanning and enhanced search functionality for dataset IDs.
webknossos-datastore/conf/datastore.latest.routes (1)
120-120
: LGTM! Route correctly supports organization-scoped inbox checking. The addition of the optional
organizationId: Option[String]
parameter enables organization-specific inbox scanning while maintaining backward compatibility.

test/backend/AdditionalCoordinateTestSuite.scala (1)
13-13
: LGTM! Consistent Array to Seq refactoring throughout the test suite. All
AdditionalAxis
constructor calls have been updated to use Seq
instead of Array
for bounds, maintaining consistency with the broader codebase refactoring that fixes the hashCode determinism issue. Also applies to: 29-31, 34-34, 47-47, 51-51
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/DSFullMeshService.scala (1)
37-43
: LGTM! Constructor properly updated to remove an obsolete dependency. The removal of
dataSourceRepository
from the constructor aligns with the PR's architectural goal of eliminating the stateful DataSourceRepository
in favor of remote client operations.

conf/webknossos.latest.routes (1)
110-110
: LGTM! Clean addition of optional organization scoping. The route update correctly adds an optional
organizationId
parameter to enable organization-specific dataset operations while maintaining backward compatibility.

frontend/javascripts/dashboard/dataset_view.tsx (3)
36-36
: LGTM! Proper state management integration. Good addition of the
useWkSelector
hook to access the active organization from global state.
316-316
: LGTM! Correct organization ID retrieval. The organization ID is properly extracted from the Redux state using the standard selector pattern.
323-323
: LGTM! Organization-aware dataset checking. The dropdown menu action now correctly passes the organization ID to enable scoped dataset scanning, fulfilling the PR's main objective.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (2)
118-118
: LGTM! Well-designed optional parameter addition. The method signature properly adds the optional
organizationId
parameter while maintaining backward compatibility.
121-121
: LGTM! Correct conditional query parameter handling. Good use of
addQueryStringOptional
to only include the organization ID parameter when it's provided, avoiding empty or null query parameters.frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (3)
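`addQueryStringOptional` lives in the webknossos codebase; a stand-alone sketch of the pattern (with a hypothetical `Request` type, not the real HTTP client) behaves like this:

```scala
// Hypothetical sketch of the addQueryStringOptional pattern:
// append a query parameter only when the optional value is present.
case class Request(url: String, queryParams: Seq[(String, String)] = Seq.empty) {
  def addQueryString(params: (String, String)*): Request =
    copy(queryParams = queryParams ++ params)

  // Adds the parameter only for Some, so no empty or null values leak into the URL.
  def addQueryStringOptional(key: String, valueOpt: Option[String]): Request =
    valueOpt.map(v => addQueryString(key -> v)).getOrElse(this)
}
```

Under this pattern, a periodic all-organization scan simply passes `None` and produces a clean URL without the parameter.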
30-30
: LGTM! Proper type definition update.The interface correctly reflects the new optional
organizationId
parameter for thecheckDatasets
function.
219-219
: LGTM! Clean function signature update.The async function properly accepts the optional
organizationId
parameter with correct TypeScript typing.
233-233
: LGTM! Correct parameter propagation.The
organizationId
is properly passed totriggerDatasetCheck
for each datastore, enabling organization-scoped dataset checking as intended.webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala (2)
187-187
: LGTM! Better Scala collection usage.Good change from
Array[Int]
toSeq[Int]
for the bounds parameter.Seq
is more idiomatic Scala and provides better flexibility while maintaining the same functionality.
206-206
: LGTM! Consistent collection type usage.The construction of bounds using
Seq(0, shape(axisAndIndex._2).toInt)
is consistent with the updated method signature and maintains the same logic.frontend/javascripts/admin/rest_api.ts (1)
1307-1322
: LGTM! Clean implementation of organization-scoped dataset checking.The function signature change properly supports the new optional
organizationId
parameter, and the switch toURLSearchParams
is a good improvement over manual string concatenation. The conditional parameter addition ensures clean URLs when no organization filter is needed.app/models/dataset/DatasetService.scala (1)
294-301
: LGTM! Clean implementation of organization-scoped dataset deactivation.The method signature correctly adds the optional
organizationId
parameter and properly passes it through to the DAO layer. The use ofOption[String]
follows Scala conventions and maintains type safety.webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala (1)
9-9
: Excellent fix for deterministic hashCode behavior!Changing
bounds
fromArray[Int]
toSeq[Int]
resolves the non-deterministic hashCode issue that was causing datasets with AdditionalAxes to be incorrectly marked as changed on every re-report. Arrays use reference equality while Seq uses structural equality, making this change essential for consistent behavior.Also applies to: 21-21, 24-24, 45-45, 81-81
app/controllers/WKRemoteDataStoreController.scala (2)
191-191
: LGTM: Clean implementation of organization-scoped scanning. The optional
organizationId
parameter enables the organization-specific dataset scanning feature while maintaining backward compatibility.
197-200
: Excellent contextual logging for organization-aware operations. The conditional organization labeling in log messages provides clear visibility into whether operations are scoped to a specific organization or running across all organizations. This will be valuable for debugging and monitoring.
Also applies to: 202-202, 204-206
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (2)
77-86
: Well-implemented organization-scoped access validation. The conditional access validation properly handles both organization-specific and general administrative access patterns. When
organizationId
is provided, it validates organization-specific access; otherwise, it falls back to general administrative access.
457-457
: Correct transition from repository to remote client pattern. Replacing direct
dataSourceRepository
usage with dsRemoteWebknossosClient.deleteDataSource(dataSourceId)
aligns with the architectural changes described in the PR, where the WebKnossos-side database becomes the single source of truth.app/models/dataset/Dataset.scala (3)
387-393
: LGTM! Nice enhancement to support dataset ID searches.The implementation correctly detects when a search query is a valid ObjectId and performs direct ID matching, while preserving the existing name-based search functionality as a fallback.
663-663
: LGTM! Proper implementation of organization-scoped deactivation.The optional
organizationId
parameter correctly restricts dataset deactivation to a specific organization when provided, while maintaining backward compatibility. The SQL predicate construction is clean and secure.Also applies to: 666-669
1185-1185
: Good fix for the non-deterministic hashCode issue.Replacing
Array
withSeq
ensures deterministic hashCode behavior, preventing datasets with AdditionalAxes from being incorrectly marked as changed on every re-report.webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala (2)
21-21
: LGTM! Clean removal of DataSourceRepository dependency.The constructor properly removes the
DataSourceRepository
parameter while retaining DSRemoteWebknossosClient
, aligning with the architectural shift to remote client-based operations.Also applies to: 108-108
364-364
: Consistent migration to remote client operations.All data source operations have been properly migrated from direct repository calls to
remoteWebknossosClient
methods, maintaining the same functionality while aligning with the new architecture where the WebKnossos-side database is the single source of truth.Also applies to: 462-462, 694-694
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (2)
15-16
: Good dependency cleanup and delegation pattern. The controller has been properly simplified by removing unused dependencies and delegating mesh operations to
DSMeshController
, following the single responsibility principle. Also applies to: 28-36
426-428
: Clean delegation of mesh loading functionality. The method correctly obtains the dataset ID through
remoteWebknossosClient
and delegates the mesh STL loading to the specialized meshController
, maintaining proper separation of concerns.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (6)
55-57
: LGTM! The change correctly implements the default behavior where periodic scans continue to scan all organizations by passing `organizationId = None`.
74-79
: Well-structured organization filtering logic. The implementation correctly handles both organization-specific and all-organization scanning scenarios with clear filter functions and appropriate logging context.
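The filtering logic itself is not quoted in the review, but the idea can be sketched with illustrative names (not the actual implementation):

```scala
// Illustrative sketch: scan either one organization directory or all of them.
// organizationId = None reproduces the old full-scan behavior.
def selectOrganizationDirs(allOrgDirs: Seq[String], organizationId: Option[String]): Seq[String] =
  organizationId match {
    case Some(orgId) => allOrgDirs.filter(_ == orgId) // organization-specific scan
    case None        => allOrgDirs                    // full scan across all organizations
  }
```

This keeps the periodic full scan unchanged while letting a dashboard-triggered check touch only the requesting user's organization directory.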
81-84
: Correct application of the organization filter. The filter is properly applied to directory listing, and empty-directory logging is appropriately restricted to full scans.
85-92
: Proper propagation of the organizationId parameter. The organizationId is correctly passed to the remote client for organization-scoped reporting, and error messages include appropriate context.
174-179
: Enhanced logging with organization context. The updated logging provides better visibility into which organization(s) were scanned, improving debugging and monitoring capabilities.
331-331
: Method rename improves clarity. The new name
scanOrganizationDirForDataSources
better describes the method's purpose of scanning a single organization directory.
Great stuff!
import com.scalableminds.util.tools.{Box, Failure, Full}
import play.api.libs.json.{Format, Json}

// bounds: lower bound inclusive, upper bound exclusive
The Array hashCode stuff is annoying, good catch!
Maybe a test that calculates the hashCode of a datasource would be nice?
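Such a test could be sketched with simplified stand-in case classes (not the real `AdditionalAxis` or data source models):

```scala
// Simplified stand-ins showing why Array-typed bounds broke case-class
// equality/hashCode and why Seq fixes it: Array compares by reference,
// Seq compares structurally.
case class AxisWithArrayBounds(name: String, bounds: Array[Int])
case class AxisWithSeqBounds(name: String, bounds: Seq[Int])
```

With the Array field, two instances built from identical literals are not equal (and their hashCodes are not stable across runs), which is exactly what made re-reported datasets look "changed"; with Seq, equality and hashCode are deterministic.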
Co-authored-by: frcroth <[email protected]>