
Conversation


@fm3 fm3 commented Jul 21, 2025

  • The “scan disk for new datasets” button now scans only the current orga. This is mostly a performance optimization for multi-orga setups.
  • Removed the now obsolete stateful dataSourceRepository (the wk-side database is the source of truth, the datastore side should use only the data from there, with its cache)
    • Remaining usages now either talk to wk directly or, in the case of STL download in LegacyController, use the original controller’s implementation.
  • The dashboard search now supports searching for dataset ids (only active when a full ObjectId is entered in the search field)
  • Fixed a bug where datasets with AdditionalAxes would be assumed changed on every re-report due to the hashCode being non-deterministic. This is fixed by using Seq instead of Array for the bounds of AdditionalAxis.
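
The hashCode bug in the last bullet is easy to reproduce in plain Scala. A minimal sketch (with hypothetical stand-in case classes, not the real AdditionalAxis definition): a case class with an Array field inherits the array's reference-based equals and identity-based hashCode, so structurally equal instances neither compare equal nor hash consistently; switching the field to Seq restores structural semantics.

```scala
// Hypothetical stand-ins for AdditionalAxis, for illustration only.
case class AxisWithArray(name: String, bounds: Array[Int])
case class AxisWithSeq(name: String, bounds: Seq[Int])

// Array fields break structural equality: Array.equals is reference-based,
// so two instances built from equal contents are not ==.
val arrayBackedEqual =
  AxisWithArray("t", Array(0, 10)) == AxisWithArray("t", Array(0, 10)) // false

// Seq fields restore structural equality and deterministic hashCodes.
val seqBackedEqual =
  AxisWithSeq("t", Seq(0, 10)) == AxisWithSeq("t", Seq(0, 10)) // true
val seqHashesAgree =
  AxisWithSeq("t", Seq(0, 10)).hashCode == AxisWithSeq("t", Seq(0, 10)).hashCode // true

println(s"Array-backed instances equal: $arrayBackedEqual")
println(s"Seq-backed instances equal:   $seqBackedEqual")
println(s"Seq-backed hashes agree:      $seqHashesAgree")
```

With the Array field, the identity-based hashCode also varies per JVM allocation, which is why a re-reported dataset looked "changed" every time.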

Steps to test:

  • Create setup with multiple orgas (e.g. with isWkOrgInstance=true)
  • Put datasets in multiple orgas and hit the refresh button. It should scan only those of the user’s current orga.
  • The regular once-a-minute scan should still scan everything.
  • Try searching for a dataset id in the dashboard

TODOs:

  • Backend
    • Take optional orga id parameter, scan only that directory
    • Don’t unreport datasets of other orgas
    • Adapt to the changes of Virtual Datasets (#8708)
  • Frontend
    • Adapt function in rest_api.ts
    • Pass current organizationId when hitting the button

Issues:


  • Added changelog entry (create a $PR_NUMBER.md file in unreleased_changes or use ./tools/create-changelog-entry.py)
  • Removed dev-only changes like prints and application.conf edits
  • Considered common edge cases
  • Needs datastore update after deployment

@fm3 fm3 self-assigned this Jul 21, 2025

coderabbitai bot commented Jul 21, 2025

📝 Walkthrough

This change refactors dataset scanning and reporting logic to support organization-specific operations. It introduces optional organization ID parameters throughout backend and frontend components, adjusts routes and controller signatures, and removes the in-memory DataSourceRepository in favor of remote client operations. Related tests and data models are updated for consistency.

Changes

  • Backend: Organization-aware dataset scanning & reporting
    Files: app/controllers/WKRemoteDataStoreController.scala, app/models/dataset/Dataset.scala, app/models/dataset/DatasetService.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
    Summary: Added an optional organizationId parameter to dataset scanning, deactivation, and reporting methods. Logging and SQL predicates are updated to support organization context. Controller and service signatures are modified accordingly.
  • Backend: Removal of DataSourceRepository and related refactors
    Files: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/DSFullMeshService.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreModule.scala
    Summary: Removed DataSourceRepository and all direct usages. Replaced update/remove logic with remote client calls. Updated dependency injection and constructor parameters in affected services and controllers.
  • Backend: AdditionalAxis bounds type change
    Files: webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala, webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala, app/models/dataset/Dataset.scala
    Summary: Changed the AdditionalAxis.bounds type from Array[Int] to Seq[Int]. Updated all construction and usage sites for compatibility.
  • Frontend: Organization-aware dataset check
    Files: frontend/javascripts/admin/rest_api.ts, frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx, frontend/javascripts/dashboard/dataset_view.tsx
    Summary: Propagated the optional organizationId parameter through dataset check functions, context, and UI triggers. Updated API calls to include organization context when provided.
  • Routes: Organization-aware endpoints
    Files: conf/webknossos.latest.routes, webknossos-datastore/conf/datastore.latest.routes
    Summary: Route signatures updated to accept an optional organizationId in relevant controller methods.
  • Tests: AdditionalAxis bounds type update
    Files: test/backend/AdditionalCoordinateTestSuite.scala, test/backend/VolumeBucketKeyTestSuite.scala
    Summary: Updated test code to use Seq instead of Array for AdditionalAxis.bounds. No change to test logic.
  • Docs/Changelog
    Files: unreleased_changes/8791.md
    Summary: Added a changelog entry describing organization-aware scanning and dataset ID search improvements.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Assessment against linked issues

  • Restrict scanning binaryData folder to organization (#8784)
  • Dashboard "Scan disk for new datasets" scans only current organization (#8784)

Assessment against linked issues: Out-of-scope changes

  • Change of AdditionalAxis.bounds from Array[Int] to Seq[Int] (webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala and related files): a type generalization unrelated to the organization-scoped scanning objective.
  • SQL predicate update for dataset search by ObjectId (app/models/dataset/Dataset.scala): enhancing search by dataset ID is not part of the org-scoped scanning requirement.
  • Changelog entry for dataset search by ID (unreleased_changes/8791.md): the dataset search by ID feature is unrelated to the scanning restriction objective.

Suggested reviewers

  • normanrz

Poem

In burrows deep, I scan with care,
Now each organization gets its share.
No more searching far and wide—
The right datasets, neatly supplied!
Arrays to Seqs, old code retired,
Out-of-scope bits gently rewired.
🐇✨ Another hop, the job’s inspired!



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b32ccb and 83c43d8.

📒 Files selected for processing (2)
  • frontend/javascripts/admin/rest_api.ts (1 hunks)
  • unreleased_changes/8791.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • unreleased_changes/8791.md
  • frontend/javascripts/admin/rest_api.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests


fm3 commented Jul 23, 2025

I think it makes sense to wait on #8708 and adapt the changes in here to that. So I’ll let this lie for a moment.

@fm3 fm3 marked this pull request as ready for review August 5, 2025 13:30

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (1)

445-445: Remote client usage is correct, but consider improving the error message.

While the migration to remoteWebknossosClient.reportDataSource is proper, the error message on line 448 could be more accurate.

Consider updating the error message to be more precise:

-          Fox.failure(s"Dataset not found in DB or in directory: $status, cannot reload.") ~> NOT_FOUND
+          Fox.failure(s"Dataset in directory is not usable: $status, cannot reload.") ~> NOT_FOUND

Also applies to: 448-448

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96d6b45 and 8b32ccb.

📒 Files selected for processing (21)
  • app/controllers/WKRemoteDataStoreController.scala (1 hunks)
  • app/models/dataset/Dataset.scala (3 hunks)
  • app/models/dataset/DatasetService.scala (1 hunks)
  • conf/webknossos.latest.routes (1 hunks)
  • frontend/javascripts/admin/rest_api.ts (1 hunks)
  • frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (3 hunks)
  • frontend/javascripts/dashboard/dataset_view.tsx (2 hunks)
  • test/backend/AdditionalCoordinateTestSuite.scala (3 hunks)
  • test/backend/VolumeBucketKeyTestSuite.scala (1 hunks)
  • unreleased_changes/8791.md (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreModule.scala (0 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (3 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (4 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala (2 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala (4 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala (0 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (4 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/DSFullMeshService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala (5 hunks)
  • webknossos-datastore/conf/datastore.latest.routes (1 hunks)
💤 Files with no reviewable changes (2)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreModule.scala
🧰 Additional context used
🧠 Learnings
📚 Learning: in the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: app/models/dataset/Dataset.scala:753-775
Timestamp: 2025-05-12T13:07:29.637Z
Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly:
1. Non-WKW datasets have `magsOpt` populated and use the first branch which includes axisOrder, channelIndex, and credentialId.
2. WKW datasets will have `wkwResolutionsOpt` populated and use the second branch which includes cubeLength.
3. The final branch is a fallback for legacy data.
This ensures appropriate fields are populated for each dataset type.

Applied to files:

  • test/backend/AdditionalCoordinateTestSuite.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala
  • app/controllers/WKRemoteDataStoreController.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
  • app/models/dataset/Dataset.scala
  • app/models/dataset/DatasetService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: the parameter in applyVoxelMap was renamed from `sliceCount` to `sliceOffset` to better reflect its purpose
Learnt from: philippotto
PR: scalableminds/webknossos#8602
File: frontend/javascripts/oxalis/model/volumetracing/volume_annotation_sampling.ts:365-366
Timestamp: 2025-05-07T06:17:32.810Z
Learning: The parameter in applyVoxelMap was renamed from `sliceCount` to `sliceOffset` to better reflect its purpose, but this doesn't affect existing call sites since JavaScript/TypeScript function calls are position-based.

Applied to files:

  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
📚 Learning: in the webknossos codebase, classes extending `FoxImplicits` have access to an implicit conversion from `Option[A]` to `Fox[A]`
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/MeshFileService.scala:170-173
Timestamp: 2025-04-23T08:51:57.756Z
Learning: In the webknossos codebase, classes extending `FoxImplicits` have access to an implicit conversion from `Option[A]` to `Fox[A]`, where `None` is converted to an empty Fox that fails gracefully in for-comprehensions.

Applied to files:

  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: in Scala's for-comprehension with Fox (Future-like type), the `<-` operator ensures sequential execution
Learnt from: MichaelBuessemeyer
PR: scalableminds/webknossos#8352
File: app/models/organization/CreditTransactionService.scala:0-0
Timestamp: 2025-01-27T12:06:42.865Z
Learning: In Scala's for-comprehension with Fox (Future-like type), the `<-` operator ensures sequential execution. If any step fails, the entire chain short-circuits and returns early, preventing subsequent operations from executing. This makes it safe to perform validation checks before database operations.

Applied to files:

  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
📚 Learning: in Scala for-comprehensions with the Fox error handling monad, `Fox.fromBool()` expressions should use the `<-` binding operator
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/NeuroglancerPrecomputedMeshFileService.scala:161-166
Timestamp: 2025-04-28T14:18:04.368Z
Learning: In Scala for-comprehensions with the Fox error handling monad, `Fox.fromBool()` expressions should use the `<-` binding operator instead of the `=` assignment operator to properly propagate error conditions. Using `=` will cause validation failures to be silently ignored.

Applied to files:

  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
📚 Learning: in the webknossos Scala codebase, when querying database tables with Slick, explicit column listing in SELECT statements is preferred
Learnt from: frcroth
PR: scalableminds/webknossos#8821
File: app/models/dataset/Dataset.scala:864-866
Timestamp: 2025-08-04T11:49:30.012Z
Learning: In WebKnossos Scala codebase, when querying database tables with Slick, explicit column listing in SELECT statements is preferred over SELECT * to ensure columns are returned in the exact order expected by case class mappings. This prevents parsing failures when the physical column order in the production database doesn't match the schema definition order.

Applied to files:

  • app/models/dataset/Dataset.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
📚 Learning: for the `getDatasetExtentAsProduct` function in `dataset_accessor.ts`, input validation for negative or zero dimensions is not necessary
Learnt from: dieknolle3333
PR: scalableminds/webknossos#8229
File: frontend/javascripts/oxalis/model/accessors/dataset_accessor.ts:348-354
Timestamp: 2024-11-25T14:38:49.345Z
Learning: For the `getDatasetExtentAsProduct` function in `dataset_accessor.ts`, input validation for negative or zero dimensions is not necessary.

Applied to files:

  • frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx
📚 Learning: in the NeuroglancerMesh class, shardingSpecification is defined as a concrete ShardingSpecification value
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/NeuroglancerPrecomputedMeshService.scala:55-66
Timestamp: 2025-04-23T09:22:26.829Z
Learning: In the NeuroglancerMesh class, shardingSpecification is defined as a concrete ShardingSpecification value, not an Option. It uses meshInfo.sharding.getOrElse(ShardingSpecification.empty) to provide a default empty specification if none is present, ensuring that mesh.shardingSpecification is never null.

Applied to files:

  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
🧬 Code Graph Analysis (2)
frontend/javascripts/admin/rest_api.ts (1)
frontend/javascripts/admin/api/token.ts (1)
  • doWithToken (39-74)
frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (1)
frontend/javascripts/admin/rest_api.ts (1)
  • triggerDatasetCheck (1307-1322)
🔇 Additional comments (36)
test/backend/VolumeBucketKeyTestSuite.scala (1)

47-47: LGTM! Consistent with the Array to Seq refactoring.

The change from Array(0, 10) to Seq(0, 10) aligns with the broader refactoring to make AdditionalAxis bounds deterministic, which fixes the hashCode implementation issue mentioned in the PR objectives.

unreleased_changes/8791.md (1)

1-6: LGTM! Changelog accurately reflects the user-facing changes.

The changelog entries clearly document the two main improvements: performance optimization for multi-organization dataset scanning and enhanced search functionality for dataset IDs.

webknossos-datastore/conf/datastore.latest.routes (1)

120-120: LGTM! Route correctly supports organization-scoped inbox checking.

The addition of the optional organizationId: Option[String] parameter enables organization-specific inbox scanning while maintaining backward compatibility through the optional parameter.

test/backend/AdditionalCoordinateTestSuite.scala (1)

13-13: LGTM! Consistent Array to Seq refactoring throughout the test suite.

All AdditionalAxis constructor calls have been updated to use Seq instead of Array for bounds, maintaining consistency with the broader codebase refactoring that fixes the hashCode determinism issue.

Also applies to: 29-31, 34-34, 47-47, 51-51

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/DSFullMeshService.scala (1)

37-43: LGTM! Constructor properly updated to remove obsolete dependency.

The removal of dataSourceRepository from the constructor aligns with the PR's architectural goal of eliminating the stateful DataSourceRepository in favor of remote client operations.

conf/webknossos.latest.routes (1)

110-110: LGTM! Clean addition of optional organization scoping.

The route update correctly adds an optional organizationId parameter to enable organization-specific dataset operations while maintaining backward compatibility.

frontend/javascripts/dashboard/dataset_view.tsx (3)

36-36: LGTM! Proper state management integration.

Good addition of the useWkSelector hook to access the active organization from global state.


316-316: LGTM! Correct organization ID retrieval.

The organization ID is properly extracted from the Redux state using the standard selector pattern.


323-323: LGTM! Organization-aware dataset checking.

The dropdown menu action now correctly passes the organization ID to enable scoped dataset scanning, fulfilling the PR's main objective.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (2)

118-118: LGTM! Well-designed optional parameter addition.

The method signature properly adds the optional organizationId parameter while maintaining backward compatibility.


121-121: LGTM! Correct conditional query parameter handling.

Good use of addQueryStringOptional to only include the organization ID parameter when it's provided, avoiding empty or null query parameters.
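
The pattern behind this helper can be sketched in a few lines (an illustrative stand-in; the real addQueryStringOptional in the webknossos RPC layer may differ in signature and escaping): the parameter is appended only when the Option is defined, so callers without an organization get an unchanged URL.

```scala
// Illustrative sketch only; the actual webknossos helper may differ.
def addQueryStringOptional(url: String, key: String, valueOpt: Option[String]): String =
  valueOpt match {
    case Some(value) =>
      // Append with "?" or "&" depending on whether a query string exists.
      val sep = if (url.contains("?")) "&" else "?"
      s"$url$sep$key=$value"
    case None => url // no parameter requested: leave the URL untouched
  }

// Hypothetical URL, for demonstration only.
println(addQueryStringOptional("/triggerInboxCheck", "organizationId", Some("sample_organization")))
println(addQueryStringOptional("/triggerInboxCheck", "organizationId", None))
```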

frontend/javascripts/dashboard/dataset/dataset_collection_context.tsx (3)

30-30: LGTM! Proper type definition update.

The interface correctly reflects the new optional organizationId parameter for the checkDatasets function.


219-219: LGTM! Clean function signature update.

The async function properly accepts the optional organizationId parameter with correct TypeScript typing.


233-233: LGTM! Correct parameter propagation.

The organizationId is properly passed to triggerDatasetCheck for each datastore, enabling organization-scoped dataset checking as intended.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/NgffExplorationUtils.scala (2)

187-187: LGTM! Better Scala collection usage.

Good change from Array[Int] to Seq[Int] for the bounds parameter. Seq is more idiomatic Scala and provides better flexibility while maintaining the same functionality.


206-206: LGTM! Consistent collection type usage.

The construction of bounds using Seq(0, shape(axisAndIndex._2).toInt) is consistent with the updated method signature and maintains the same logic.

frontend/javascripts/admin/rest_api.ts (1)

1307-1322: LGTM! Clean implementation of organization-scoped dataset checking.

The function signature change properly supports the new optional organizationId parameter, and the switch to URLSearchParams is a good improvement over manual string concatenation. The conditional parameter addition ensures clean URLs when no organization filter is needed.

app/models/dataset/DatasetService.scala (1)

294-301: LGTM! Clean implementation of organization-scoped dataset deactivation.

The method signature correctly adds the optional organizationId parameter and properly passes it through to the DAO layer. The use of Option[String] follows Scala conventions and maintains type safety.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/AdditionalAxis.scala (1)

9-9: Excellent fix for deterministic hashCode behavior!

Changing bounds from Array[Int] to Seq[Int] resolves the non-deterministic hashCode issue that was causing datasets with AdditionalAxes to be incorrectly marked as changed on every re-report. Arrays use reference equality while Seq uses structural equality, making this change essential for consistent behavior.

Also applies to: 21-21, 24-24, 45-45, 81-81

app/controllers/WKRemoteDataStoreController.scala (2)

191-191: LGTM: Clean implementation of organization-scoped scanning.

The optional organizationId parameter enables the organization-specific dataset scanning feature while maintaining backward compatibility.


197-200: Excellent contextual logging for organization-aware operations.

The conditional organization labeling in log messages provides clear visibility into whether operations are scoped to a specific organization or running across all organizations. This will be valuable for debugging and monitoring.

Also applies to: 202-202, 204-206

webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala (2)

77-86: Well-implemented organization-scoped access validation.

The conditional access validation properly handles both organization-specific and general administrative access patterns. When organizationId is provided, it validates organization-specific access; otherwise, it falls back to general administrative access.


457-457: Correct transition from repository to remote client pattern.

Replacing direct dataSourceRepository usage with dsRemoteWebknossosClient.deleteDataSource(dataSourceId) aligns with the architectural changes described in the PR, where the WebKnossos-side database becomes the single source of truth.

app/models/dataset/Dataset.scala (3)

387-393: LGTM! Nice enhancement to support dataset ID searches.

The implementation correctly detects when a search query is a valid ObjectId and performs direct ID matching, while preserving the existing name-based search functionality as a fallback.
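
The dispatch described here can be illustrated with a small sketch (hypothetical helper names, not the actual Dataset.scala code): only a full 24-character hex string is treated as an ObjectId, and everything else falls back to the name search.

```scala
// Illustrative only: a BSON ObjectId is 12 bytes, rendered as 24 hex chars.
def isValidObjectId(s: String): Boolean =
  s.length == 24 && s.forall(c => c.isDigit || ('a' to 'f').contains(c.toLower))

// Decide which predicate the search query should use (names are hypothetical).
def searchMode(query: String): String =
  if (isValidObjectId(query)) "matchById" else "matchByName"

println(searchMode("66a2b4c8d9e0f1a2b3c4d5e6")) // matchById
println(searchMode("cortex"))                   // matchByName
println(searchMode("66a2b4c8"))                 // matchByName (not a full id)
```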


663-663: LGTM! Proper implementation of organization-scoped deactivation.

The optional organizationId parameter correctly restricts dataset deactivation to a specific organization when provided, while maintaining backward compatibility. The SQL predicate construction is clean and secure.

Also applies to: 666-669


1185-1185: Good fix for the non-deterministic hashCode issue.

Replacing Array with Seq ensures deterministic hashCode behavior, preventing datasets with AdditionalAxes from being incorrectly marked as changed on every re-report.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala (2)

21-21: LGTM! Clean removal of DataSourceRepository dependency.

The constructor properly removes the DataSourceRepository parameter while retaining DSRemoteWebknossosClient, aligning with the architectural shift to remote client-based operations.

Also applies to: 108-108


364-364: Consistent migration to remote client operations.

All data source operations have been properly migrated from direct repository calls to remoteWebknossosClient methods, maintaining the same functionality while aligning with the new architecture where the WebKnossos-side database is the single source of truth.

Also applies to: 462-462, 694-694

webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala (2)

15-16: Good dependency cleanup and delegation pattern.

The controller has been properly simplified by removing unused dependencies and delegating mesh operations to DSMeshController, following the single responsibility principle.

Also applies to: 28-36


426-428: Clean delegation of mesh loading functionality.

The method correctly obtains the dataset ID through remoteWebknossosClient and delegates the mesh STL loading to the specialized meshController, maintaining proper separation of concerns.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (6)

55-57: LGTM!

The change correctly implements the default behavior where periodic scans continue to scan all organizations by passing organizationId = None.


74-79: Well-structured organization filtering logic.

The implementation correctly handles both organization-specific and all-organization scanning scenarios with clear filter functions and appropriate logging context.
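
The two scanning scenarios can be sketched as follows (illustrative names, not the real DataSourceService code): with organizationId = None every organization directory is scanned, as in the periodic job; with a concrete id only that directory survives the filter.

```scala
// Illustrative sketch of the org-scoped directory filter.
def selectOrgDirs(allOrgDirs: Seq[String], organizationId: Option[String]): Seq[String] =
  organizationId match {
    case Some(orgId) => allOrgDirs.filter(_ == orgId) // button-triggered, single-orga scan
    case None        => allOrgDirs                    // periodic scan: every organization
  }

val dirs = Seq("sample_organization", "second_orga", "third_orga")
println(selectOrgDirs(dirs, None))                // all three directories
println(selectOrgDirs(dirs, Some("second_orga"))) // only second_orga
```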


81-84: Correct application of organization filter.

The filter is properly applied to directory listing, and empty directory logging is appropriately restricted to full scans.


85-92: Proper propagation of organizationId parameter.

The organizationId is correctly passed to the remote client for organization-scoped reporting, and error messages include appropriate context.


174-179: Enhanced logging with organization context.

The updated logging provides better visibility into which organization(s) were scanned, improving debugging and monitoring capabilities.


331-331: Method rename improves clarity.

The new name scanOrganizationDirForDataSources better describes the method's purpose of scanning a single organization directory.


@frcroth frcroth left a comment


Great stuff!

import com.scalableminds.util.tools.{Box, Failure, Full}
import play.api.libs.json.{Format, Json}

// bounds: lower bound inclusive, upper bound exclusive
Contributor


The Array hashCode stuff is annoying, good catch!
Maybe a test that calculates the hashCode of a datasource would be nice?

@fm3 fm3 changed the title Allow binaryData scan for single orga Allow binaryData scan for single organization Aug 6, 2025
@fm3 fm3 enabled auto-merge (squash) August 6, 2025 08:59
@fm3 fm3 merged commit 0fd771b into master Aug 6, 2025
5 checks passed
@fm3 fm3 deleted the inbox-check-per-orga branch August 6, 2025 09:02

Successfully merging this pull request may close these issues.

Restrict scanning binaryData folder to organization
3 participants