
Conversation

Member

@fm3 fm3 commented Aug 7, 2025

Description

This PR changes how clients with direct access to the underlying data storage can get datasets registered into WK.

Corresponding libs client PR: scalableminds/webknossos-libs#1359

New ReserveDatasetUploadToPaths Protocol
  • Client sends a reserveDatasetUploadToPaths request to the WK side, including a preliminary datasource plus layersToLink.
  • Server generates an id, plus paths for all mags and attachments where the client must put the data. These are returned to the client.
  • Server also validates layersToLink and creates a dataset in the database, marked as not yet fully uploaded to the target paths.
  • After the client has actually put the data there, it calls finishDatasetUploadToPath, marking the dataset as complete. It can now be used.
  • Note that layersToLink here are identified by datasetId+layerName, while normal dataset upload still uses datasetDirectoryName+layerName (we’ll change that to id later). A sketch of plausible payload shapes follows this list.
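
To make the round trip concrete, here is a minimal Scala sketch of plausible payload shapes and the three-step flow. All type and field names (ReserveRequest, ReservedTarget, etc.) are assumptions for illustration; the actual wire format is defined by the controllers in this PR.

```scala
// Hypothetical shapes for the reserve round trip. All names and fields
// here are illustrative assumptions, not the actual wire format.
object ReserveFlowSketch {
  case class LayerToLink(datasetId: String, layerName: String)

  case class ReserveRequest(
      preliminaryDataSourceJson: String, // preliminary datasource sent by the client
      layersToLink: Seq[LayerToLink]
  )

  case class ReservedTarget(layerName: String, magOrAttachment: String, path: String)

  case class ReserveResponse(
      newDatasetId: String,        // generated by the server
      targets: Seq[ReservedTarget] // one upload path per mag and attachment
  )

  def main(args: Array[String]): Unit = {
    // 1. Client reserves, sending the preliminary datasource plus layersToLink.
    val request = ReserveRequest("{ ... }", Seq(LayerToLink("ds-abc", "segmentation")))
    // 2. Server validates layersToLink, inserts the dataset marked as not yet
    //    fully uploaded, and answers with the generated id and target paths.
    val response = ReserveResponse(
      "ds-xyz",
      Seq(ReservedTarget("color", "mag 1", "s3://bucket/prefix/ds-xyz/color/1")))
    // 3. Client writes the data to the returned paths, then calls
    //    finishDatasetUploadToPath so the dataset becomes usable.
    println(s"${request.layersToLink.size} linked layer(s), " +
      s"${response.targets.size} target path(s) for ${response.newDatasetId}")
  }
}
```
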
New ReserveAttachmentUploadToPath Protocol
  • Client sends a reserveAttachmentUploadToPath request with dataset id, attachment name, type, etc. WK returns the path where the client must put the attachment. WK inserts this path in the DB, marked as not yet fully uploaded.
  • After the client has actually put the data there, it calls finishAttachmentUploadToPath with the same parameters (as there is no attachment id), marking the attachment as complete. It can now be used. A round-trip sketch follows this list.
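
A minimal sketch of the attachment round trip, using a simple in-memory stand-in for the pending flag (the PR stores this as uploadToPathIsPending in dataset_layer_attachments). The AttachmentKey type and helper names are hypothetical.

```scala
// Sketch of the attachment reserve/finish round trip. Because attachments
// have no id, finish is keyed by the same parameters that reserve used.
// AttachmentKey and the in-memory map are illustrative stand-ins for the
// DB flag (uploadToPathIsPending) added by this PR.
object AttachmentUploadSketch {
  case class AttachmentKey(datasetId: String, layerName: String, name: String, typ: String)

  private var pending = Map.empty[AttachmentKey, String]

  // WK returns the target path and records the attachment as not yet fully uploaded.
  def reserve(key: AttachmentKey): String = {
    val path = s"s3://bucket/prefix/${key.datasetId}/${key.layerName}/${key.name}"
    pending += key -> path
    path
  }

  // Called with the same parameters; marks the attachment complete and usable.
  def finish(key: AttachmentKey): Unit = {
    require(pending.contains(key), "no pending reservation for this attachment")
    pending -= key
  }

  def main(args: Array[String]): Unit = {
    val key = AttachmentKey("ds-xyz", "segmentation", "meshfile_a", "mesh")
    val target = reserve(key)
    // ... the client writes the attachment bytes to `target` here ...
    finish(key)
    println(s"attachment uploaded to $target")
  }
}
```
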
New ReserveUploadToPathsForPreliminary Protocol
This extra route supports a new workflow, used only by the convert_to_wkw worker job, which webknossos starts when a user uploads a dataset that is not yet in wkw/zarr format. When such an upload starts, a dataset is already inserted in the database (via reserveUpload, not the manual path). After finishUpload, the conversion worker job is started. Once the worker has converted the dataset to wkw/zarr, it calls this new reserveUploadToPathsForPreliminary route, which works only on datasets in this particular state. It returns paths, just like the normal, user-facing reserveUploadToPaths described above, but differs in that it operates on the datasetId of the already-created, preliminary dataset from the original reserveUpload. A small sketch of the state guard follows.
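
Conceptually this is a state guard: the route accepts only datasets still in the preliminary, not-yet-converted state. A toy sketch, with status names that are assumptions loosely modeled on DataSourceStatus:

```scala
// Toy state guard: the route accepts only datasets that reserveUpload
// already created and that are still preliminary. Status names are
// assumptions loosely modeled on DataSourceStatus.
object PreliminaryGuardSketch {
  sealed trait DatasetStatus
  case object NotYetUploaded extends DatasetStatus // created by reserveUpload
  case object Converting extends DatasetStatus     // convert_to_wkw worker running
  case object Usable extends DatasetStatus         // fully uploaded, ready to view

  // reserveUploadToPathsForPreliminary operates on the datasetId of the
  // already-created dataset and rejects datasets in any other state.
  def mayReserveForPreliminary(status: DatasetStatus): Boolean = status match {
    case NotYetUploaded | Converting => true
    case Usable                      => false
  }

  def main(args: Array[String]): Unit = {
    println(mayReserveForPreliminary(Converting)) // true
    println(mayReserveForPreliminary(Usable))     // false
  }
}
```
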
Changes to Editing DataSources
  • Dataset settings view no longer allows changing layer names
  • It also no longer allows editing the json directly (the textarea is hidden for now, to be removed in a follow-up)
  • The backend now only applies the allowed changes on save (category, boundingBox, coordinateTransformations, defaultViewConfiguration, adminViewConfiguration, largestSegmentId, voxelSize, deleted layers). A compressed sketch of this whitelisting follows.
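
The idea is a field whitelist: the submitted layer is not taken wholesale; only the allowed fields are copied onto the existing layer. A compressed sketch with simplified stand-in types (the real code works on the StaticLayer family):

```scala
// Field-whitelist sketch: only the allowed fields are copied from the
// submitted layer onto the existing one; everything else (name, dataFormat,
// elementClass, mags, attachments, ...) is kept as-is. The Layer type is a
// simplified stand-in for the real StaticLayer classes.
object ApplyUpdatesSketch {
  case class Layer(name: String,
                   elementClass: String,           // not changeable via this route
                   boundingBox: String,            // allowed change
                   largestSegmentId: Option[Long]) // allowed change

  def applyLayerUpdates(existing: Layer, update: Layer): Layer =
    existing.copy( // take only the whitelisted fields from the update
      boundingBox = update.boundingBox,
      largestSegmentId = update.largestSegmentId
    )

  def main(args: Array[String]): Unit = {
    val existing = Layer("color", "uint8", "0,0,0,100,100,100", None)
    val update   = Layer("renamed!", "uint64", "0,0,0,50,50,50", Some(999L))
    // The update's name and elementClass are ignored:
    println(applyLayerUpdates(existing, update))
    // prints: Layer(color,uint8,0,0,0,50,50,50,Some(999))
  }
}
```
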
Misc Changes
  • localDirectoryWhitelist is no longer checked in data loading, only in add/exploreAndAdd/upload (a prefix-check sketch follows this list)
  • exploreAndAdd now returns the dataset id
  • the full dataSource is now included in the dataset publicWrites, with source data paths if requested via includePaths
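
The whitelist check itself amounts to prefix matching against configured directories. A hedged sketch; the helper below is illustrative, not the actual implementation:

```scala
// Illustrative prefix check in the spirit of localDirectoryWhitelist;
// not the actual implementation.
object WhitelistSketch {
  def isWhitelisted(path: String, whitelist: Seq[String]): Boolean =
    whitelist.exists(prefix => path.startsWith(prefix))

  def main(args: Array[String]): Unit = {
    val whitelist = Seq("/srv/shared-datasets/")
    println(isWhitelisted("/srv/shared-datasets/my_ds", whitelist)) // true
    println(isWhitelisted("/etc/somewhere-else", whitelist))        // false
  }
}
```
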
Refactoring: UPath
  • The new class UPath handles paths that can be either remote (s3, https, gcs) or local (file scheme or no scheme); a toy model follows this list
  • Attachments and mags now use UPath instead of URI and String, respectively
  • VaultPath also wraps a UPath (plus its DataVault with credentials)
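
A toy model of the concept, not the real implementation in webknossos-datastore/.../helpers/UPath.scala: one sealed type covering remote and local paths with a common child-resolution operator.

```scala
// Toy model of the UPath concept, not the real implementation in
// webknossos-datastore/.../helpers/UPath.scala: one sealed type covering
// remote (s3, https, gcs) and local (file scheme or none) paths.
object UPathSketch {
  sealed trait UPath {
    def isRemote: Boolean
    def /(child: String): UPath // resolve a child path
  }
  final case class RemotePath(scheme: String, rest: String) extends UPath {
    def isRemote = true
    def /(child: String) = RemotePath(scheme, s"$rest/$child")
    override def toString = s"$scheme://$rest"
  }
  final case class LocalPath(path: String) extends UPath {
    def isRemote = false
    def /(child: String) = LocalPath(s"$path/$child")
    override def toString = path
  }

  def fromString(raw: String): UPath = raw.split("://", 2) match {
    case Array("file", rest) => LocalPath(rest)
    case Array(scheme, rest) if Seq("s3", "https", "gcs").contains(scheme) =>
      RemotePath(scheme, rest)
    case _ => LocalPath(raw) // no scheme: treat as local
  }

  def main(args: Array[String]): Unit = {
    println(fromString("s3://bucket/ds") / "color" / "1") // s3://bucket/ds/color/1
    println(fromString("/data/ds").isRemote)              // false
  }
}
```
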
Refactoring: New DataLayer Class Hierarchy
  • Simplified DataSource and DataLayer classes.
[Diagram: simplified DataSource/DataLayer class hierarchy]
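
Abbreviated sketch of how the simplified hierarchy plausibly looks, inferred from the names appearing in this PR (StaticLayer, StaticColorLayer, StaticSegmentationLayer, MagLocator with Option[UPath]); the field lists are assumptions.

```scala
// Abbreviated sketch of the simplified hierarchy, inferred from the names
// in this PR (StaticLayer, StaticColorLayer, StaticSegmentationLayer,
// MagLocator with an optional path). Field lists are assumptions.
object LayerHierarchySketch {
  case class MagLocator(mag: String, path: Option[String]) // Option[UPath] in the real code

  sealed trait StaticLayer {
    def name: String
    def dataFormat: String // a DataFormat enum in the real code (wkw, zarr, n5, ...)
    def mags: Seq[MagLocator]
  }
  case class StaticColorLayer(name: String, dataFormat: String, mags: Seq[MagLocator])
      extends StaticLayer
  case class StaticSegmentationLayer(name: String,
                                     dataFormat: String,
                                     mags: Seq[MagLocator],
                                     largestSegmentId: Option[Long])
      extends StaticLayer

  // A usable data source is the id plus its static layers and a voxel size.
  case class UsableDataSource(id: String, dataLayers: Seq[StaticLayer], voxelSizeNm: Double)

  def main(args: Array[String]): Unit = {
    val ds = UsableDataSource(
      "ds-xyz",
      Seq(StaticColorLayer("color", "zarr3",
        Seq(MagLocator("1", Some("s3://bucket/ds-xyz/color/1"))))),
      11.24)
    println(ds.dataLayers.map(_.name).mkString(", ")) // color
  }
}
```
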

Steps to test:

Test against regressions
  • Browse some data, annotate some, explore a remote dataset, explore a local dataset in the directoryWhitelist; all should still work
  • Test that both the newest and older libs versions can still open_remote and list existing datasets (including ones with unusual data, like remote N5)
Test new features
  • With the new libs client code, go through the reserveUploadToPaths flow (scalableminds/webknossos-libs#1359: "Read directly from remote paths, upload Dataset with reserve_manual_upload")
  • Also test reserveAttachmentUploadToPath via libs
  • Test that libs can still upload a dataset.
  • Attempt to change the paths of an existing dataset in wk, or to add a dataset with paths outside of itself; this should not be possible (security!)
  • Point webKnossos.datasets.uploadToPathsPrefixes in application.conf at our managed s3 object storage and give both wk and the client credentials; the flow should also work with this.

TODOs:

Backend
  • Decide how the directory name should look: just the id? or prefix-id? (might get outdated on renamings)
  • is the reserve response the full datasource again? or just a path prefix?
  • how to configure this in the dev setup
  • clean up the whole DataSource mess?
    • ensure we include resolutions;
    • never leak credentialIds (except during explore/add flow);
      • settings view reads InboxDataSource, which includes credentialIds.
      • Needs new Api for changing relevant fields
    • compact json version? (resolutions, no paths?)
    • do not write resolutions to disk?
  • directory scan overwrites status to "No longer available on the datastore"
  • reserveAttachmentUploadToPaths
  • what’s up with the zarr streaming snapshots?
  • evolutions
  • does upload still check unique directory name?
  • ensure that add route still checks localDirectoryWhiteList (also for exploreAndAdd)
  • move stuff to DatasetService, make createDataset private again
  • re-test upload with layersToLink from libs
  • return readable failures on now-unsupported routes? (old datastore-based reserveManualUpload)
  • do not require status field in datasource in reserveUpload request body.
  • introduce api version 11
  • re-test that editing DS in DB + reload propagates to datastore
  • on finishAttachmentUploadToPath, update datasource on disk too if isVirtual=False.
  • datasource update call via wk backend instead of frontend->datastore? skip datastore if isVirtual=true
  • more unit tests for UPath
  • double check that dataset info route is backwards compatible for libs
  • add-remote UI: the layer name is no longer editable, but explore returns an empty name; that's a problem. Test with the first entry from the Notion remote-datasets table.
Frontend
  • dataset settings view: include paths, remove advanced view
  • updating datasource should go via wk first



@fm3 fm3 self-assigned this Aug 7, 2025
Contributor

coderabbitai bot commented Aug 7, 2025

📝 Walkthrough

Walkthrough

Refactors datastore/layer models to UsableDataSource/StaticLayer with UPath paths; adds upload-to-paths and attachment reservation flows, DB evolutions, many controller/service/DAO signature updates, threads MessagesProvider through annotation/task APIs, and bumps API to v11.

Changes

Cohort: Core datasource & layer model
Files: webknossos-datastore/.../models/datasource/*, app/models/dataset/DatasetService.scala, app/models/dataset/Dataset.scala
Summary: Introduce DataSource algebra (UsableDataSource/UnusableDataSource), StaticLayer family, DataFormat, ElementClass, LayerCategory, DataSourceId, DataSourceStatus; replace inbox/generic models; update JSON formats and DatasetService APIs (add usableDataSourceFor, update many signatures).

Cohort: Path abstraction & datavault
Files: webknossos-datastore/.../helpers/UPath.scala, .../helpers/PathSchemes.scala, .../datavault/*, .../VaultPath.scala
Summary: Add UPath and PathSchemes; migrate VaultPath and DataVault implementations to UPath-based APIs; add toRemoteUriUnsafe/toUPath; update path resolution and hashing.

Cohort: Explorers & data-format layers
Files: webknossos-datastore/.../explore/*, .../explore/*Explorer.scala, .../dataformats/*
Summary: Replace Zarr/N5/WKW/Precomputed layer types with StaticLayer/StaticColorLayer/StaticSegmentationLayer; introduce DataFormat; mags and mag paths use UPath; explorers now return (StaticLayer, VoxelSize).

Cohort: Disk write, validation & dataset utils
Files: webknossos-datastore/.../services/DataSourceToDiskWriter.scala, .../services/DataSourceValidation.scala, .../helpers/DatasetDeleter.scala, app/models/dataset/ComposeService.scala
Summary: Add DataSourceToDiskWriter.updateDataSourceOnDisk, DataSourceValidation trait; adapt deleter/compose to UsableDataSource/DataSourceId; add relativize/backup/validation logic and thread MessagesProvider in compose.

Cohort: Upload-to-paths & attachments
Files: app/models/dataset/DatasetUploadToPathsService.scala, app/controllers/DatasetController.scala, app/controllers/DataStoreController.scala, webknossos-datastore/.../services/uploading/UploadService.scala
Summary: New DatasetUploadToPathsService; request/response types and controller endpoints for reserving/finishing dataset and attachment uploads; introduce pending-attachment flag and link-layer flows; validate datastore upload-to-paths permission; update create/virtual dataset flows.

Cohort: Controllers & routes (server)
Files: app/controllers/*, conf/webknossos*.routes
Summary: Many controller signatures updated to accept/return UsableDataSource and UPath types; implicit MessagesProvider threaded into annotation/task APIs; added reserve/finish upload-to-paths and folder routes; updated WKRemoteDataStoreController and removed/renamed legacy endpoints; API version routing bumped to v11.

Cohort: DAOs, schema & migrations
Files: app/models/dataset/DataStore.scala, tools/postgres/schema.sql, conf/evolutions/141-allows-upload-to-paths.sql
Summary: Add allowsUploadToPaths to DataStore model/DAO and DB; add uploadToPathIsPending to dataset_layer_attachments; SQL evolution 141 and reversion added.

Cohort: Datastore controllers & streaming
Files: webknossos-datastore/.../controllers/*
Summary: Streaming/controllers use UsableDataSource/StaticLayer; replace GenericDataSource constants with UsableDataSource.FILENAME_DATASOURCE_PROPERTIES_JSON; add invalidateCache endpoint and adjust explore/add flows for directory naming.

Cohort: Annotation/task surfaces & tracing clients
Files: app/models/annotation/*, app/models/task/*, app/models/annotation/handler/*, app/models/annotation/AnnotationDataSourceTemporaryStore.scala, app/models/annotation/WKRemoteTracingStoreClient.scala
Summary: Thread MessagesProvider implicit through many annotation handlers/stores; replace DataSourceLike/generic types with UsableDataSource; adjust fallback/fallback-layer logic to StaticSegmentationLayer; update temporary-store and tracing-store client signatures.

Cohort: Datastore attachments & helpers
Files: webknossos-datastore/.../models/datasource/DataLayerAttachments.scala, .../dataformats/MagLocator.scala
Summary: Add DataLayerAttachments, LayerAttachment types and enums; change MagLocator.path to Option[UPath]; remove old DatasetLayerAttachments and related legacy attachment files.

Cohort: Frontend changes & tests
Files: frontend/javascripts/*, frontend/javascripts/test/*
Summary: getDataset gains includePaths param; removed read/update datasource helpers; dataset settings UI hides advanced/config editing; partial dataset updates carry dataSource; tests updated (e2e assertions, mock signatures); snapshot stabilization adds path volatile key.

Cohort: Utilities & build
Files: util/.../JsonHelper.scala, app/utils/WkConf.scala, conf/application.conf, project/Dependencies.scala, project/DependencyResolvers.scala (removed), build.sbt
Summary: JsonHelper.removeKeyRecursively made public and accepts multiple keys; add WebKnossos.Datasets config keys and Datastore.baseDirectory; add application.conf keys; move dependency resolvers into Dependencies and update build.sbt; remove old DependencyResolvers.scala.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~180 minutes


Suggested reviewers

  • MichaelBuessemeyer

Poem

I nibble UPaths at the break of light,
Static layers hopping, tidy and bright.
Usable sources find their way,
Paths reserved — I dance and sway.
V11 dawns — a rabbit's little delight. 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 20.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
  • Title Check ✅ Passed: The title concisely names the three primary changes introduced by the PR — the new reserveUploadToPaths protocol, the DataLayer class refactor, and the UPath abstraction — and accurately reflects the main intent and scope of the changeset.
  • Linked Issues Check ✅ Passed: The code implements the linked objectives: server-driven reserve/finish upload flows and attachment reservation via DatasetUploadToPathsService and new controller routes ([8827], [8859], [8857]); the DataLayer simplification to StaticLayer/Static*Layer and UsableDataSource is applied across explorers, services, and controllers ([8851]); a unified path abstraction (UPath) and VaultPath migration appear consistently across vaults, explorers, and mag/attachment handling ([8762]); explore-and-add now returns dataset id as described ([8881]); and the dependency resolver changes and project/Dependencies update address the sonatype deprecation concern ([8929]). These mappings are visible in the added/modified files (new service and endpoints, many StaticLayer/UsableDataSource replacements, UPath/VaultPath additions, and project build changes).
  • Out of Scope Changes Check ✅ Passed: I found no changes that appear unrelated to the stated objectives; the wide-ranging edits (DataLayer refactor, UPath, new reserve/upload-to-paths flows, controller and frontend adjustments, and DB evolutions) are coherent and consistent with the PR goals rather than out-of-scope.
  • Description Check ✅ Passed: The PR description is directly related to the changeset: it documents the new reserveUploadToPaths/reserveAttachmentUploadToPath/reserveUploadToPathsForPreliminary protocols, the UPath and DataLayer/DataSource refactors, UI/backend editing changes, testing steps, TODOs, and the linked libs PR, which matches the file-level summaries. It provides clear, actionable context for reviewers and is not off-topic.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch reserve-manual


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (4)
app/models/dataset/DatasetService.scala (4)

280-289: Consider clearing thumbnail cache for consistency

Other dataset update flows clear the thumbnail cache before updating. This path should do the same to ensure thumbnails are regenerated with the new data.

Apply this diff to clear thumbnails before updating:

       _ <- if (isChanged) {
         logger.info(s"Updating dataSource of $datasetId")
         for {
+          _ <- thumbnailCachingService.removeFromCache(datasetId)
           _ <- Fox.runIf(!dataset.isVirtual)(dataStoreClient.updateDataSourceOnDisk(datasetId, updatedDataSource))
           _ <- dataStoreClient.invalidateDatasetInDSCache(datasetId)
           _ <- datasetDAO.updateDataSource(datasetId,
                                            dataset._dataStore,
                                            updatedDataSource.hashCode(),
                                            updatedDataSource,
                                            isUsable = true)(GlobalAccessContext)
         } yield ()

407-416: Optimize database query for unusable datasets

The current implementation always queries datasetDataLayerDAO.findAllForDataset even when the dataset is not usable, which wastes a database query.

Apply this optimization to avoid unnecessary DB queries:

   def dataSourceFor(dataset: Dataset): Fox[DataSource] = {
     val dataSourceId = DataSourceId(dataset.directoryName, dataset._organization)
-    if (dataset.isUsable)
+    if (!dataset.isUsable)
+      Fox.successful(UnusableDataSource(dataSourceId, None, dataset.status, dataset.voxelSize))
+    else
       for {
         voxelSize <- dataset.voxelSize.toFox ?~> "dataset.source.usableButNoVoxelSize"
         dataLayers <- datasetDataLayerDAO.findAllForDataset(dataset._id)
       } yield UsableDataSource(dataSourceId, dataLayers, voxelSize)
-    else
-      Fox.successful(UnusableDataSource(dataSourceId, None, dataset.status, dataset.voxelSize))
   }

295-301: Critical: Layers are dropped when not present in updates

The flatMap in applyDataSourceUpdates will remove any existing layer not found in the updates. This is dangerous for partial updates where the client might only send layers they want to modify.

Apply this diff to preserve layers not included in the update:

-    val updatedLayers = existingDataSource.dataLayers.flatMap { existingLayer =>
-      val layerUpdatesOpt = updates.dataLayers.find(_.name == existingLayer.name)
-      layerUpdatesOpt match {
-        case Some(layerUpdates) => Some(applyLayerUpdates(existingLayer, layerUpdates))
-        case None               => None
-      }
-    }
+    val updatedLayers = existingDataSource.dataLayers.map { existingLayer =>
+      updates.dataLayers.find(_.name == existingLayer.name) match {
+        case Some(layerUpdates) => applyLayerUpdates(existingLayer, layerUpdates)
+        case None               => existingLayer
+      }
+    }

270-290: Wrong string interpolator

Line 290 uses the f interpolator although no format specifiers are present; the plain s interpolator is what's intended here.

Apply this diff:

-      } else Fox.successful(logger.info(f"DataSource $datasetId not updated as the hashCode is the same"))
+      } else Fox.successful(logger.info(s"DataSource $datasetId not updated as the hashCode is the same"))
🧹 Nitpick comments (2)
app/models/dataset/DataStore.scala (1)

152-159: Fix inconsistent error message in finder method

The findOneWithUploadsToPathsAllowed method returns "find one with uploads allowed" as the error message, which doesn't accurately describe what this method does. It should indicate it's looking for uploads to paths specifically.

Apply this diff to fix the error message:

-      parsed <- parseFirst(r, "find one with uploads allowed")
+      parsed <- parseFirst(r, "find one with uploads to paths allowed")
app/models/dataset/DatasetService.scala (1)

69-73: Redundant dataset name uniqueness check

The method assertNewDatasetNameUnique appears to duplicate the functionality of checkNameAvailable (lines 75-79). Both methods check if a dataset name exists in an organization and return an error if it does.

Consider removing assertNewDatasetNameUnique and using checkNameAvailable consistently throughout the codebase to avoid duplication:

-  def assertNewDatasetNameUnique(name: String, organizationId: String): Fox[Unit] =
-    for {
-      exists <- datasetDAO.doesDatasetNameExistInOrganization(name, organizationId)
-      _ <- Fox.fromBool(!exists) ?~> "dataset.name.taken"
-    } yield ()
-
   def checkNameAvailable(organizationId: String, datasetName: String): Fox[Unit] =
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65ace89 and 79ba6e5.

📒 Files selected for processing (4)
  • app/controllers/DataStoreController.scala (2 hunks)
  • app/models/dataset/DataStore.scala (7 hunks)
  • app/models/dataset/DatasetService.scala (18 hunks)
  • app/models/dataset/DatasetUploadToPathsService.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/models/dataset/DatasetUploadToPathsService.scala
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-04-28T14:18:04.368Z
Learnt from: frcroth
PR: scalableminds/webknossos#8236
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mesh/NeuroglancerPrecomputedMeshFileService.scala:161-166
Timestamp: 2025-04-28T14:18:04.368Z
Learning: In Scala for-comprehensions with the Fox error handling monad, `Fox.fromBool()` expressions should use the `<-` binding operator instead of the `=` assignment operator to properly propagate error conditions. Using `=` will cause validation failures to be silently ignored.

Applied to files:

  • app/models/dataset/DatasetService.scala
📚 Learning: 2025-05-12T13:07:29.637Z
Learnt from: frcroth
PR: scalableminds/webknossos#8609
File: app/models/dataset/Dataset.scala:753-775
Timestamp: 2025-05-12T13:07:29.637Z
Learning: In the `updateMags` method of DatasetMagsDAO (Scala), the code handles different dataset types distinctly:
1. Non-WKW datasets have `magsOpt` populated and use the first branch which includes axisOrder, channelIndex, and credentialId.
2. WKW datasets will have `wkwResolutionsOpt` populated and use the second branch which includes cubeLength.
3. The final branch is a fallback for legacy data.
This ensures appropriate fields are populated for each dataset type.

Applied to files:

  • app/models/dataset/DatasetService.scala
🧬 Code graph analysis (3)
app/models/dataset/DataStore.scala (2)
app/utils/sql/SecuredSQLDAO.scala (2)
  • readAccessQuery (23-36)
  • existingCollectionName (16-16)
app/utils/sql/SqlInterpolation.scala (2)
  • q (20-39)
  • asUpdate (74-74)
app/controllers/DataStoreController.scala (3)
app/models/dataset/DataStore.scala (3)
  • DataStore (20-31)
  • DataStore (33-63)
  • fromForm (36-56)
util/src/main/scala/com/scalableminds/util/tools/Fox.scala (3)
  • Fox (30-223)
  • Fox (225-298)
  • fromBool (32-36)
app/models/analytics/AnalyticsService.scala (1)
  • isSuperUser (103-108)
app/models/dataset/DatasetService.scala (7)
util/src/main/scala/com/scalableminds/util/tools/JsonHelper.scala (2)
  • JsonHelper (16-112)
  • removeKeyRecursively (100-112)
util/src/main/scala/com/scalableminds/util/tools/TextUtils.scala (2)
  • TextUtils (6-75)
  • normalizeStrong (50-56)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/helpers/UPath.scala (4)
  • UPath (54-96)
  • fromString (59-59)
  • toString (106-109)
  • toString (174-174)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataSource.scala (6)
  • DataSource (31-43)
  • UnusableDataSource (45-62)
  • UnusableDataSource (64-66)
  • UsableDataSource (68-107)
  • UsableDataSource (109-113)
  • isUsable (20-20)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataSourceId.scala (3)
  • DataSourceId (5-7)
  • DataSourceId (9-26)
  • toString (6-6)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DataSourceStatus.scala (1)
  • DataSourceStatus (3-11)
app/models/dataset/WKRemoteDataStoreClient.scala (3)
  • updateDataSourceOnDisk (116-123)
  • invalidateDatasetInDSCache (109-116)
  • validatePaths (104-109)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-smoketest-push
  • GitHub Check: frontend-tests
  • GitHub Check: backend-tests
🔇 Additional comments (8)
app/models/dataset/DatasetService.scala (7)

283-283: Cache invalidation effect correctly executed

The datastore cache invalidation uses <- binding correctly, so the effect will be executed.


559-563: Directory name generation is robust

The new generateDirectoryName method properly handles edge cases where normalization returns None by falling back to the dataset ID. This ensures every dataset gets a unique directory name.


593-593: Good security practice: credentials properly removed from public JSON

The use of JsonHelper.removeKeyRecursively to strip out credentialId and credentials from the dataSource JSON is a good security practice. This ensures sensitive authentication information is never exposed in API responses.


401-405: Clear error messaging for unusable datasets

The usableDataSourceFor method provides clear error messages when datasets are not imported, helping users understand why a dataset cannot be used.


81-89: Consistent use of DataSourceStatus constants

Good use of the new DataSourceStatus.notYetUploaded constant instead of hardcoded strings, improving maintainability.


102-103: Verify unique directory names across the system

Appending the dataset ID makes names unique but breaks the previous “plain name” behavior. Occurrences of directoryName found in:
./webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/LegacyController.scala
./webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/DataSourceController.scala
./test/backend/DataSourceTestSuite.scala
./app/models/dataset/DatasetService.scala
./app/models/job/JobService.scala
./app/models/job/Job.scala
./app/models/annotation/AnnotationService.scala
./app/controllers/UserTokenController.scala
./app/controllers/WKRemoteDataStoreController.scala
./app/controllers/DatasetController.scala

Inspect/update these callers and tests to accept the new directory format and run the full test suite.


308-369: Add unit tests for applyLayerUpdates (Color ↔ Segmentation conversions)

File: app/models/dataset/DatasetService.scala (lines 308–369)

applyLayerUpdates performs type conversions and selective field updates — add/unit-test the conversion matrix and edge cases:

  • Color -> Color: assert only boundingBox, coordinateTransformations, defaultViewConfiguration, adminViewConfiguration are replaced; name, dataFormat, elementClass, mags, additionalAxes, attachments remain unchanged.
  • Color -> Segmentation: assert resulting StaticSegmentationLayer uses existing layer’s name, dataFormat, elementClass, mags, additionalAxes, attachments and takes boundingBox, coordinateTransformations, defaultViewConfiguration, adminViewConfiguration, largestSegmentId from the update; verify the final segmentation-only constructor field is set to None.
  • Segmentation -> Segmentation: assert boundingBox, coordinateTransformations, defaultViewConfiguration, adminViewConfiguration, largestSegmentId are updated.
  • Segmentation -> Color: assert resulting StaticColorLayer preserves existing name/dataFormat/elementClass/mags/additionalAxes/attachments and takes boundingBox/default/adminViewConfiguration/coordinateTransformations from the update; largestSegmentId must be dropped.
  • Edge cases: test largestSegmentId = None vs Some(value), empty vs non-empty attachments/additionalAxes, and null/absent coordinate transformations.

Repo search produced no matching tests for these conversions — add unit tests (and an integration test exercising the public API path that calls applyLayerUpdates, if available).

app/controllers/DataStoreController.scala (1)

48-49: Authorization flow with Fox.fromBool verified correct

On re-examination, the code already uses the <- binding for Fox.fromBool, so the authorization check is properly sequenced and fails with FORBIDDEN when the user is not a super user:

            multiUser <- multiUserDAO.findOne(request.identity._multiUser)
            _ <- Fox.fromBool(multiUser.isSuperUser) ?~> "notAllowed" ~> FORBIDDEN

@MichaelBuessemeyer MichaelBuessemeyer mentioned this pull request Sep 19, 2025
@MichaelBuessemeyer
Contributor

Is there an open TODO for me? You need an approval from my side, correct?

Is there anything that needs re-reviewing / testing, since you made some changes?

@fm3
Member Author

fm3 commented Sep 19, 2025

Yes, the latest changes are:

  • Re-added the v10 reserveManualUpload, creating backwards compatibility for older libs versions. That means we can merge this PR as soon as it is approved :)
  • Renamed the new APIs
  • Added the reserveUploadToPathsForPreliminary route, which is to be used (probably only) by the convert_to_wkw worker job. This allows filling an "uploading" dataset, which the worker converts from e.g. tif to wkw.

I'd be happy if we could merge this in the coming week. Maybe you could have another quick look at the latest changes.

Contributor

@MichaelBuessemeyer MichaelBuessemeyer left a comment


Ok I reviewed the new commits again:

  1. Some messages still need to be added to conf/messages
  2. I do not fully understand the new workflow for uploading a dataset to S3 🙈

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
frontend/javascripts/admin/rest_api.ts (2)

993-1000: Avoid emitting a trailing “?” when no params are set.

If both sharingToken and includePaths are unset, the URL becomes /api/datasets/{id}?. Small but noisy for logs/caches.

Apply:

   const params = new URLSearchParams();
   if (sharingToken != null) {
     params.set("sharingToken", String(sharingToken));
   }
   if (includePaths != null) {
     params.set("includePaths", String(includePaths));
   }
-  return Request.receiveJSON(`/api/datasets/${datasetId}?${params}`, options);
+  const qs = params.toString();
+  const url = qs ? `/api/datasets/${datasetId}?${qs}` : `/api/datasets/${datasetId}`;
+  return Request.receiveJSON(url, options);

1018-1027: Scope dataSource updates to server-allowed fields (or confirm backend ignores others).

Type allows full APIDataSource. If backend only permits a subset, consider a narrowed AllowedDataSourceUpdates type to prevent accidental over-posting; otherwise confirm server-side whitelisting.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 336ebbd and ca63a9b.

📒 Files selected for processing (1)
  • frontend/javascripts/admin/rest_api.ts (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/javascripts/admin/rest_api.ts (2)
frontend/javascripts/libs/request.ts (1)
  • RequestOptions (31-31)
frontend/javascripts/types/api_types.ts (2)
  • APIDataset (243-246)
  • APIDataSource (152-152)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests
  • GitHub Check: frontend-tests
🔇 Additional comments (2)
frontend/javascripts/admin/rest_api.ts (2)

1015-1015: Delegation LGTM.

getDatasetLegacy forcing includePaths = true matches legacy behavior intent.


987-992: Breaking signature change — verify call sites or add overloads.

getDataset's params were reordered to (datasetId, includePaths?, sharingToken?, options?). Callers that pass a string as the 2nd argument (old sharingToken position) will break. Automated scan couldn't be completed in this environment — run a repo-wide search for getDataset call sites and update callers, or add TypeScript overloads to accept both shapes: (id, sharingToken?, options?) and (id, includePaths?, sharingToken?, options?). File: frontend/javascripts/admin/rest_api.ts (≈lines 987–992).

Contributor

@MichaelBuessemeyer MichaelBuessemeyer left a comment


Awesome! Thanks a lot for your hard work!

Let's give it a go 🟢

@fm3 fm3 merged commit d0ace83 into master Sep 22, 2025
5 checks passed
@fm3 fm3 deleted the reserve-manual branch September 22, 2025 10:54
fm3 added a commit that referenced this pull request Sep 23, 2025
Based on and thus blocked by #8844 

### Steps to test:
- refresh local database schema
- start up, log in, should see l4_sample_remote in the dashboard dataset
list
- should be usable.

### Issues

- contributes to #8813 (we can easily add a few more once we have them
publicly statically hosted somewhere. Ideal would be something with
plenty of attachments and something with a different data type, like n5
or neuroglancerPrecomputed)

------
- [x] Added changelog entry (create a `$PR_NUMBER.md` file in
`unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [x] Considered [common edge
cases](../blob/master/.github/common_edge_cases.md)

---------

Co-authored-by: valentin-pinkau <[email protected]>
fm3 added a commit that referenced this pull request Sep 25, 2025
#8897)

If the mag has a relative path, it must be resolved in the dataset root
path.

Based on and thus blocked by #8844 

### URL of deployed dev instance (used for testing):
- https://fixexplorepaths.webknossos.xyz

### Steps to test:
- Visit the served zarr-streaming datasource-properties.json at
`http://localhost:9000/data/zarr/<datasetId>/datasource-properties.json?token=secretSampleUserToken`
- Should list relative paths
- Explore the dataset in the same webknossos
`http://localhost:9000/data/zarr/<datasetId>/` and add it (self-stream)
- Should be explorable and afterwards show data.
- Same for annotation zarr links (note: explore the whole annotation,
not just one layer)

### Issues:
- fixes #8811

------
- [x] Added changelog entry (create a `$PR_NUMBER.md` file in
`unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [x] Considered [common edge
cases](../blob/master/.github/common_edge_cases.md)
- [x] Needs datastore update after deployment

---------

Co-authored-by: valentin-pinkau <[email protected]>
philippotto added a commit that referenced this pull request Sep 29, 2025
1. This PR removes the advanced dataset settings mode (JSON input). PR
#8844 removed the ability to edit JSON directly anyway; this is the
frontend follow-up / clean-up. This affects both the dataset settings
views and remote dataset upload/import.

2. PR #8844 added an `includePaths` GET parameter to `/api/datasets`. The
matching backend changes were never merged to `master` and were discarded.

3. Small refinements:
- Added a background color for the upload datasets tabs for better
contrast.
- Added a developer-only button to view the raw API response for
datasets to inspect paths, mags etc, now that the JSON view is gone.
    
<img width="1635" height="737" alt="Screenshot 2025-09-25 at 11 09 01"
src="https://github.com/user-attachments/assets/0242b0ca-6764-48ca-bf5f-dca16c19e3fc"
/>


### Steps to test:
1. Edit an existing dataset. Make changes, save and double-check they
save correctly.
2. Upload a new remote dataset, e.g. from
https://docs.webknossos.org/webknossos/data/neuroglancer_precomputed.html
. Check that you can add one or more layers without issue.


### Issues:
- fixes #8942
- fixes #5571
- fixes #5639

------
(Please delete unneeded items, merge only when none are left open)
- [x] Added changelog entry (create a `$PR_NUMBER.md` file in
`unreleased_changes` or use `./tools/create-changelog-entry.py`)
- [ ] Added migration guide entry if applicable (edit the same file as
for the changelog)
- [x] Updated [documentation](../blob/master/docs) if applicable
- [ ] Adapted [wk-libs python
client](https://github.com/scalableminds/webknossos-libs/tree/master/webknossos/webknossos/client)
if relevant API parts change
- [ ] Removed dev-only changes like prints and application.conf edits
- [x] Considered [common edge
cases](../blob/master/.github/common_edge_cases.md)
- [ ] Needs datastore update after deployment

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Philipp Otto <[email protected]>