@fm3 fm3 commented Dec 18, 2024

Unified terminology: we decided a while ago that “folder” should refer only to the dashboard dataset folders that are implemented in postgres, while “directory” should refer to the actual filesystem. This PR adapts the backend config and code to follow this terminology.

Steps to test

  • Load datasets, create annotations
  • Upload a dataset
  • Test zarr streaming (e.g. self-streaming via “add remote dataset”)

Issues:


@fm3 fm3 self-assigned this Dec 18, 2024
coderabbitai bot commented Dec 18, 2024

📝 Walkthrough

This pull request introduces a comprehensive renaming effort across multiple files to distinguish between "folder" and "directory" terminology. The changes primarily focus on configuration options, method signatures, and variable names in the webknossos-datastore and webknossos-tracingstore projects. Key modifications include renaming configuration properties like datastore.baseFolder to datastore.baseDirectory and localFolderWhitelist to localDirectoryWhitelist, as well as updating method names to consistently use "directory" when referring to filesystem paths.

Changes

File | Change Summary
MIGRATIONS.unreleased.md | Renamed configuration options from baseFolder to baseDirectory and localFolderWhitelist to localDirectoryWhitelist
app/utils/WkConf.scala | Removed localFolderWhitelist configuration
test/e2e/End2EndSpec.scala | Updated unzip method from unzipToFolder to unzipToDirectory
util/src/main/scala/com/scalableminds/util/io/ZipIO.scala | Renamed unzipping methods to use "directory" terminology
Multiple datastore services | Updated base directory configuration references
Multiple controllers | Renamed methods related to Zarr and directory contents

Assessment against linked issues

Objective Addressed Explanation
Strict naming for Folder vs Directory (#7780) Comprehensive renaming across codebase to distinguish filesystem directories from in-db dashboard folders
Update config options (#7780) Renamed datastore.baseFolder to datastore.baseDirectory and localFolderWhitelist to localDirectoryWhitelist

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • MichaelBuessemeyer
  • normanrz

Poem

🐰 In the realm of code, a rabbit's delight,
Folders and directories now shine so bright!
From "baseFolder" to "baseDirectory" we leap,
Naming conventions no longer make us weep.
Clarity reigns in our digital space! 🗂️


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f015d6 and d78493e.

📒 Files selected for processing (2)
  • conf/application.conf (1 hunks)
  • webknossos-datastore/conf/standalone-datastore.conf (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: circleci_build
🔇 Additional comments (3)
webknossos-datastore/conf/standalone-datastore.conf (1)

57-58: Consistency with the "folder → directory" convention looks good.

Renaming these keys from baseFolder to baseDirectory and from localFolderWhitelist to localDirectoryWhitelist aligns well with the new terminology scheme. Ensure that all references to these configuration keys in the codebase have been updated accordingly to prevent misconfiguration or runtime errors.

Please run the following script to confirm that the old baseFolder and localFolderWhitelist keywords are fully replaced throughout the codebase:

✅ Verification successful

Configuration key updates are fully implemented

The old configuration keys (baseFolder, localFolderWhitelist) only appear in changelog and migration documentation, while all actual code implementations consistently use the new keys (baseDirectory, localDirectoryWhitelist). The changes have been properly propagated throughout the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the old config names are removed and only the new keys are used.

# Searching for old references
rg "baseFolder"
rg "localFolderWhitelist"

# Searching for new references
rg "baseDirectory"
rg "localDirectoryWhitelist"

Length of output: 6384

conf/application.conf (2)

198-198: Consistency check on renamed whitelist configuration.

Changing localFolderWhitelist to localDirectoryWhitelist stays true to the new distinction between “folder” vs. “directory.” Confirm any surrounding code has been similarly adjusted to prevent referencing the old property name.

✅ Verification successful

Configuration rename verified successfully

The old configuration name localFolderWhitelist is only referenced in changelog entries and migration documentation, with no remaining references in the actual codebase. The rename appears to have been thoroughly implemented.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Searching for old references to "localFolderWhitelist" in the repo
rg "localFolderWhitelist" 

Length of output: 715


197-197: Renaming aligns with the PR’s objective.

Renaming baseFolder to baseDirectory is consistent with the newly established terminology. Ensure that elsewhere in the code, especially where baseFolder was referenced, all instances are updated to avoid confusion or potential runtime errors.

✅ Verification successful

Verification successful: All baseFolder references are properly updated

The only remaining instances of baseFolder are in documentation files (MIGRATIONS and CHANGELOG) where they appropriately document this exact renaming change.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Searching for old references to "baseFolder" in the repo
rg "baseFolder" 

Length of output: 630


@fm3 fm3 marked this pull request as ready for review January 6, 2025 10:17
@fm3 fm3 requested a review from frcroth January 6, 2025 10:17
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
webknossos-tracingstore/conf/com.scalableminds.webknossos.tracingstore.routes (2)

51-52: Consider consolidating trailing slash variants
Both routes with and without a trailing slash are defined, which is typically fine for backward compatibility. To streamline, consider automatically handling the trailing slash in your routing framework if no backward compatibility requirement exists.


63-64: Trailing slash usage
Similar to lines 51-52, you have a trailing slash variant. Evaluate whether to unify these routes if you do not need to differentiate them for specific client requests or to maintain backward compatibility.
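One way to treat both variants uniformly is to normalize the request path before it is matched, so that only one route per endpoint has to be declared. The following is a minimal sketch of that idea in plain Scala; the object and method names are hypothetical and not part of the PR, and how such a normalization would be wired into the routing framework is left open.

```scala
object TrailingSlash {
  // Hypothetical helper: strips trailing slashes so that "/zarr/layer/" and
  // "/zarr/layer" normalize to the same path. An empty or all-slash input
  // normalizes to the root path "/".
  def normalize(path: String): String = {
    val stripped = path.reverse.dropWhile(_ == '/').reverse
    if (stripped.isEmpty) "/" else stripped
  }
}
```

Whether this is preferable to declaring both routes depends on whether any client relies on the two variants behaving differently.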

webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala (1)

63-68: Consider using a more specific exception type and adding contextual details.

While this logic correctly enforces the new localDirectoryWhitelist, throwing a generic Exception makes error handling and troubleshooting harder. You could throw a more specific exception type (e.g., IllegalArgumentException, or a custom subclass) and include contextual details (like organization/dataset identifiers) in the message to make mismatches easier to trace.
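A minimal sketch of what such a specific exception could look like. The exception name, its fields, and the guard function are illustrative assumptions and do not appear in the PR; only the configuration key datastore.localDirectoryWhitelist is taken from the source.

```scala
import java.nio.file.Paths

// Hypothetical exception type; name and fields are illustrative, not from the PR.
final case class PathNotWhitelistedException(path: String, organizationId: String, datasetName: String)
    extends IllegalArgumentException(
      s"Path $path (organization: $organizationId, dataset: $datasetName) " +
        "is not covered by datastore.localDirectoryWhitelist")

object WhitelistGuard {
  // Throws the specific exception when no whitelist entry is a parent of `path`.
  // Path.startsWith compares whole path components, not raw string prefixes.
  def assertWhitelisted(path: String,
                        whitelist: Seq[String],
                        organizationId: String,
                        datasetName: String): Unit =
    if (!whitelist.exists(entry => Paths.get(path).startsWith(Paths.get(entry))))
      throw PathNotWhitelistedException(path, organizationId, datasetName)
}
```

Callers can then catch IllegalArgumentException (or the subclass) separately from unrelated failures, and the message names the dataset that triggered the check.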

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e11a14 and 9f015d6.

📒 Files selected for processing (22)
  • MIGRATIONS.unreleased.md (1 hunks)
  • app/utils/WkConf.scala (0 hunks)
  • test/e2e/End2EndSpec.scala (1 hunks)
  • util/src/main/scala/com/scalableminds/util/io/ZipIO.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreConfig.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/ExportsController.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/ZarrStreamingController.scala (6 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/explore/ExploreRemoteLayerService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/AgglomerateService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataServiceHolder.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/ConnectomeFileService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteTracingstoreClient.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSUsedStorageService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/MappingService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/MeshFileService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/SegmentIndexFileService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala (1 hunks)
  • webknossos-datastore/conf/com.scalableminds.webknossos.datastore.routes (1 hunks)
  • webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/controllers/VolumeTracingZarrStreamingController.scala (4 hunks)
  • webknossos-tracingstore/conf/com.scalableminds.webknossos.tracingstore.routes (1 hunks)
💤 Files with no reviewable changes (1)
  • app/utils/WkConf.scala
✅ Files skipped from review due to trivial changes (8)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/AgglomerateService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/uploading/UploadService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSUsedStorageService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/ConnectomeFileService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/SegmentIndexFileService.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteTracingstoreClient.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/ZarrStreamingController.scala
  • webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/controllers/VolumeTracingZarrStreamingController.scala
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: circleci_build
🔇 Additional comments (24)
webknossos-tracingstore/conf/com.scalableminds.webknossos.tracingstore.routes (2)

44-47: Consistent naming convention for directory-based endpoints
These routes correctly replace "folder" nomenclature with "directory," aligning well with the PR objective. The method names in the VolumeTracingZarrStreamingController appear to match these new routes.


57-60: Good extension to Zarr v3 routes
The transition to the "directory" naming for Zarr v3 routes is consistent and clear.

test/e2e/End2EndSpec.scala (1)

65-65: Rename aligns with PR objective.

The call to ZipIO.unzipToDirectory is consistent with the rename from “folder” to “directory” mandated by this PR. Good to see that the method usage here now matches the updated utility methods.

util/src/main/scala/com/scalableminds/util/io/ZipIO.scala (2)

278-283: Maintains consistent naming and behavior.

Renaming unzipToFolder to unzipToDirectory clarifies the intended usage for filesystem paths, fully aligning with the PR’s goal of distinguishing “directories” from “folders” in the codebase. The parameters remain the same, so existing logic is preserved.


287-292: Implementation carried over correctly.

The new unzipToDirectory method (accepting a ZipFile instead of a File) mirrors the logic of the earlier approach while improving naming clarity. Good job streamlining the code.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/MappingService.scala (1)

23-23: Consistent Renaming of Base Directory

Switching from config.Datastore.baseFolder to config.Datastore.baseDirectory aligns perfectly with the PR objective of unifying terminology. The logic remains unchanged and should continue functioning without issue.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/controllers/ExportsController.scala (1)

34-34: Terminology Change: Base Directory

Renaming this variable to baseDirectory from baseFolder is consistent with the unified naming approach introduced in this PR. This change is straightforward and fully aligns with the updated configuration property.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataServiceHolder.scala (1)

45-45: Unified Terminology for Directory Handling

Using config.Datastore.baseDirectory clarifies that the path refers to a filesystem directory rather than a dashboard folder. This renaming completes the move toward consistent vocabulary across the codebase.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/DataStoreConfig.scala (1)

23-24: Property Renaming in Configuration

Renaming baseFolder to baseDirectory and localFolderWhitelist to localDirectoryWhitelist reflects the new convention for directory paths and ensures consistency throughout the configuration options. These changes meet the PR goal of distinguishing between dashboard folders and filesystem directories.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (1)

44-44: This renaming aligns well with the new terminology.

The update from baseFolder to baseDirectory is consistent with the broader PR objective and aids in clarity. No concerns here.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/MeshFileService.scala (1)

180-180: Terminology update looks good.

Changing baseFolder to baseDirectory is consistent with the codebase-wide renaming strategy.

webknossos-datastore/conf/com.scalableminds.webknossos.datastore.routes (12)

19-20: Consistently renamed routes for Zarr data source.

The new naming convention aligns with the PR objectives of distinguishing “directory” from “folder.” This appears consistent and helps avoid confusion in the future.


23-24: Good consistency for data layer route.

Renaming to requestDataLayerDirectoryContents clarifies that it deals with filesystem directories, in line with the broader refactor.


27-28: Maintains naming clarity for magnification sub-routes.

Renaming to requestDataLayerMagDirectoryContents is consistent with the rest of the changes. No concerns noted.


32-33: Private link route rename is clear.

Renaming to dataSourceDirectoryContentsPrivateLink mirrors the underlying concept of directories vs. folders. Well done.


36-37: Consistent rename for data layer directory in private link.

The route name properly reflects the usage of “directory.” This also ensures uniform usage of new terminology across private link endpoints.


40-41: Magnification route naming is aligned with “directory” refactor.

Using “Directory” in dataLayerMagDirectoryContentsPrivateLink clarifies the endpoint’s nature.


46-47: Zarr 3 experimental routes maintain consistent directory naming.

Adoption of directory-based terminology remains consistent with the rest of the file. Looks good.


49-50: Renamed data layer routes for zarr3_experimental.

This continues the “directory” nomenclature. The approach is uniform across all zarr versions.


52-53: Magnification sub-routes remain consistent.

No additional caveats; changes match the standardized naming practice introduced by the PR.


57-58: Annotations in zarr3_experimental also follow directory pattern.

“Directory” is used consistently here; no references to “folder” remain.


60-61: Data layer directory naming in annotations.

No issues spotted. The rename is coherent with the approach used in the other zarr endpoints.


63-64: Magnification route for zarr3_experimental is updated to directory naming.

The naming scheme remains consistent throughout. Nice job.

MIGRATIONS.unreleased.md (1)

12-12: Clear rename of datastore configuration options.

Renaming datastore.baseFolder → datastore.baseDirectory and localFolderWhitelist → localDirectoryWhitelist aligns perfectly with the new convention. Make sure to update any references or environment variable docs to avoid confusion.
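For deployments with their own config files, a small check for leftover old keys can help during migration. The sketch below is an assumption-laden illustration, not part of the PR: it does a plain substring search on config text using the two renames named above, and the object and method names are hypothetical.

```scala
object DeprecatedConfigKeys {
  // Old key -> new key, as named in this PR.
  val renames: Map[String, String] = Map(
    "baseFolder" -> "baseDirectory",
    "localFolderWhitelist" -> "localDirectoryWhitelist"
  )

  // Returns one hint per deprecated key that still occurs in the given config text.
  // This is a plain substring search, so it also reports keys mentioned in comments.
  def findDeprecated(configText: String): Seq[String] =
    renames.toSeq.sortBy(_._1).collect {
      case (oldKey, newKey) if configText.contains(oldKey) => s"$oldKey -> $newKey"
    }
}
```

An empty result means neither old key occurs in the text; a non-empty result lists the renames that still need to be applied.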

Comment on lines +107 to +108
bool2Fox(dataStoreConfig.Datastore.localDirectoryWhitelist.exists(whitelistEntry =>
uri.getPath.startsWith(whitelistEntry))) ?~> s"Absolute path ${uri.getPath} in local file system is not in path whitelist. Consider adding it to datastore.localDirectoryWhitelist"
⚠️ Potential issue

Validate using canonical path checks instead of simple startsWith.

Relying on String.startsWith may introduce a security loophole: a whitelist entry /allowed also matches /allowed_evil. To avoid such prefix-based false positives, consider canonicalizing both the target path and the whitelist entries with toRealPath() and comparing them as Path objects, since Path.startsWith matches whole path components rather than characters. Note that toRealPath() throws an IOException if the path does not exist.

Apply this diff to implement canonical path checks (requires import java.nio.file.Paths):

- bool2Fox(dataStoreConfig.Datastore.localDirectoryWhitelist.exists(whitelistEntry =>
-   uri.getPath.startsWith(whitelistEntry))) ?~> s"Absolute path ${uri.getPath} ...
+ // Path.startsWith compares whole path components, unlike String.startsWith
+ val localCanonicalPath = Paths.get(uri.getPath).toRealPath()
+ bool2Fox(dataStoreConfig.Datastore.localDirectoryWhitelist.exists(whitelistEntry =>
+   localCanonicalPath.startsWith(Paths.get(whitelistEntry).toRealPath()))) ?~>
+   s"Absolute path $localCanonicalPath ...

Committable suggestion skipped: line range outside the PR's diff.
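The difference between a string prefix check and a path component check can be illustrated with a small standalone sketch. The helper below is hypothetical and not part of the PR. As an assumption for self-containment it uses toAbsolutePath.normalize(), which resolves "." and ".." lexically without touching the filesystem, instead of toRealPath(), which additionally resolves symlinks but requires the path to exist.

```scala
import java.nio.file.{Path, Paths}

object WhitelistCheck {
  // Hypothetical helper: true if `candidate` lies inside one of the whitelisted directories.
  def isWhitelisted(candidate: String, whitelist: Seq[String]): Boolean = {
    val candidatePath: Path = Paths.get(candidate).toAbsolutePath.normalize()
    whitelist.exists { entry =>
      // Path.startsWith compares whole name elements,
      // so /allowed does not match /allowed_evil.
      candidatePath.startsWith(Paths.get(entry).toAbsolutePath.normalize())
    }
  }
}
```

With a whitelist of Seq("/allowed") on a Unix-like filesystem, /allowed/data/ds1 is accepted while /allowed_evil/ds1 and /allowed/../etc/passwd are rejected, whereas the plain string check "/allowed_evil/ds1".startsWith("/allowed") would return true.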

@frcroth frcroth left a comment

LGTM

@fm3 fm3 enabled auto-merge (squash) January 14, 2025 12:26
@fm3 fm3 merged commit 2704f0b into master Jan 14, 2025
3 checks passed
@fm3 fm3 deleted the folder-vs-directory branch January 14, 2025 12:35
@coderabbitai coderabbitai bot mentioned this pull request Jan 23, 2025
3 tasks
@coderabbitai coderabbitai bot mentioned this pull request Sep 12, 2025
7 tasks
Development

Successfully merging this pull request may close these issues.

Be strict about naming for Folder (only for in-db dashboard folders) vs Directory (for actual filesystem)
2 participants