fm3
Member

@fm3 fm3 commented Aug 6, 2025

Fixed a bug where hdf5 attachments explicitly listed in the datasource-properties.json were not readable if they had an absolute path but no file:// prefix.
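
For context, the failure can be reproduced with the JDK alone. The following is a minimal sketch, not the WEBKNOSSOS code: the object name SchemeDemo and the path /tmp/foo.h5 are made up for illustration, and it assumes Java 11+ (for Path.of) on a Unix-like file system.

```scala
import java.net.URI
import java.nio.file.Path
import scala.util.Try

object SchemeDemo {
  // An absolute path without a prefix is still a valid URI, but its scheme is null.
  val noScheme = new URI("/tmp/foo.h5")
  val withScheme = new URI("file:///tmp/foo.h5")

  // Path.of(URI) insists on a scheme and throws IllegalArgumentException without one.
  val direct: Try[Path] = Try(Path.of(noScheme))

  // Converting via the string form sidesteps the scheme requirement.
  val viaString: Path = Path.of(noScheme.toString)
  val viaUri: Path = Path.of(withScheme)
}
```

In this sketch, direct is a failure, while viaString and viaUri refer to the same file. That is the distinction the fix relies on.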

These URI/path/string/VaultPath/remoteSourceDescriptor conversions are no fun. We need to refactor this into a unified solution. #8762

Steps to test:

  • In your datasource-properties.json include an hdf5 agglomerate file with an absolute path that does not have the file:// prefix
  • Load data with this agglomerate file active; this should work.
  • Using zarr in the same way should also still work.

Issues:


  • Added changelog entry (create a $PR_NUMBER.md file in unreleased_changes or use ./tools/create-changelog-entry.py)
  • Considered common edge cases
  • Needs datastore update after deployment

@fm3 fm3 self-assigned this Aug 6, 2025
Contributor

coderabbitai bot commented Aug 6, 2025

📥 Commits

Reviewing files that changed from the base of the PR and between 7ec2c86 and d99793c.

📒 Files selected for processing (1)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mapping/Hdf5AgglomerateService.scala (2 hunks)
📝 Walkthrough

This change fixes a bug that prevented reading HDF5 attachments with absolute paths lacking the file:// URI scheme in the datasource-properties.json file. The update refines path handling logic to correctly recognize and process such paths, ensuring attachments are accessible regardless of URI scheme presence.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| LayerAttachment Path Handling: webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DatasetLayerAttachments.scala | Refined the localPath method to explicitly check for null URI schemes and convert such URIs to local file paths, ensuring correct handling of absolute paths without a file:// prefix. |
| Agglomerate File Path Resolution: webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mapping/Hdf5AgglomerateService.scala | Updated path resolution for "cumsum.json" to use the improved localPath logic, ensuring consistent and correct base path derivation for agglomerate file handling. |
| Documentation / Release Notes: unreleased_changes/8832.md | Added a summary documenting the bug fix for absolute path handling in HDF5 attachments, clarifying the nature and scope of the change. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Possibly related PRs

Suggested reviewers

  • frcroth

Poem

A bug with paths, both sly and sly,
Hid in URIs without a guide.
Now with schemes or none at all,
Each attachment answers the call.
HDF5s, come out and play—
The rabbit’s fixed your file today! 🐇✨


@fm3 fm3 marked this pull request as ready for review August 6, 2025 09:41
@fm3 fm3 requested a review from frcroth August 6, 2025 09:41
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 192cf21 and 7ec2c86.

📒 Files selected for processing (3)
  • unreleased_changes/8832.md (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DatasetLayerAttachments.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mapping/Hdf5AgglomerateService.scala (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: frcroth
PR: scalableminds/webknossos#8598
File: webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DatasetLayerAttachments.scala:89-95
Timestamp: 2025-06-02T09:49:51.047Z
Learning: In WebKnossos dataset layer attachments, multiple file types can safely use the same directory name (like "agglomerates") because the scanning logic filters by file extension. For example, AgglomerateFileInfo scans for .hdf5 files while CumsumFileInfo scans for .json files in the same "agglomerates" directory without interference.

Applied to files:

  • unreleased_changes/8832.md
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DatasetLayerAttachments.scala
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mapping/Hdf5AgglomerateService.scala
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests
🔇 Additional comments (2)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/mapping/Hdf5AgglomerateService.scala (1)

266-266: LGTM! Proper usage of the fixed localPath method.

The change correctly utilizes the updated localPath method from LayerAttachment to resolve the cumsum.json file path. This ensures consistent and correct path handling for HDF5 attachments, especially those with absolute paths lacking the file:// prefix.
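
How the cumsum.json location could follow from localPath might look roughly like the sketch below. It rests on assumptions: it supposes cumsum.json sits in the same directory as the agglomerate file, and localPathOf, cumsumPathFor, and the example path are hypothetical names that are not taken from Hdf5AgglomerateService.

```scala
import java.net.URI
import java.nio.file.Path

object CumsumPathSketch {
  // Hypothetical stand-in for LayerAttachment.localPath.
  def localPathOf(uri: URI): Path =
    if (uri.getScheme == null) Path.of(uri.toString)
    else Path.of(uri)

  // Assumption: cumsum.json is a sibling of the agglomerate file.
  // getParent returns null for a bare file name, so this expects a path with a directory part.
  def cumsumPathFor(agglomerateUri: URI): Path =
    localPathOf(agglomerateUri).getParent.resolve("cumsum.json")

  val fromPlain: Path = cumsumPathFor(new URI("/data/agglomerates/agglomerate.hdf5"))
  val fromFileUri: Path = cumsumPathFor(new URI("file:///data/agglomerates/agglomerate.hdf5"))
}
```

Under these assumptions, both spellings of the attachment path lead to the same cumsum.json location.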

unreleased_changes/8832.md (1)

1-2: LGTM! Clear and accurate changelog entry.

The description accurately captures the bug fix, providing users with a clear understanding of what was resolved. The entry appropriately explains that HDF5 attachments with absolute paths lacking the file:// prefix are now readable.

Comment on lines +49 to +53
if (path.getScheme == null) {
  Path.of(path.toString)
} else {
  Path.of(path)
}
Contributor

💡 Verification agent

🧩 Analysis chain

LGTM! The URI scheme handling fix looks correct.

The change properly addresses the bug by handling URIs with null schemes (absolute paths without file:// prefix) by converting them to strings before creating a Path. This maintains backward compatibility while fixing the reported issue.

Consider testing edge cases to ensure robustness:


🏁 Script executed:

#!/bin/bash
# Verify that the Path.of behavior works correctly for various URI formats
# Look for existing tests that might cover these scenarios

rg -A 10 -B 5 "localPath|Path\.of" --type scala

Length of output: 74815


Approve URI scheme handling in localPath; add unit tests for edge cases

The conditional correctly covers both “no scheme” URIs (absolute paths without file://) and preserves original behavior for file:// URIs, while rejecting remote schemes. To ensure full coverage, please add unit tests for the following scenarios in a new DatasetLayerAttachmentsSpec (or similar):

  • URI with null scheme (e.g. new URI("/absolute/path/data.h5")) → localPath returns Path.of(path.toString)
  • URI with explicit file scheme (e.g. new URI("file:///absolute/path/data.h5")) → localPath returns Path.of(uri)
  • URI with non-file scheme (e.g. new URI("http://example.com/data.h5")) → localPath throws an exception
  • (Optional) Relative URI literal (e.g. new URI("data/data.h5")) if you expect to support it

Example test stub:

"localPath" should {
  "handle null scheme URIs as absolute paths" in {
    val uri = new URI("/tmp/foo.h5")
    val attachment = LayerAttachment("foo", uri, LayerAttachmentDataformat.HDF5)
    attachment.localPath shouldBe Path.of(uri.toString)
  }
  "handle file:// URIs correctly" in {
    val uri = new URI("file:///tmp/foo.h5")
    val attachment = LayerAttachment("foo", uri, LayerAttachmentDataformat.HDF5)
    attachment.localPath shouldBe Path.of(uri)
  }
  "reject non-file schemes" in {
    val uri = new URI("http://example.com/foo.h5")
    val attachment = LayerAttachment("foo", uri, LayerAttachmentDataformat.HDF5)
    an [Exception] should be thrownBy attachment.localPath
  }
}
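
The stub above depends on LayerAttachment and ScalaTest. A dependency-free sketch of the same scenarios is shown below. Here localPathOf is a hypothetical stand-in that only mirrors the conditional under review, not the actual method, and the paths are illustrative. It assumes Java 11+ on a Unix-like file system.

```scala
import java.net.URI
import java.nio.file.Path
import scala.util.Try

object LocalPathSketch {
  // Hypothetical helper mirroring the reviewed conditional.
  def localPathOf(uri: URI): Path =
    if (uri.getScheme == null) Path.of(uri.toString)
    else Path.of(uri)

  // Null scheme: the string form is used as the path.
  val fromPlain: Path = localPathOf(new URI("/tmp/foo.h5"))
  // Explicit file scheme: Path.of(URI) handles the conversion.
  val fromFileUri: Path = localPathOf(new URI("file:///tmp/foo.h5"))
  // Non-file scheme: no file system provider is installed for http, so this fails.
  val fromHttp: Try[Path] = Try(localPathOf(new URI("http://example.com/foo.h5")))
  // Relative literal: also has a null scheme and yields a relative path.
  val fromRelative: Path = localPathOf(new URI("data/foo.h5"))
  // Percent-encoded characters are kept as-is, because URI.toString does not decode them.
  val fromEncoded: Path = localPathOf(new URI("/tmp/my%20file.h5"))
}
```

The last case is not part of the list above. It is included because the string-based branch does not decode escapes, which may or may not matter for the paths used in practice.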
🤖 Prompt for AI Agents
In
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/datasource/DatasetLayerAttachments.scala
around lines 49 to 53, the URI scheme handling logic is correct but lacks unit
test coverage. Add a new test suite named DatasetLayerAttachmentsSpec (or
similar) with tests covering: URIs with null scheme returning
Path.of(path.toString), URIs with explicit file scheme returning Path.of(uri),
and URIs with non-file schemes throwing an exception. Optionally, include tests
for relative URI literals if supported. Implement these tests to ensure all edge
cases are properly validated.

Contributor

@frcroth frcroth left a comment

Works for me

@fm3 fm3 enabled auto-merge (squash) August 6, 2025 09:58
@fm3 fm3 merged commit 1664d3c into master Aug 6, 2025
5 checks passed
@fm3 fm3 deleted the fix-attachment-path-without-scheme branch August 6, 2025 09:59