-
Notifications
You must be signed in to change notification settings - Fork 29
Resolve symbolic links to mags and store them in db #8350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis pull request introduces comprehensive changes to dataset path management and metadata handling in the WEBKNOSSOS application. The modifications span multiple components, including controllers, services, and database schemas. Key additions include new methods for retrieving and updating dataset paths, introducing case classes to represent dataset magnification information, and extending the database schema to support real paths and local data tracking. The changes enhance the system's ability to manage and track dataset metadata more precisely. Changes
Suggested labels
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🔭 Outside diff range comments (1)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala (1)
Line range hint
49-64
: Consider using canonical paths and typed exceptions for security and clarity.While this logic properly checks remote vs. local schemes, relying on
startsWith
for whitelisting can be susceptible to path traversal (e.g., embedded"../"
). Additionally, throwing rawException
can obscure the root cause. Use a canonical path check (e.g.,toRealPath
) and a specialized exception type to enhance security and clarity.
🧹 Nitpick comments (13)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (3)
91-105
: Enhance error logging for better debugging.While the error handling is good, the error messages could be more descriptive to aid in debugging.
Consider adding more context to error messages:
- logger.error(s"Failed to determine real paths for mags of $ds: $msg") + logger.error(s"Failed to determine real paths for mags of dataset ${ds.id}: $msg. This may affect path resolution.") - logger.error(s"Failed to determine real paths for mags of $ds") + logger.error(s"Failed to determine real paths for mags of dataset ${ds.id}. This may affect path resolution.")
153-158
: Add input validation and documentation.The method would benefit from input validation and clear documentation of its purpose and behavior.
Consider these improvements:
+ /** + * Resolves a potentially relative path against a base path. + * If the relative path is already absolute, returns it unchanged. + * Otherwise, resolves it against the base path and normalizes the result. + * + * @param basePath The base path to resolve against + * @param relativePath The path to resolve + * @return The resolved absolute path + * @throws NullPointerException if either path is null + */ private def resolveRelativePath(basePath: Path, relativePath: Path): Path = { + require(basePath != null, "basePath cannot be null") + require(relativePath != null, "relativePath cannot be null") if (relativePath.isAbsolute) { relativePath } else { basePath.resolve(relativePath).normalize().toAbsolutePath } }
91-158
: Consider implementing a dedicated PathResolver service.The new path resolution functionality is well-implemented but could benefit from being extracted into a dedicated service. This would:
- Improve separation of concerns
- Make the path resolution logic more testable
- Reduce the complexity of the DataSourceService class
- Make it easier to modify path resolution strategies in the future
Consider creating a new
PathResolverService
class to handle all path resolution logic, including symlink resolution and URI generation. This would make the codebase more maintainable and modular.webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala (2)
77-83
: Check for invalid local directories and handle them explicitly.If neither
localDirWithScalarMag
norlocalDirWithVec3Mag
exists, the code silently defaults tolocalDirWithVec3Mag
. Updating the logic or adding a warning/error for missing directories can avoid unintentional fallbacks and help debug user misconfiguration.
86-98
: Consolidate path resolution logic in a single place.
uriForMagLocator
callsresolveMagPath
and then applies an additionaltoAbsolutePath.toUri
if not remote. Consider directly returning an absolute path fromresolveMagPath
for consistency, or unifying their logic to avoid duplication and reduce potential path resolution discrepancies.webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala (1)
49-53
: Add error handling and logging around real path reporting.This method simply calls
reportRealPaths
without any additional handling for potential failures or partial updates. Consider adding logging or structured error handling to ensure visibility into path reporting issues.webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1)
40-50
: Document or validate new path-related fields.The new case classes
DataSourcePathInfo
andMagPathInfo
introduce important fields (e.g.,layerName
,mag
,path
,realPath
,hasLocalData
). Brief documentation or validation logic would help ensure correct usage across the codebase, especially formag
or path fields.app/controllers/WKRemoteDataStoreController.scala (1)
224-238
: Consider adding request logging and error details.The method implementation is correct but could benefit from additional logging to aid in debugging:
- Log the number of path infos being processed
- Include more details in the warning message for invalid JSON
Apply this diff to enhance logging:
def updatePaths(name: String, key: String): Action[JsValue] = Action.async(parse.json) { implicit request => dataStoreService.validateAccess(name, key) { _ => request.body.validate[List[DataSourcePathInfo]] match { case JsSuccess(infos, _) => + logger.info(s"Updating paths for ${infos.length} data sources from data store '$name'") for { _ <- datasetService.updateRealPaths(infos)(GlobalAccessContext) } yield { JsonOk } case e: JsError => - logger.warn("Data store reported invalid json for data source paths.") + logger.warn(s"Data store '$name' reported invalid json for data source paths: ${JsError.toJson(e)}") Fox.successful(JsonBadRequest(JsError.toJson(e))) } } }app/models/dataset/DatasetService.scala (1)
343-356
: Consider adding error logging for dataset not found case.The error handling is correct, but adding logging would help track issues in production.
Apply this diff to enhance error logging:
private def updateRealPath(pathInfo: DataSourcePathInfo)(implicit ctx: DBAccessContext): Fox[Unit] = if (pathInfo.magPathInfos.isEmpty) { Fox.successful(()) } else { val dataset = datasetDAO.findOneByDataSourceId(pathInfo.dataSourceId).futureBox dataset.flatMap { case Full(dataset) => datasetMagsDAO.updateMagPathsForDataset(dataset._id, pathInfo.magPathInfos) case Empty => // Dataset reported but ignored (non-existing/forbidden org) + logger.debug(s"Ignoring path update for non-existing dataset: ${pathInfo.dataSourceId}") Fox.successful(()) case e: EmptyBox => + logger.warn(s"Failed to find dataset: ${pathInfo.dataSourceId}", e) Fox.failure("dataset.notFound", e) } }app/controllers/DatasetController.scala (1)
137-149
: Consider adding pagination for large datasets.The method correctly implements the required functionality but might need pagination for datasets with many mags.
Apply this diff to add pagination:
-def getLinkedMags(datasetId: String, dataLayerName: String): Action[AnyContent] = +def getLinkedMags(datasetId: String, dataLayerName: String, limit: Option[Int], offset: Option[Int]): Action[AnyContent] = sil.SecuredAction.async { implicit request => for { datasetIdValidated <- ObjectId.fromString(datasetId) _ <- datasetDAO.findOne(datasetIdValidated) ?~> notFoundMessage(datasetId) ~> NOT_FOUND _ <- Fox.bool2Fox(request.identity.isAdmin) ?~> "notAllowed" ~> FORBIDDEN magsAndLinkedMags <- datasetService.getPathsForDatalayer(datasetIdValidated, dataLayerName) - returnValues = magsAndLinkedMags.map { + returnValues = magsAndLinkedMags.drop(offset.getOrElse(0)).take(limit.getOrElse(Int.MaxValue)).map { case (mag, linkedMags) => MagLinkInfo(mag, linkedMags) } } yield Ok(Json.toJson(returnValues)) }conf/webknossos.latest.routes (1)
94-94
: Update route documentation.The routes are correctly defined but should be documented in the API documentation.
Add comments above each route to describe their purpose:
+# Get linked magnifications for a dataset layer GET /datasets/:datasetId/layers/:layer/linkedMags controllers.DatasetController.getLinkedMags(datasetId: String, layer: String) +# Update data source paths in the datastore PUT /datastores/:name/datasources/paths controllers.WKRemoteDataStoreController.updatePaths(name: String, key: String)Also applies to: 112-112
app/models/dataset/Dataset.scala (1)
767-784
: Consider improving the handling of uninitialized paths.Instead of using a magic string "uninitialized" for null paths, consider:
- Using an Option type to represent potentially uninitialized paths
- Handling the null case at the application level where the context is clearer
- row.path.getOrElse("uninitialized"), - row.realpath.getOrElse("uninitialized"), + row.path.getOrElse(""), + row.realpath.getOrElse(""),conf/evolutions/126-mag-real-paths.sql (1)
1-12
: Well-structured SQL evolution with proper safeguards!The SQL evolution:
- Uses transaction for atomicity
- Includes schema version check
- Adds columns with appropriate types
- Sets sensible default for hasLocalData
However, consider adding NOT NULL constraints to path and realPath columns since they are required fields in the DatasetMagInfo case class.
-ALTER TABLE webknossos.dataset_mags ADD COLUMN realPath TEXT; -ALTER TABLE webknossos.dataset_mags ADD COLUMN path TEXT; +ALTER TABLE webknossos.dataset_mags ADD COLUMN realPath TEXT NOT NULL DEFAULT ''; +ALTER TABLE webknossos.dataset_mags ADD COLUMN path TEXT NOT NULL DEFAULT '';
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
MIGRATIONS.unreleased.md
(1 hunks)app/controllers/DatasetController.scala
(2 hunks)app/controllers/WKRemoteDataStoreController.scala
(2 hunks)app/models/dataset/Dataset.scala
(4 hunks)app/models/dataset/DatasetService.scala
(3 hunks)conf/evolutions/126-mag-real-paths.sql
(1 hunks)conf/evolutions/reversions/126-mag-real-paths.sql
(1 hunks)conf/webknossos.latest.routes
(2 hunks)tools/postgres/schema.sql
(2 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
(3 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala
(1 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
(3 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
🔇 Additional comments (16)
conf/evolutions/reversions/126-mag-real-paths.sql (3)
1-3
: LGTM! Transaction and schema version check are properly implemented.The transaction block ensures atomicity, and the schema version assertion provides a critical safety check against incorrect migrations.
9-11
: LGTM! Schema version update and transaction commit are correct.The schema version is properly decremented to 125, and the transaction is committed to ensure all changes are applied atomically.
5-7
: Verify the impact of dropping these columns.The column drops align with the PR objectives for managing symlinked mags. However, ensure that:
- Any existing data in these columns has been properly migrated or is safe to discard
- All application code references to these columns are removed
- The corresponding forward migration (126-mag-real-paths.sql) correctly creates these columns
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (3)
11-11
: LGTM: Required imports added for new functionality.The new imports for
MagLocator
andDataVaultService
align with the PR's objective to handle mag symlinks.Also applies to: 15-15
81-81
: LGTM: Path resolution integrated into inbox scanning.The addition of
sendRealPaths
call ensures that real paths are processed and published whenever inbox sources are scanned.
143-151
: LGTM: Clean and focused implementation.The method follows the single responsibility principle and clearly handles mag URI resolution.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/storage/RemoteSourceDescriptorService.scala (1)
65-70
: Raise an error if neither dataset nor layer path exists.Currently, if
magPathRelativeToDataset
does not exist, the code falls back tomagPathRelativeToLayer
. Consider raising an error or logging a warning if neither path is found, ensuring graceful handling of invalid file structures.app/controllers/WKRemoteDataStoreController.scala (1)
10-10
: LGTM!The import statement correctly includes the new
DataSourcePathInfo
type needed for the path update functionality.app/models/dataset/DatasetService.scala (2)
17-17
: LGTM!The imports are correctly organized and include all necessary types for the new functionality.
Also applies to: 23-23
357-361
: LGTM!The
updateRealPaths
method correctly serializes updates to maintain consistency.app/controllers/DatasetController.scala (1)
74-78
: LGTM!The case class and its JSON format are correctly defined.
app/models/dataset/Dataset.scala (3)
24-24
: LGTM!The import for
MagPathInfo
is correctly added to support the new mag path handling functionality.
712-721
: Well-structured data model for mag information!The
DatasetMagInfo
case class and its companion object are well-designed:
- Encapsulates all necessary fields for mag path information
- Provides JSON serialization support
- Aligns with the PR objectives for storing symbolic link information
748-765
: Robust implementation with proper transaction handling!The implementation includes:
- Proper SQL escaping and type handling
- Transaction with serializable isolation
- Retry logic for handling serialization failures
- Batch updates for efficiency
MIGRATIONS.unreleased.md (1)
21-22
: LGTM!The migration guide is correctly updated to include the new SQL evolution.
tools/postgres/schema.sql (1)
23-23
: LGTM!The schema changes:
- Correctly update the schema version
- Add the new columns matching the SQL evolution
- Set appropriate default for hasLocalData
Also applies to: 181-183
private def determineMagRealPathsForDataSource(dataSource: InboxDataSource) = tryo { | ||
val organizationPath = dataBaseDir.resolve(dataSource.id.organizationId) | ||
val datasetPath = organizationPath.resolve(dataSource.id.directoryName) | ||
dataSource.toUsable match { | ||
case Some(usableDataSource) => { | ||
usableDataSource.dataLayers.flatMap { dataLayer => | ||
val rawLayerPath = datasetPath.resolve(dataLayer.name) | ||
val absoluteLayerPath = if (Files.isSymbolicLink(rawLayerPath)) { | ||
resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath)) | ||
} else { | ||
rawLayerPath.toAbsolutePath | ||
} | ||
dataLayer.mags.map { mag => | ||
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | ||
if (isRemote) { | ||
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | ||
} else { | ||
val magPath = Paths.get(magURI) | ||
val realPath = magPath.toRealPath() | ||
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | ||
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | ||
val unresolvedPath = | ||
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | ||
MagPathInfo(dataLayer.name, | ||
mag.mag, | ||
unresolvedPath.toUri.toString, | ||
realPath.toUri.toString, | ||
hasLocalData = isLocal) | ||
} | ||
} | ||
} | ||
} | ||
case None => List() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for symlink resolution and improve code structure.
The method handles complex path resolution logic but has some areas for improvement:
- Missing error handling for symlink resolution
- Complex nested structure could be simplified
- Missing documentation for the business logic
Consider these improvements:
private def determineMagRealPathsForDataSource(dataSource: InboxDataSource) = tryo {
+ // Resolves paths for each mag in the data source, handling both local and remote data
+ // Returns a list of MagPathInfo containing the original and resolved paths
val organizationPath = dataBaseDir.resolve(dataSource.id.organizationId)
val datasetPath = organizationPath.resolve(dataSource.id.directoryName)
dataSource.toUsable match {
case Some(usableDataSource) => {
usableDataSource.dataLayers.flatMap { dataLayer =>
val rawLayerPath = datasetPath.resolve(dataLayer.name)
- val absoluteLayerPath = if (Files.isSymbolicLink(rawLayerPath)) {
- resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath))
- } else {
- rawLayerPath.toAbsolutePath
- }
+ val absoluteLayerPath = try {
+ if (Files.isSymbolicLink(rawLayerPath)) {
+ resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath))
+ } else {
+ rawLayerPath.toAbsolutePath
+ }
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to resolve layer path for ${dataLayer.name}: ${e.getMessage}")
+ rawLayerPath.toAbsolutePath
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
private def determineMagRealPathsForDataSource(dataSource: InboxDataSource) = tryo { | |
val organizationPath = dataBaseDir.resolve(dataSource.id.organizationId) | |
val datasetPath = organizationPath.resolve(dataSource.id.directoryName) | |
dataSource.toUsable match { | |
case Some(usableDataSource) => { | |
usableDataSource.dataLayers.flatMap { dataLayer => | |
val rawLayerPath = datasetPath.resolve(dataLayer.name) | |
val absoluteLayerPath = if (Files.isSymbolicLink(rawLayerPath)) { | |
resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath)) | |
} else { | |
rawLayerPath.toAbsolutePath | |
} | |
dataLayer.mags.map { mag => | |
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | |
if (isRemote) { | |
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | |
} else { | |
val magPath = Paths.get(magURI) | |
val realPath = magPath.toRealPath() | |
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | |
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | |
val unresolvedPath = | |
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | |
MagPathInfo(dataLayer.name, | |
mag.mag, | |
unresolvedPath.toUri.toString, | |
realPath.toUri.toString, | |
hasLocalData = isLocal) | |
} | |
} | |
} | |
} | |
case None => List() | |
} | |
} | |
private def determineMagRealPathsForDataSource(dataSource: InboxDataSource) = tryo { | |
// Resolves paths for each mag in the data source, handling both local and remote data | |
// Returns a list of MagPathInfo containing the original and resolved paths | |
val organizationPath = dataBaseDir.resolve(dataSource.id.organizationId) | |
val datasetPath = organizationPath.resolve(dataSource.id.directoryName) | |
dataSource.toUsable match { | |
case Some(usableDataSource) => { | |
usableDataSource.dataLayers.flatMap { dataLayer => | |
val rawLayerPath = datasetPath.resolve(dataLayer.name) | |
val absoluteLayerPath = try { | |
if (Files.isSymbolicLink(rawLayerPath)) { | |
resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath)) | |
} else { | |
rawLayerPath.toAbsolutePath | |
} | |
} catch { | |
case e: Exception => | |
logger.error(s"Failed to resolve layer path for ${dataLayer.name}: ${e.getMessage}") | |
rawLayerPath.toAbsolutePath | |
} | |
dataLayer.mags.map { mag => | |
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | |
if (isRemote) { | |
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | |
} else { | |
val magPath = Paths.get(magURI) | |
val realPath = magPath.toRealPath() | |
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | |
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | |
val unresolvedPath = | |
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | |
MagPathInfo(dataLayer.name, | |
mag.mag, | |
unresolvedPath.toUri.toString, | |
realPath.toUri.toString, | |
hasLocalData = isLocal) | |
} | |
} | |
} | |
} | |
case None => List() | |
} | |
} |
def reportRealPaths(dataSourcePaths: List[DataSourcePathInfo]): Fox[_] = | ||
rpc(s"$webknossosUri/api/datastores/$dataStoreName/datasources/paths") | ||
.addQueryString("key" -> dataStoreKey) | ||
.silent | ||
.putJson(dataSourcePaths) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider explicit result type and user feedback.
reportRealPaths
returns Fox[_]
and uses .silent
. In case of partial or complete failure, there is no user-facing feedback or logs. Specifying a clearer return type (e.g., Fox[ReportPathsResult]
) and/or logging errors can facilitate troubleshooting.
def getPathsForDatalayer(datasetId: ObjectId, layerName: String): Fox[List[(DatasetMagInfo, List[DatasetMagInfo])]] = | ||
for { | ||
magInfos <- datasetMagsDAO.findPathsForDatasetAndDatalayer(datasetId, layerName) | ||
magInfosAndLinkedMags <- Fox.serialCombined(magInfos)(magInfo => | ||
for { | ||
pathInfos <- datasetMagsDAO.findAllByRealPath(magInfo.realPath) | ||
} yield (magInfo, pathInfos.filter(!_.equals(magInfo)))) | ||
|
||
} yield magInfosAndLinkedMags | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider adding input validation for layer name.
The method should validate the layer name before querying the database.
Apply this diff to add validation:
def getPathsForDatalayer(datasetId: ObjectId, layerName: String): Fox[List[(DatasetMagInfo, List[DatasetMagInfo])]] =
for {
+ _ <- assertValidLayerNameLax(layerName)
magInfos <- datasetMagsDAO.findPathsForDatasetAndDatalayer(datasetId, layerName)
magInfosAndLinkedMags <- Fox.serialCombined(magInfos)(magInfo =>
for {
pathInfos <- datasetMagsDAO.findAllByRealPath(magInfo.realPath)
} yield (magInfo, pathInfos.filter(!_.equals(magInfo))))
} yield magInfosAndLinkedMags
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/models/dataset/Dataset.scala (1)
765-782
: Remove unnecessary empty Fox.successful.The empty
Fox.successful(())
on line 767 is unnecessary and can be removed.def updateMagPathsForDataset(datasetId: ObjectId, magPaths: List[MagPathInfo]): Fox[Unit] = for { - _ <- Fox.successful(()) updateQueries = magPaths.map(pathInfo => {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/models/dataset/Dataset.scala
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
🔇 Additional comments (4)
app/models/dataset/Dataset.scala (4)
712-721
: LGTM! Well-structured case class implementation.The
DatasetMagInfo
case class and its companion object are well-designed, providing a clean representation of mag path metadata with proper JSON serialization support. The implementation aligns perfectly with the PR objectives for storing mag paths.
736-738
: Replace "uninitialized" with empty string as default value.Using "uninitialized" as a default value for paths is not ideal. Consider using an empty string instead, as suggested in previous reviews.
Apply this diff:
- row.path.getOrElse("uninitialized"), - row.realpath.getOrElse("uninitialized"), + row.path.getOrElse(""), + row.realpath.getOrElse(""),
784-794
: LGTM! Well-implemented query and mapping logic.The method correctly retrieves and maps mag paths for a dataset and layer, with proper error handling using Fox combinators.
795-810
: Replace "uninitialized" with empty string and consider adding an index.
- Replace "uninitialized" with empty string as default value.
- Consider adding an index on the
realPath
column since we're querying by it.Apply this diff for the default values:
- row.path.getOrElse("uninitialized"), - row.realpath.getOrElse("uninitialized"), + row.path.getOrElse(""), + row.realpath.getOrElse(""),Consider adding the following index to improve query performance:
CREATE INDEX IF NOT EXISTS dataset_mags_realpath_idx ON webknossos.dataset_mags(realPath);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/controllers/DatasetController.scala (1)
135-145
: Consider production readiness improvements.While the implementation is functionally correct, consider the following improvements for production use:
- The admin-only access might be too restrictive for testing purposes. Consider allowing dataset owners or managers to access this endpoint.
- Add pagination support for potentially large result sets.
- Consider implementing caching to improve performance for frequently accessed datasets.
Here's a suggested implementation with pagination:
-def getLinkedMags(datasetId: ObjectId, dataLayerName: String): Action[AnyContent] = +def getLinkedMags( + datasetId: ObjectId, + dataLayerName: String, + offset: Option[Int] = Some(0), + limit: Option[Int] = Some(100) +): Action[AnyContent] = sil.SecuredAction.async { implicit request => for { _ <- datasetDAO.findOne(datasetId) ?~> notFoundMessage(datasetId) ~> NOT_FOUND - _ <- Fox.bool2Fox(request.identity.isAdmin) ?~> "notAllowed" ~> FORBIDDEN + _ <- Fox.bool2Fox( + request.identity.isAdmin || + datasetService.isEditableBy(dataset, Some(request.identity)) + ) ?~> "notAllowed" ~> FORBIDDEN magsAndLinkedMags <- datasetService.getPathsForDatalayer(datasetId, dataLayerName) - returnValues = magsAndLinkedMags.map { + paginatedMags = magsAndLinkedMags + .drop(offset.getOrElse(0)) + .take(limit.getOrElse(100)) + returnValues = paginatedMags.map { case (mag, linkedMags) => MagLinkInfo(mag, linkedMags) } - } yield Ok(Json.toJson(returnValues)) + } yield Ok(Json.obj( + "items" -> returnValues, + "total" -> magsAndLinkedMags.size, + "offset" -> offset.getOrElse(0), + "limit" -> limit.getOrElse(100) + )) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
test/db/dataSet_mags.csv
is excluded by!**/*.csv
📒 Files selected for processing (3)
app/controllers/DatasetController.scala
(2 hunks)app/controllers/WKRemoteDataStoreController.scala
(2 hunks)conf/webknossos.latest.routes
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- conf/webknossos.latest.routes
- app/controllers/WKRemoteDataStoreController.scala
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
🔇 Additional comments (1)
app/controllers/DatasetController.scala (1)
74-78
: LGTM!The case class and companion object are well-structured and follow Scala best practices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! My first tests worked fine :) I think the new data structures make sense.
I’m not sure if we should maybe combine the two report requests into one, to avoid this uninitialized state for new datasets, but I guess having two is also fine for the moment.
I added a couple of code suggestions.
for { | ||
_ <- Fox.successful(()) | ||
_ <- remoteWebknossosClient.reportRealPaths(infos) | ||
} yield () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for { | |
_ <- Fox.successful(()) | |
_ <- remoteWebknossosClient.reportRealPaths(infos) | |
} yield () | |
remoteWebknossosClient.reportRealPaths(infos) |
(I would assume that this is identical?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if these infos don’t live in the DataSourceRepository, I would skip this forwarding function entirely and rather call remoteWebknossosClient.reportRealPaths(infos)
directly from the DataSourceService
.
foundInboxSources = organizationDirs.flatMap(teamAwareInboxSources) | ||
_ = logFoundDatasources(foundInboxSources, verbose) | ||
_ <- dataSourceRepository.updateDataSources(foundInboxSources) | ||
_ <- sendRealPaths(foundInboxSources) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw report, send and publish now. I’d suggest to stick with “report”
case e: JsError => | ||
logger.warn("Data store reported invalid json for data source paths.") | ||
Fox.successful(JsonBadRequest(JsError.toJson(e))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason to validate explicitly and log errors here, rather than using a playBodyParser to build an Action[List[DataSourcePathInfo]]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that was just copied from the function above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should this be changed for all methods in this controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did it for updateAll in #8402 – but I’d say it’s not super important to update this now. Both is valid. The older variant has the benefit of controller-side logging (but I don’t think I ever saw this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #8406 for that.
app/models/dataset/Dataset.scala
Outdated
def updateMagPathsForDataset(datasetId: ObjectId, magPaths: List[MagPathInfo]): Fox[Unit] = | ||
for { | ||
_ <- Fox.successful(()) | ||
updateQueries = magPaths.map(pathInfo => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updateQueries = magPaths.map(pathInfo => { | |
updateQueries = magPathInfos.map(magPathInfo => { |
(Seems cleaner to me if the variable names match, and also match the type. Or, for the inner-most, use an obvious abbreviation like m. That way the reader doesn’t have to understand that these things are all the same)
app/models/dataset/Dataset.scala
Outdated
q"""UPDATE webknossos.dataset_mags | ||
SET path = ${pathInfo.path}, realPath = ${pathInfo.realPath}, haslocaldata = ${pathInfo.hasLocalData} | ||
WHERE _dataset = $datasetId | ||
AND datalayername = ${pathInfo.layerName} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please follow capitalization from schema.sql here, dataLayerName and hasLocalData
app/models/dataset/Dataset.scala
Outdated
row.path.getOrElse("uninitialized"), | ||
row.realpath.getOrElse("uninitialized"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use None for this case? I feel like having a custom string literal in the json could cause headaches down the line. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this to avoid having options in the DatasetMagInfo, since this is only a theoretical case either way. The path and realPath are set when the dataset is reported, so it would only be empty between the two report functions that are separate at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn’t call it a theoretical case, since there is a real time difference between the two report functions. That means that this value will, at least for short durations, occur in the database and downstream functions will have to be able to handle it. If a function lists all datasets that use the same layer, anad does string comparison, this would return all that have "uninitialized", which could become a nonsensical list. I’d rather have the Option if we cannot prevent that this value is ever inserted.
case Full(dataset) => datasetMagsDAO.updateMagPathsForDataset(dataset._id, pathInfo.magPathInfos) | ||
case Empty => // Dataset reported but ignored (non-existing/forbidden org) | ||
Fox.successful(()) | ||
case e: EmptyBox => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, TIL about the EmptyBox trait. Looks like we should have used this in a lot of places where we match against Failure (but miss ParamFailure?). But I guess we instantiate so few ParamFailures that this hasn’t caused any real problems.
_ <- Fox.serialCombined(pathInfos)(updateRealPath) | ||
} yield () | ||
|
||
def getPathsForDatalayer(datasetId: ObjectId, layerName: String): Fox[List[(DatasetMagInfo, List[DatasetMagInfo])]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use capital L in DataLayer/dataLayer for consistency. There are a couple of usages.
pathInfos = magPathBoxes.map { | ||
case (ds, Full(magPaths)) => DataSourcePathInfo(ds.id, magPaths) | ||
case (ds, Failure(msg, _, _)) => | ||
logger.error(s"Failed to determine real paths for mags of $ds: $msg") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use formatFailureChain from the Formatter trait. Have a look at the other usages.
} | ||
case None => List() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function has grown pretty complex, maybe some of its logic could be extracted to subfunctions? That could make it easier to read. A candidate might be the block handling one layer, and also the block handling one mag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
app/models/dataset/DatasetService.scala (1)
343-355
: Consider improving error handling and logging.The error handling in
updateRealPath
could be enhanced:
- The empty case could benefit from debug logging
- The failure case could include more context in the error message
Apply this diff to improve error handling:
private def updateRealPath(pathInfo: DataSourcePathInfo)(implicit ctx: DBAccessContext): Fox[Unit] = if (pathInfo.magPathInfos.isEmpty) { + logger.debug(s"No mag paths to update for dataset ${pathInfo.dataSourceId}") Fox.successful(()) } else { val dataset = datasetDAO.findOneByDataSourceId(pathInfo.dataSourceId).futureBox dataset.flatMap { case Full(dataset) => datasetMagsDAO.updateMagPathsForDataset(dataset._id, pathInfo.magPathInfos) case Empty => // Dataset reported but ignored (non-existing/forbidden org) + logger.debug(s"Skipping dataset ${pathInfo.dataSourceId} (non-existing/forbidden org)") Fox.successful(()) case e: EmptyBox => - Fox.failure("dataset.notFound", e) + Fox.failure(s"Failed to find dataset ${pathInfo.dataSourceId}", e) } }webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (2)
94-108
: Consider improving error handling and logging for path reporting.The error handling in
reportRealPaths
could be enhanced:
- Add debug logging for successful path reporting
- Improve error message formatting
Apply this diff to improve error handling:
private def reportRealPaths(dataSources: List[InboxDataSource]) = for { _ <- Fox.successful(()) magPathBoxes = dataSources.map(ds => (ds, determineMagRealPathsForDataSource(ds))) pathInfos = magPathBoxes.map { case (ds, Full(magPaths)) => + logger.debug(s"Successfully determined paths for ${ds.id}") DataSourcePathInfo(ds.id, magPaths) case (ds, failure: Failure) => - logger.error(formatFailureChain(failure)) + logger.error(s"Failed to determine paths for ${ds.id}: ${formatFailureChain(failure)}") DataSourcePathInfo(ds.id, List()) case (ds, Empty) => - logger.error(s"Failed to determine real paths for mags of $ds") + logger.error(s"Empty result while determining paths for ${ds.id}") DataSourcePathInfo(ds.id, List()) } _ <- remoteWebknossosClient.reportRealPaths(pathInfos) } yield ()
110-128
: Extract complex path resolution logic into subfunctions.The method has grown complex and could benefit from being split into smaller, focused functions.
Apply this diff to improve code organization:
+ private def resolveLayerPath(datasetPath: Path, dataLayer: DataLayer): Path = { + val rawLayerPath = datasetPath.resolve(dataLayer.name) + if (Files.isSymbolicLink(rawLayerPath)) { + resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath)) + } else { + rawLayerPath.toAbsolutePath + } + } + private def determineMagRealPathsForDataSource(dataSource: InboxDataSource) = tryo { val organizationPath = dataBaseDir.resolve(dataSource.id.organizationId) val datasetPath = organizationPath.resolve(dataSource.id.directoryName) dataSource.toUsable match { case Some(usableDataSource) => usableDataSource.dataLayers.flatMap { dataLayer => - val rawLayerPath = datasetPath.resolve(dataLayer.name) - val absoluteLayerPath = if (Files.isSymbolicLink(rawLayerPath)) { - resolveRelativePath(datasetPath, Files.readSymbolicLink(rawLayerPath)) - } else { - rawLayerPath.toAbsolutePath - } + val absoluteLayerPath = resolveLayerPath(datasetPath, dataLayer) dataLayer.mags.map { mag => getMagPathInfo(datasetPath, absoluteLayerPath, rawLayerPath, dataLayer, mag) } } case None => List() } }app/models/dataset/Dataset.scala (1)
722-739
: Consider using None for uninitialized paths.Using string literals for uninitialized paths could cause issues. Consider using
Option[String]
instead.Apply this diff to improve path handling:
- row.path.getOrElse("uninitialized"), - row.realpath.getOrElse("uninitialized"), + row.path.getOrElse(""), + row.realpath.getOrElse(""),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
MIGRATIONS.unreleased.md
(1 hunks)app/controllers/DatasetController.scala
(2 hunks)app/models/dataset/Dataset.scala
(4 hunks)app/models/dataset/DatasetService.scala
(3 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- MIGRATIONS.unreleased.md
- app/controllers/DatasetController.scala
🔇 Additional comments (10)
app/models/dataset/DatasetService.scala (4)
17-17
: LGTM!The imports are correctly organized and necessary for the new functionality.
Also applies to: 23-23
37-37
: LGTM!The constructor dependency is correctly added and follows the existing pattern.
357-360
: LGTM!The
updateRealPaths
method correctly handles batch processing of path updates usingFox.serialCombined
.
362-370
: Add input validation for layer name.The method should validate the layer name before querying the database.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala (3)
8-8
: LGTM!The imports and trait mixins are correctly organized and necessary for the new functionality.
Also applies to: 12-12, 40-41
34-34
: LGTM!The constructor dependency is correctly added and follows the existing pattern.
153-161
: LGTM!The utility methods
getMagURI
andresolveRelativePath
are well-implemented and handle their specific tasks correctly.Also applies to: 163-168
app/models/dataset/Dataset.scala (3)
712-721
: LGTM!The
DatasetMagInfo
case class is well-designed and includes all necessary fields with proper JSON formatting.
765-782
: LGTM!The
updateMagPathsForDataset
method correctly handles path updates with proper transaction isolation and retry logic.
784-794
: LGTM!The path query methods are well-implemented with proper error handling and data transformation.
Also applies to: 795-811
private def getMagPathInfo(datasetPath: Path, | ||
absoluteLayerPath: Path, | ||
rawLayerPath: Path, | ||
dataLayer: DataLayer, | ||
mag: MagLocator) = { | ||
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | ||
if (isRemote) { | ||
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | ||
} else { | ||
val magPath = Paths.get(magURI) | ||
val realPath = magPath.toRealPath() | ||
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | ||
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | ||
val unresolvedPath = | ||
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | ||
MagPathInfo(dataLayer.name, | ||
mag.mag, | ||
unresolvedPath.toUri.toString, | ||
realPath.toUri.toString, | ||
hasLocalData = isLocal) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for path resolution failures.
The getMagPathInfo
method should handle potential failures in path resolution.
Apply this diff to add error handling:
private def getMagPathInfo(datasetPath: Path,
absoluteLayerPath: Path,
rawLayerPath: Path,
dataLayer: DataLayer,
mag: MagLocator) = {
- val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag)
- if (isRemote) {
- MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false)
- } else {
- val magPath = Paths.get(magURI)
- val realPath = magPath.toRealPath()
- val isLocal = realPath.startsWith(datasetPath.toAbsolutePath)
- val unresolvedPath =
- rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize()
- MagPathInfo(dataLayer.name,
- mag.mag,
- unresolvedPath.toUri.toString,
- realPath.toUri.toString,
- hasLocalData = isLocal)
+ try {
+ val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag)
+ if (isRemote) {
+ MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false)
+ } else {
+ val magPath = Paths.get(magURI)
+ val realPath = magPath.toRealPath()
+ val isLocal = realPath.startsWith(datasetPath.toAbsolutePath)
+ val unresolvedPath =
+ rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize()
+ MagPathInfo(dataLayer.name,
+ mag.mag,
+ unresolvedPath.toUri.toString,
+ realPath.toUri.toString,
+ hasLocalData = isLocal)
+ }
+ } catch {
+ case e: Exception =>
+ logger.error(s"Failed to resolve path for mag ${mag.mag} in layer ${dataLayer.name}: ${e.getMessage}")
+ throw e
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
private def getMagPathInfo(datasetPath: Path, | |
absoluteLayerPath: Path, | |
rawLayerPath: Path, | |
dataLayer: DataLayer, | |
mag: MagLocator) = { | |
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | |
if (isRemote) { | |
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | |
} else { | |
val magPath = Paths.get(magURI) | |
val realPath = magPath.toRealPath() | |
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | |
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | |
val unresolvedPath = | |
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | |
MagPathInfo(dataLayer.name, | |
mag.mag, | |
unresolvedPath.toUri.toString, | |
realPath.toUri.toString, | |
hasLocalData = isLocal) | |
} | |
} | |
private def getMagPathInfo(datasetPath: Path, | |
absoluteLayerPath: Path, | |
rawLayerPath: Path, | |
dataLayer: DataLayer, | |
mag: MagLocator) = { | |
try { | |
val (magURI, isRemote) = getMagURI(datasetPath, absoluteLayerPath, mag) | |
if (isRemote) { | |
MagPathInfo(dataLayer.name, mag.mag, magURI.toString, magURI.toString, hasLocalData = false) | |
} else { | |
val magPath = Paths.get(magURI) | |
val realPath = magPath.toRealPath() | |
// Does this dataset have local data, i.e. the data that is referenced by the mag path is within the dataset directory | |
val isLocal = realPath.startsWith(datasetPath.toAbsolutePath) | |
val unresolvedPath = | |
rawLayerPath.toAbsolutePath.resolve(absoluteLayerPath.relativize(magPath)).normalize() | |
MagPathInfo(dataLayer.name, | |
mag.mag, | |
unresolvedPath.toUri.toString, | |
realPath.toUri.toString, | |
hasLocalData = isLocal) | |
} | |
} catch { | |
case e: Exception => | |
logger.error(s"Failed to resolve path for mag ${mag.mag} in layer ${dataLayer.name}: ${e.getMessage}") | |
throw e | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
app/models/dataset/DatasetService.scala (1)
357-360
: Consider adding batch size limit for path updates.While the implementation is correct, processing a large number of path updates sequentially could impact performance.
def updateRealPaths(pathInfos: List[DataSourcePathInfo])(implicit ctx: DBAccessContext): Fox[Unit] = for { - _ <- Fox.serialCombined(pathInfos)(updateRealPath) + _ <- Fox.serialCombined(pathInfos.grouped(100).toList)(batch => + Fox.serialCombined(batch)(updateRealPath)) } yield ()app/models/dataset/Dataset.scala (1)
766-783
: Consider adding validation for mag values.While the implementation handles the database operations correctly, it might be good to add validation for mag values before updating.
def updateMagPathsForDataset(datasetId: ObjectId, magPathInfos: List[MagPathInfo]): Fox[Unit] = for { - _ <- Fox.successful(()) + _ <- Fox.serialCombined(magPathInfos)(magPathInfo => + bool2Fox(magPathInfo.mag.x >= 0 && magPathInfo.mag.y >= 0 && magPathInfo.mag.z >= 0) ?~> "Invalid mag values") updateQueries = magPathInfos.map(magPathInfo => {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
app/models/dataset/Dataset.scala
(4 hunks)app/models/dataset/DatasetService.scala
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
🔇 Additional comments (4)
app/models/dataset/DatasetService.scala (2)
343-355
: LGTM! Good error handling for dataset path updates.The implementation properly handles different cases:
- Empty magPathInfos
- Non-existing datasets
- Error conditions with EmptyBox
362-373
: LGTM! Robust implementation of path retrieval.The implementation correctly:
- Handles both linked and unlinked mags
- Filters out self-references in linked mags
- Properly handles None case for realPath
app/models/dataset/Dataset.scala (2)
712-727
: LGTM! Well-structured data models for mag information.The case classes are well-designed with:
- Clear separation of concerns
- Proper type safety
- JSON format support
785-806
: LGTM! Efficient implementation of path queries.The implementation:
- Uses proper SQL joins
- Handles path resolution correctly
- Maintains type safety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
app/models/dataset/Dataset.scala (1)
712-717
: 🛠️ Refactor suggestionFollow schema.sql field naming convention.
The field names should match the capitalization from schema.sql:
dataLayerName
should bedataLayerName
realPath
should berealPath
hasLocalData
should behasLocalData
Apply this diff to fix the field names:
case class DatasetMagInfo(datasetId: ObjectId, - dataLayerName: String, + dataLayerName: String, mag: Vec3Int, path: Option[String], - realPath: Option[String], + realPath: Option[String], - hasLocalData: Boolean) + hasLocalData: Boolean)
🧹 Nitpick comments (2)
app/models/dataset/Dataset.scala (2)
767-769
: Remove unnecessary empty Fox.successful.The empty
Fox.successful(())
is not needed here as it doesn't contribute to the logic.Apply this diff to remove the unnecessary line:
def updateMagPathsForDataset(datasetId: ObjectId, magPathInfos: List[MagPathInfo]): Fox[Unit] = for { - _ <- Fox.successful(()) updateQueries = magPathInfos.map(magPathInfo => {
769-776
: Match variable names with their types.For better readability, variable names should match their types. In this case,
magPathInfo
should be used consistently.Apply this diff to improve variable naming:
- updateQueries = magPathInfos.map(magPathInfo => { + updateQueries = magPathInfos.map(m => { - val magLiteral = s"(${magPathInfo.mag.x}, ${magPathInfo.mag.y}, ${magPathInfo.mag.z})" + val magLiteral = s"(${m.mag.x}, ${m.mag.y}, ${m.mag.z})" q"""UPDATE webknossos.dataset_mags SET path = ${magPathInfo.path}, realPath = ${magPathInfo.realPath}, hasLocalData = ${magPathInfo.hasLocalData} WHERE _dataset = $datasetId - AND dataLayerName = ${magPathInfo.layerName} + AND dataLayerName = ${m.layerName} AND mag = CAST($magLiteral AS webknossos.vector3)""".asUpdate
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/models/dataset/Dataset.scala
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 😎 Backend LGTM.
Please remove the getLinkedMags route before merging to avoid leaking non-public information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (1)
40-50
: Consider adding unit tests for these newly introduced case classes.It would be beneficial to have tests covering JSON (de)serialization and the logical interactions of these data classes. That way, any regressions in their structure or usage can be caught early.
Would you like me to generate a basic test suite for these case classes?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
MIGRATIONS.unreleased.md
(1 hunks)app/controllers/DatasetController.scala
(1 hunks)app/controllers/WKRemoteDataStoreController.scala
(2 hunks)conf/webknossos.latest.routes
(1 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala
(3 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala
(1 hunks)webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- MIGRATIONS.unreleased.md
- conf/webknossos.latest.routes
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceRepository.scala
- app/controllers/DatasetController.scala
- app/controllers/WKRemoteDataStoreController.scala
- webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DataSourceService.scala
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: circleci_build
🔇 Additional comments (2)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteWebknossosClient.scala (2)
8-8
: Import looks good.This import from
com.scalableminds.util.geometry.Vec3Int
properly facilitates the newmag
field inMagPathInfo
.
117-122
: Revisit explicit result type and logging strategy.Currently,
reportRealPaths
returnsFox[_]
and uses.silent
, providing minimal visibility into errors or partial failures. Please consider returning a more explicit type (e.g.,Fox[ReportPathsResult]
) and/or adding logs on failures to facilitate troubleshooting.
URL of deployed dev instance (used for testing):
Steps to test:
Overview
In the datastore we now scan for mags and symlinks. Symlinks are resolved for layer directories and mag directories. The result is send to WK via a new route (subject to discussion).
In WK the database stores per mag:
So that
Using hasLocalData and realPath we will be able to handle deletions.
I added a route to get a list of mags that use a mags path, this is for testing purposes and can be deleted before merge.
TODOs:
Issues:
(Please delete unneeded items, merge only when none are left open)