@fm3 fm3 commented Mar 31, 2025

Measured duplication of a volume annotation: duplicating 923 volume buckets took 827 ms on master and 360 ms here (factor 2.2).
Similar speedups are expected for the other usages.

URL of deployed dev instance (used for testing):

Steps to test:

  • Edit a skeleton annotation with multiple trees, reload, should work
  • Duplicate annotation, should work (skeleton, volume, editable mapping)
  • Merge annotations, should work (skeleton, volume, editable mapping)

Issues:


  • Removed dev-only changes like prints and application.conf edits
  • Considered common edge cases
  • Needs datastore update after deployment

Base automatically changed from segment-index-perf to master April 3, 2025 06:26
@fm3 fm3 marked this pull request as ready for review April 3, 2025 09:11
@fm3 fm3 requested a review from MichaelBuessemeyer April 3, 2025 09:17

@MichaelBuessemeyer MichaelBuessemeyer left a comment

Looks good to me as well. Testing also went well 👍

Just dumping some thoughts here to comment on:
Currently, the buffer blocks while flushing. Keeping a second buffer would let us continue filling it while the first one is being flushed to FossilDB, and then swap the two. This might not be necessary, but it could save some waiting time while a buffer is being flushed. It would also increase the complexity of this feature, though.
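
For illustration, a minimal sketch of that double-buffering idea could look roughly like this. The names (`DoubleBufferedPutBuffer`, `flushFn`, `capacity`) are made up for this sketch, and it is not the buffer implemented in this PR; flushes are simply chained so that at most one runs at a time while new puts go into a fresh buffer:

```scala
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical sketch of double buffering: new puts always go into the active
// buffer; when it is full, it is handed to the flush chain and replaced by an
// empty one, so callers do not wait for the ongoing flush.
class DoubleBufferedPutBuffer[K, V](flushFn: Seq[(K, V)] => Future[Unit], capacity: Int)(
    implicit ec: ExecutionContext) {

  private val lock = new Object
  private var active = mutable.ArrayBuffer.empty[(K, V)]
  private var pendingFlush: Future[Unit] = Future.unit

  def put(key: K, value: V): Future[Unit] = lock.synchronized {
    active += ((key, value))
    if (active.length >= capacity) swapAndFlushLocked() else Future.unit
  }

  // Called while holding the lock: hand the full buffer to the flush chain and
  // start a fresh, empty buffer. Chaining on pendingFlush ensures at most one
  // flush runs at a time, while new puts already fill the new buffer.
  private def swapAndFlushLocked(): Future[Unit] = {
    val toFlush = active.toSeq
    active = mutable.ArrayBuffer.empty[(K, V)]
    pendingFlush = pendingFlush.flatMap(_ => flushFn(toFlush))
    pendingFlush
  }

  // Final flush: drain the remaining entries and wait for all chained flushes.
  def flushAll(): Future[Unit] = lock.synchronized(swapAndFlushLocked())
}
```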

@fm3 fm3 commented Apr 7, 2025

Thanks for your review! You’re right, this might be a way to speed this up further, but as you said, I’d rather avoid the complexity of ensuring that everything works correctly in parallel and that the final flush correctly waits for everything.

Also, since we are serving many users in parallel, maximum parallelization could actually be a downside. While it would answer individual requests more quickly, it would potentially also use up many threads, CPU time, and RPC requests, so that some (possibly smaller) requests by other users might be answered considerably later. This is also why I mostly still use serialCombined, to limit parallelization inside individual user requests. An alternative would be custom thread pools for fine-grained control, but that would add another layer of complexity. I hope this also answers #8469 (comment)
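
To illustrate the trade-off, here is a rough sketch of what serial composition buys compared to eager fan-out. This is illustrative only (the name `serialCombinedSketch` is made up) and not the actual serialCombined implementation:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Illustrative sketch: items are processed strictly one after another, so a
// single large request occupies at most one worker at a time instead of
// fanning out in parallel.
def serialCombinedSketch[A, B](items: List[A])(f: A => Future[B])(
    implicit ec: ExecutionContext): Future[List[B]] =
  items
    .foldLeft(Future.successful(List.empty[B])) { (accFuture, item) =>
      for {
        acc    <- accFuture
        result <- f(item) // starts only after all previous items have finished
      } yield result :: acc
    }
    .map(_.reverse)

// By contrast, Future.traverse(items)(f) starts work on all items eagerly,
// which can consume many threads and downstream RPCs for one user request.
```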

@fm3 fm3 enabled auto-merge (squash) April 7, 2025 08:01
@fm3 fm3 merged commit 5ef0222 into master Apr 7, 2025
3 checks passed
@fm3 fm3 deleted the put-buffer branch April 7, 2025 08:14