Use FossilDBPutBuffer for more efficient saving of multiple fossil keys #8482
Conversation
… with outdated mapping?)
…ngstore/tracings/volume/VolumeSegmentIndexBuffer.scala Co-authored-by: MichaelBuessemeyer <[email protected]>
…ngstore/tracings/volume/VolumeSegmentIndexBuffer.scala Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Looks good to me as well. Testing also went well 👍
Just dumping thoughts here to comment on:
Currently, the buffer blocks while flushing. Keeping a list of buffers inside the buffer would let us fill a second buffer while the first one is being flushed to FossilDB. This might not be necessary, but it could save some waiting time during a flush. However, it would also increase the complexity of this feature.
Thanks for your review! You’re right, this could be a way to speed things up further, but as you said, I’d rather avoid the complexity of ensuring that everything works correctly in parallel and that the final flush correctly waits for everything. Also, since we serve many users in parallel, maximum parallelization could actually be a downside: while it would answer individual requests more quickly, it could also tie up many threads, CPU time, and RPC requests, so that some (possibly smaller) requests by other users might be answered considerably later. This is also why I mostly still use serialCombined, to limit parallelization inside individual user requests. An alternative would be custom thread pools for fine-grained control, but that would add another layer of complexity. I hope this also answers #8469 (comment)
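To make the design being discussed concrete, here is a minimal Scala sketch of a put buffer with a blocking flush, the simpler of the two options above. All names here (KeyValueStore, putMultiple, PutBuffer, maxEntries) are illustrative assumptions, not the actual project API:

```scala
import scala.collection.mutable
import scala.concurrent.Future

// Assumed store interface with a batched multi-put; hypothetical, for illustration only.
trait KeyValueStore {
  def putMultiple(entries: Seq[(String, Array[Byte])]): Future[Unit]
}

// Buffers key/value pairs and writes them in one multi-put once the buffer is full.
// Per the discussion above, put() simply waits for the flush to complete; a
// double-buffering variant could fill a second buffer during the flush, at the
// cost of coordinating concurrent flushes and the final flush correctly.
class PutBuffer(store: KeyValueStore, maxEntries: Int = 100) {
  private val buffer = mutable.LinkedHashMap[String, Array[Byte]]()

  def put(key: String, value: Array[Byte]): Future[Unit] = {
    buffer(key) = value // overwriting dedupes repeated writes to the same key
    if (buffer.size >= maxEntries) flush() else Future.successful(())
  }

  // Must also be called once at the end to write any remaining entries.
  def flush(): Future[Unit] = {
    val entries = buffer.toSeq
    buffer.clear() // note: a sketch only; entries are lost if the put fails
    if (entries.isEmpty) Future.successful(()) else store.putMultiple(entries)
  }
}
```

The speedup measured below comes from replacing many single-key RPCs with a few batched ones; the buffer size trades memory and write latency against the number of round trips.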
Measured duplicating a volume annotation.
Duplicating 923 volume buckets took 827 ms on master and 360 ms here (a ~2.3× speedup).
Similar speedups are expected for the other usages.
URL of deployed dev instance (used for testing):
Steps to test:
Issues: