Skip to content

[WIP] Reset invocation #3186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

slinkydeveloper
Copy link
Contributor

@slinkydeveloper slinkydeveloper commented Apr 22, 2025

Fix #2890, depends on #2904

@slinkydeveloper slinkydeveloper changed the title [WIP] Trim and restart Time travel Apr 23, 2025
@slinkydeveloper
Copy link
Contributor Author

E2E test restatedev/e2e#377

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this PR @slinkydeveloper. I think this is a very powerful feature that our users will like a lot :-) Before we can merge it we need to make sure that there isn't a problem with modifying the JournalMetadata while a concurrent retry is happening in the invoker. I suspect that there is a possibility that the invoker might read the JournalMetadata from before a trim operation and the journal from after a trim operation because it is not using a snapshot of RocksDB. Before this was never a problem because modifying the JournalMetadata stopped the retry mechanism (via the JournalTracker) but with the time travel command this might have changed.

@slinkydeveloper
Copy link
Contributor Author

Followup PRs will be:

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for updating this PR @slinkydeveloper. The changes look good to me. I mainly had minor comments. The ones that need some clarification are the following:

  1. What's the backwards compatibility story if this feature gets activated in the next released version? Alternatively, we could make the feature opt-in by waiving the right to go back. For the latter, how would this be enforced?
  2. We seem to always remove signals that are beyond the trim point. Aren't signals used by awakeables to complete them? If yes, can't it happen that we truncate the completion for an awakeable that won't get truncated? If yes, then we might end up being stuck because we might not re-run the logic to tell some other service to complete the awakeable again.
  3. Maybe it's worth unifying a little bit the terminology we are using. The feature to reset the journal is called "reset". Reset has a "truncate_from" index which gets translated into a TrimInvocation command which causes the journal to be rewritten. Maybe we can unify reset/truncate/trim or truncate and trim?

@slinkydeveloper
Copy link
Contributor Author

Will rebase this PR on #3227 when ready.

@slinkydeveloper slinkydeveloper changed the title Reset invocation [WIP] Reset invocation May 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trimming journal
3 participants