Skip to content

Do not close and re-open TSDBs during ingester transfer #1854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

pracucci
Copy link
Contributor

What this PR does:
The Cortex TSDB experimental integration currently close all TSDBs during the ingester shutdown, before blocks and WALs are transferred to a joining ingester. This has two main issues:

  1. Query requests received while closing TSDBs fail with error open index reader: block is closing
  2. Query requests received after the TSDBs have been closed issue a re-opening of TSDBs, which means re-playing the WAL (expensive operation)

Me and @codesome checked out the TSDB implementation and came to the conclusion which is safe to transfer blocks as WAL without closing the TSDB as far as:

  1. No more writes occur
  2. No compaction is on-going or will occur

In this PR I've introduced some changes to avoid closing and re-opening TSDBs on queries received during ingester blocks/WAL transfer, as well as some logic to guarantee no writes occur to the TSDB once the transfer starts.

I've tested the changes in our TSDB test cluster and looks working as expected (at least, experimentally).

Suggestion to the reviewer: enable "Hide whitespace changes" because there are a couple of large code blocks for which it has changed only the indentation.

Which issue(s) this PR fixes:
Fixes #1829

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I don't quite understand the decision to keep on transferring even if there are still writers though.

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. There is one small change I think should be made (see comment), but still looks good ;-)

@pracucci
Copy link
Contributor Author

@thorfour May you check it out again and - if you don't have any further comment and you're good with it - approve it, please? Otherwise feel free to take more time to think about it or comment.

@thorfour
Copy link
Contributor

@pracucci will take a look first thing Monday

@pracucci pracucci force-pushed the fix-queries-while-transferring-tsdb-on-ingester-leaving branch from 0d5dd4c to 5e84cc4 Compare December 4, 2019 10:50
@pracucci
Copy link
Contributor Author

pracucci commented Dec 4, 2019

@gouthamve and @thorfour I've fixed the conflicts after rebasing. May you take a look and help this PR moving forward, please?

Copy link
Contributor

@thorfour thorfour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pracucci pracucci force-pushed the fix-queries-while-transferring-tsdb-on-ingester-leaving branch from 36819ba to 5d9e956 Compare December 11, 2019 14:18
Copy link
Contributor

@jtlisi jtlisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pracucci pracucci force-pushed the fix-queries-while-transferring-tsdb-on-ingester-leaving branch from 5d9e956 to f449a43 Compare December 12, 2019 08:47
@gouthamve gouthamve merged commit a32c3e1 into cortexproject:master Dec 12, 2019
@pracucci pracucci deleted the fix-queries-while-transferring-tsdb-on-ingester-leaving branch December 12, 2019 10:08
@pracucci pracucci mentioned this pull request Jan 22, 2020
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Do not close and re-open TSDBs during ingester transfer
5 participants