Proposal: Remove support for TSDB blocks transfer between ingesters #2966
Comments
Have you considered doing the same as thanos-receive and just getting rid of the state entirely? It flushes the head block and uploads it. Yes, this can produce non-optimal blocks; however, if we put some work into chunk merging at compaction time, this can be optimized away. Just a thought, and it could allow Thanos and Cortex to be even more similar. :)
Cortex has an option for flushing all TSDBs on shutdown, but the problem is that an ingester may have many TSDBs, and flushing on shutdown may run out of time due to the termination grace period. For scale-downs I prefer to use
We have a Kubernetes controller planned (actually the controller exists but doesn't do this yet) that ensures eventual upload should it not succeed on "normal" shutdown (essentially, it schedules jobs with the "previous" PVC until the upload succeeds, then cleans up the volume and recycles it). Practice will show if this is "too eventually consistent", but if so, I think more thought-through config rollout schemes can solve this as well.
Nice! Our flusher component can do the same (it's a job that flushes all TSDB data from a PVC), but we don't have a controller for it.
+1 for getting rid of transfers!
Given that the WAL already implies the need for a StatefulSet with block storage, removing transfer support between ingesters does not seem like it would cause any additional hurdles, and it would simplify things.
When ingesters start and stop, they support transferring data from the leaving ingester to the joining one. This mechanism allows for fast rollouts of new ingesters, because otherwise ingesters would need to flush data from memory to storage, which can take a long time (up to 1h). Transfers were the primary solution for slow rollouts until WAL support for chunks was introduced in ingesters, which made transfers obsolete. Ingesters with WAL don't need to flush data to long-term storage, as they reload the same data into memory from the WAL after a restart.
When TSDB blocks support was introduced to ingesters, they also supported transfers.
In this issue, we suggest removing support for transfers of TSDB blocks between ingesters. This doesn't affect the transfer of chunks.
Reasons for this proposal: