Proposal: Remove support for TSDB blocks transfer between ingesters #2966

Closed
pstibrany opened this issue Jul 31, 2020 · 6 comments · Fixed by #2996
Labels
storage/blocks Blocks storage engine

Comments

pstibrany (Contributor) commented Jul 31, 2020

When ingesters start and stop, they support transferring data from the leaving ingester to the joining one. This mechanism allows for fast rollouts of new ingesters, because otherwise ingesters would need to flush data from memory to storage, which can take a long time (up to 1h). Transfers were the primary solution for dealing with slow rollouts until WAL support for chunks was introduced in ingesters, which made transfers obsolete. Ingesters with a WAL don't need to flush data to long-term storage, as they reload the same data into memory from the WAL after a restart.

When TSDB blocks support was introduced to ingesters, transfers were supported there as well.

In this issue, we propose removing support for transfers of TSDB blocks between ingesters. This doesn't affect the transfer of chunks.

Reasons for this proposal:

  • Transfers are rarely used with the blocks storage: blocks storage is meant to be used with persistent disks, and since each TSDB also has a WAL, an ingester restart doesn't require flushing blocks to storage. Restarts are already quite fast.
  • Transfers complicate the ingester logic when using blocks. Removing them removes extra complexity, both in the code and in operations.
  • Dropping transfer support for blocks would simplify the design of a solution for issue Blocks storage unable to ingest samples older than 1h after an outage #2366.
  • There is an integration test for blocks transfers, but apart from that, it's not clear whether anyone uses the feature.
  • Blocks storage is still an experimental feature, so deprecating blocks transfer now doesn't break any feature guarantees. Once blocks are marked production-ready, it will be more difficult to remove this support.
brancz (Contributor) commented Jul 31, 2020

Have you considered doing the same as thanos-receive and just getting rid of the state entirely? It flushes the head block to disk and uploads it. Yes, this can produce non-optimal blocks, but if we put some work into chunk merging at compaction time, this can be optimized away. Just a thought, but it could allow Thanos and Cortex to be even more similar. :)

pstibrany (Contributor, author) commented Jul 31, 2020

Cortex has an option for flushing all TSDBs on shutdown, but the problem is that an ingester may have many TSDBs, and flushing on shutdown may run out of time due to the termination grace period.

For scale-downs I prefer to use the /shutdown endpoint, which does the flush (without any termination timeout) and then stops the ingester component. This also doesn't rely on the flush-on-shutdown flag being set or not; /shutdown always flushes.
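As a concrete illustration, a scale-down step using the /shutdown endpoint could look like the sketch below. The address, port, and endpoint path are placeholder assumptions (not taken from this issue) and may differ between Cortex versions:

```shell
# Hedged sketch: flush-and-stop an ingester via its /shutdown endpoint
# before scaling down. INGESTER_ADDR is a hypothetical placeholder for
# the ingester's HTTP address.
INGESTER_ADDR="${INGESTER_ADDR:-127.0.0.1:8080}"

# As described in the comment above, POST /shutdown flushes all
# in-memory TSDB data to long-term storage and then stops the ingester,
# regardless of the flush-on-shutdown flag.
status=$(curl -s -o /dev/null -w '%{http_code}' -X POST "http://${INGESTER_ADDR}/shutdown" || true)
echo "ingester responded with HTTP ${status}"
```

Once the endpoint returns successfully, the pod can be deleted safely, since the flushed data is already in long-term storage.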

brancz (Contributor) commented Jul 31, 2020

We have a Kubernetes controller planned (the controller actually exists, but doesn't do this yet) that ensures eventual upload should it not succeed on a "normal" shutdown (essentially, it schedules jobs with the "previous" PVC until the upload succeeds, then cleans up the volume and recycles it). Practice will show whether this is "too eventually consistent", but if not, then I think more thought-through config rollout schemes can solve this as well.

pstibrany (Contributor, author) replied:

> We have a Kubernetes controller planned (the controller actually exists, but doesn't do this yet) that ensures eventual upload should it not succeed on a "normal" shutdown (essentially, it schedules jobs with the "previous" PVC until the upload succeeds, then cleans up the volume and recycles it). Practice will show whether this is "too eventually consistent", but if not, then I think more thought-through config rollout schemes can solve this as well.

Nice! Our flusher component can do the same (it's a job that flushes all TSDB data from a PVC), but we don't have a controller for it.

thorfour (Contributor) commented:
+1 for getting rid of transfers!

ranton256 (Contributor) commented:
Given that the WAL already implies the need for a StatefulSet with the blocks storage, removing transfer support between ingesters does not seem like it would cause any additional hurdles, and it would simplify things.
