Faster rolling upgrades #321

Closed
@tomwilkie

Currently, rolling upgrades involve flushing all in-memory chunks, which takes ages and will get worse as we get more users.

The idea would be to have new ingesters come up in a "joining" state. A joining ingester is then "claimed" by one of the shutting-down ingesters, which streams all its in-memory content to it; the joining ingester then claims the leaving ingester's tokens.

Sequence would look like this:

Running/exiting ingester:

  1. Is signalled to exit.
  2. Enters "Leaving" state.
  3. Waits a fixed amount of time for an ingester to appear in "Joining" state.
    • if no ingester is found, falls back to flushing all its state to the chunk store
  4. Makes an RPC to the Joining ingester, streaming all its chunks.
  5. When the streaming RPC is done, waits for the other ingester to claim its tokens, then removes itself from the ring and exits.

New ingester:

  1. Joins ring on startup.
  2. Enters "Joining" state - but does not add any tokens to the ring.
  3. Waits a fixed amount of time to be contacted by a "Leaving" ingester.
    • after the timeout, adds tokens to the ring and enters normal state.
  4. When it receives a streaming RPC from an ingester, "locks" itself to that ingester.
    • puts all chunks it receives in memory
  5. When all chunks are received, claims the leaving ingester's tokens.
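
To make the token handover concrete, here is a toy in-memory model of the ring. `Ring`, `Join`, and `ClaimTokens` are hypothetical names for this sketch; the real ring lives in a shared store and would need an atomic update.

```go
package main

import (
	"fmt"
	"sort"
)

type State string

const (
	Joining State = "JOINING"
	Active  State = "ACTIVE"
	Leaving State = "LEAVING"
)

// Ring is a toy stand-in for the shared ring (hypothetical).
type Ring struct {
	States map[string]State  // ingester ID -> state
	Owners map[uint32]string // token -> owning ingester ID
}

// Join registers an ingester in Joining state without adding tokens (steps 1-2).
func (r *Ring) Join(id string) { r.States[id] = Joining }

// ClaimTokens hands every token owned by `from` to `to` and marks `to`
// Active (step 5). The leaving ingester removes itself separately.
func (r *Ring) ClaimTokens(from, to string) []uint32 {
	var moved []uint32
	for tok, owner := range r.Owners {
		if owner == from {
			r.Owners[tok] = to
			moved = append(moved, tok)
		}
	}
	sort.Slice(moved, func(i, j int) bool { return moved[i] < moved[j] })
	r.States[to] = Active
	return moved
}

func main() {
	r := &Ring{
		States: map[string]State{"old": Leaving},
		Owners: map[uint32]string{100: "old", 200: "other", 300: "old"},
	}
	r.Join("new")
	moved := r.ClaimTokens("old", "new")
	fmt.Println(moved, r.States["new"])
}
```

Because the joining ingester owns no tokens until the claim, reads and writes keep going to the leaving ingester for the whole transfer.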

The timeouts exist so that new ingesters added for the sake of scaling don't wait forever. I envisage they would be quite small.

If the RPC at step 4 fails, both ingesters go back to their respective step 3.

WDYT?
