Added proposal for ingester+querier migration from chunks to blocks and back. #2717

pstibrany · 2020-06-11T14:22:45Z

What this PR does: This is a proposal for adding options that make it possible to switch ingesters and queriers between chunks and blocks.

Signed-off-by: Peter Štibraný <[email protected]>

pracucci

Thanks! LGTM. I left a few nits.

docs/proposals/ingesters-migration.md


gouthamve

Sounds super risky tbh, but I can't come up with better ideas.

gouthamve · 2020-06-17T09:01:44Z

docs/proposals/ingesters-migration.md

+- Ingesters using WAL don’t flush in-memory chunks to storage on shutdown.
+- Rollout should be as automated as possible.
+
+How do we handle ingesters with WAL? There are several possibilities, but the simplest option seems to be adding a new flag to ingesters to flush chunks on shutdown. This is trivial change to ingester, and allows us to do automated migration by:


Or something simpler, just don't send any data for 30mins (idle-timeout) before shutting down :)

Oh wait, both the ingester deployments will share the same ring? That makes sense. Can we call out that we will be using the same statefulset and not going to create a new one for blocks?

We also have a shutdown endpoint on the chunks ingester which just flushes and shuts down.

We also have a shutdown endpoint on the chunks ingester which just flushes and shuts down.

Yes we do, but I don't quite see how to use it for automation. On calling /shutdown, the ingester flushes everything and stops. Kubernetes will just restart it. But we need it to restart with a new configuration.

I think it idles out, and doesn't stop iirc. @codesome ? But yes, I agree.

@gouthamve you are right. Kubernetes won't restart it. The /shutdown endpoint will flush all the chunks, remove the ingester from the ring, and then stay idle.
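The behaviour described in this thread (flush, leave the ring, keep the process alive) can be illustrated with a small sketch. This is not the Cortex code: the `ingester` type, its fields, `flushAndLeave`, the `POST` method, and the `204` status are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ingester is a hypothetical stand-in for a chunks ingester.
// It is not safe for concurrent use; a real ingester would need locking.
type ingester struct {
	pendingChunks int  // chunks still held in memory
	flushedChunks int  // chunks written to long-term storage
	inRing        bool // whether the ingester is registered in the ring
}

// flushAndLeave flushes everything and leaves the ring.
// It deliberately does not exit the process, so the pod stays up and idle
// and Kubernetes has no reason to restart it.
func (i *ingester) flushAndLeave() {
	i.flushedChunks += i.pendingChunks
	i.pendingChunks = 0
	i.inRing = false
}

// shutdownHandler exposes flushAndLeave over HTTP (assumed path: /shutdown).
func (i *ingester) shutdownHandler(w http.ResponseWriter, _ *http.Request) {
	i.flushAndLeave()
	w.WriteHeader(http.StatusNoContent)
}

func main() {
	ing := &ingester{pendingChunks: 3, inRing: true}
	rec := httptest.NewRecorder()
	ing.shutdownHandler(rec, httptest.NewRequest(http.MethodPost, "/shutdown", nil))
	fmt.Println(rec.Code, ing.pendingChunks, ing.flushedChunks, ing.inRing)
}
```

Because the process keeps running after the handler returns, an automated rollout would still need a separate step to restart the pod with the new configuration, which is the gap raised above.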

gouthamve · 2020-06-17T10:19:46Z

So some things that are missing here, but would provide more context, are:

Mention that we are only tackling WAL (or statefulsets). Maybe a section on how it would work with deployments?
Mention that we share the same ring and queriers will only see a single set of ingesters (chunks + blocks) at times.
Mention that the queriers will be configured to read from blocks on object store + chunks on NoSQL store at the same time.
Mention what the potential schema config will look like, especially in the case of a rollback. Currently we support splitting it by the UTC day boundary, but how will that work, given that the rollout won't align to that? I see that for a particular time range we need to query both.

These are some concerns I see and we can iterate on the proposal again once these are addressed?

codesome · 2020-06-17T10:28:26Z

Maybe a section on how it would work with deployments?

I think it will look similar to migration from deployments to WAL.


pstibrany · 2020-06-18T13:46:55Z

I think it will look similar to migration from deployments to WAL.

I think it's actually a bit easier, since we cannot transfer data from chunks ingesters to blocks ingesters, so they don't need to be scaled up and down in lock-step. But chunks ingesters must be reconfigured to avoid transfers.


pstibrany · 2020-06-18T14:13:21Z

Mention what the potential schema config will look like, especially in the case of a rollback. Currently we support splitting it by the UTC day boundary, but how will that work, given that the rollout won't align to that? I see that for a particular time range we need to query both.

Schemas are a low-level mechanism for configuring the chunks store. We will not use schemas for anything in this proposal.

The querier needs a single timestamp to decide whether to query the chunks store or not. The blocks store can always be queried, because the querier knows from block metadata whether it needs to hit any block.

See my PR #2747 which implements this already.

I've updated "Querying" section of the document to better explain this.
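As a rough illustration of the single-timestamp decision described in this comment, here is a minimal sketch. It is not the code from #2747: `selectStores`, `storeSelection`, and the cutover value are hypothetical, and the sketch assumes the chunks-to-blocks direction, where the chunks store only holds data written before the cutover.

```go
package main

import "fmt"

// storeSelection records which long-term stores a query should read from.
type storeSelection struct {
	Chunks bool
	Blocks bool
}

// selectStores is a hypothetical helper, not the Cortex implementation.
// queryStartMs is the start of the query range; cutoverMs is the assumed
// single timestamp after which nothing was written to the chunks store.
// Both are Unix timestamps in milliseconds.
func selectStores(queryStartMs, cutoverMs int64) storeSelection {
	return storeSelection{
		// The chunks store matters only if the query reaches back before the cutover.
		Chunks: queryStartMs < cutoverMs,
		// The blocks store is always consulted; block metadata decides
		// whether any block actually overlaps the query range.
		Blocks: true,
	}
}

func main() {
	const cutoverMs = int64(1592438400000) // assumed cutover: 2020-06-18T00:00:00Z
	const hourMs = int64(3600000)

	fmt.Printf("%+v\n", selectStores(cutoverMs-hourMs, cutoverMs)) // query starts before cutover
	fmt.Printf("%+v\n", selectStores(cutoverMs+hourMs, cutoverMs)) // query starts after cutover
}
```

The point of the sketch is that no schema period is involved: one comparison against one timestamp is enough, so the cutover does not have to align with a UTC day boundary.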


pstibrany · 2020-06-18T14:14:49Z

@gouthamve I've updated the document with your feedback. Please take a look again when you have time.

docs/proposals/ingesters-migration.md

pracucci · 2020-06-23T14:30:36Z

Let's merge!


Added ingesters-migration.md document.

f7c684a

Signed-off-by: Peter Štibraný <[email protected]>

pull-request-size bot added the size/M label Jun 11, 2020

clean white noise

96bf175

Signed-off-by: Peter Štibraný <[email protected]>

pracucci approved these changes Jun 12, 2020


pstibrany and others added 2 commits June 16, 2020 10:58

Apply suggestions from code review

72bb6af

Co-authored-by: Marco Pracucci <[email protected]> Signed-off-by: Peter Štibraný <[email protected]>

Updated docs based on feedback.

f9a4a23

Signed-off-by: Peter Štibraný <[email protected]>

gouthamve approved these changes Jun 17, 2020


pstibrany added 2 commits June 18, 2020 15:16

Mention that we reuse existing statefulset of ingesters.

66a977a

Signed-off-by: Peter Štibraný <[email protected]>

Added section about ingesters without WAL.

e1b6fdc

Signed-off-by: Peter Štibraný <[email protected]>

Update querying part.

b513965

Signed-off-by: Peter Štibraný <[email protected]>

whitenoise

92b7f79

Signed-off-by: Peter Štibraný <[email protected]>

pracucci approved these changes Jun 19, 2020

docs/proposals/ingesters-migration.md (outdated)

pstibrany mentioned this pull request Jun 22, 2020

Support querying multiple stores by Querier #2747

Merged

3 tasks

gouthamve approved these changes Jun 23, 2020


Update docs/proposals/ingesters-migration.md

5e47c42

Signed-off-by: Marco Pracucci <[email protected]>

pracucci merged commit e0cfad7 into cortexproject:master Jun 23, 2020

This was referenced Jun 24, 2020

Added flag to flush chunks to long-term storage even when using WAL. #2780

Merged

Added support for flushing blocks #2794

Merged
