-
Notifications
You must be signed in to change notification settings - Fork 818
Add flush job mode to ingester. #1885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts:
- I would argue that flush mode "job" should not even register in the ring (even though the tokens are set to 0). This might create it's own problem (I guess?).
- There is a long wait before we start flushing even when we know it's a flush job (right?)
The conclusion that I got from this is that the t.Stop()
has to be initiated inside the cortex
package, and it hard to do it from the ingester package itself. I will check how I can combine ideas from this PR and to include the above 2 points.
@@ -238,7 +241,20 @@ func (t *Cortex) initModule(cfg *Config, m moduleName) error { | |||
|
|||
// Run starts Cortex running, and blocks until a signal is received. | |||
func (t *Cortex) Run() error { | |||
return t.server.Run() | |||
var err error | |||
done := make(chan struct{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not waiting for this anywhere (I will create a follow-up PR to your branch, just a note to myself)
Can you think of cases? I see this flush-job as something you run after the cluster has stabilised and I don't see any issues with it. Even if the cluster is in flux, I don't see it causing an major issues.
How / Where? |
If we assume this then should be fine. But it is still an assumption. I am afraid it might interfere with transfer otherwise? I prefer running this job without worrying about the state of the cluster, as that just adds another dependency for the job.
Apparently it does not after the |
9903110
to
f7c407d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will test this out next week
|
||
// Initiate shutdown if its a job to flush data. | ||
if t.cfg.Ingester.IsFlushJob { | ||
_ = t.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure that calling t.Stop()
twice is not a problem. Currently, main.go
would also call this and try stopping the modules again.
I have rebased #1747 on top of this with a small fix. I think we can close this now. |
This is my attempt to simplify things compared to cortexproject#1747 /cc @jtlisi @codesome Signed-off-by: Goutham Veeramachaneni <[email protected]>
f7c407d
to
cf877d6
Compare
Closing this as most of the changes have been incorporated in #1747 |
This is my attempt to simplify things compared to #1747
/cc @jtlisi @codesome
Signed-off-by: Goutham Veeramachaneni [email protected]
What this PR does:
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]