Skip to content

Flush mode for ingester #1747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

codesome
Copy link
Contributor

This is as per this design. Ingester would be run in flush mode in case someone forgets to hit the shutdown endpoint as described in #1746 before scaling down the ingesters with WAL.

Still in draft mode, as I have to check about the ingester quitting gracefully and not just stopping the operations.

@codesome codesome force-pushed the flush-only-ingester branch from 1fc5f35 to a5af8ea Compare October 22, 2019 12:57
@codesome codesome marked this pull request as ready for review October 22, 2019 13:14
@codesome codesome force-pushed the flush-only-ingester branch from a5af8ea to d1c3880 Compare October 22, 2019 17:08
Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's early to review but I took a look and made a couple of comments.

@codesome codesome force-pushed the flush-only-ingester branch 3 times, most recently from cb5956b to fc24cce Compare November 18, 2019 18:31
@jtlisi jtlisi self-assigned this Nov 18, 2019
Copy link
Contributor

@jtlisi jtlisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to refactor this code so the shutdown is initiated by the ingester module?

That way the main.go should not need any changes, and the IsIngesterJob function could be removed?

if cfg.FlushMode {
// Flush and don't update the ring.
i.Flush()
return i, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we initiate the shutdown here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been staring at this for a while and I might need some pointers on how cortex can be shut down within ingester. I can only see signals as an option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, let me see. Can we pass t.Stop to the ingester when initiating and then just call that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try that! Should work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Causes an initialization loop :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think it's not safe to stop all modules while we are in the process of initing all the modules (additionally, the server is started after all modules are inited).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried another modification of my earlier attempt - the modules propagate the info of shutting down via a bool instead of cortex explicitly knowing about the ingester job.


runtime.KeepAlive(ballast)
err = t.Stop()

if t.IsIngesterJob() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to refactor this logic into the ingester module?

@codesome
Copy link
Contributor Author

Would it be possible to refactor this code so the shutdown is initiated by the ingester module?

One blocker that I faced was that the server was started by the cortex package, and the Run() function is blocked on that. I could not find a way to stop the server from the ingester, would it be sane to pass the server object into the ingester which can stop it there?

@codesome
Copy link
Contributor Author

I have a clarification to make here: The flushing happens even before the server starts. But would the metrics be served via the default registry even before the server starts? If not, then the code needs to be refactored so that the server starts before flushing.

@jtlisi
Copy link
Contributor

jtlisi commented Nov 22, 2019

I have a clarification to make here: The flushing happens even before the server starts. But would the metrics be served via the default registry even before the server starts? If not, then the code needs to be refactored so that the server starts before flushing.

Yea we are going to need the server to be running if we want metrics from this job.

if cfg.FlushMode {
// Flush and don't update the ring.
i.Flush()
return i, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, let me see. Can we pass t.Stop to the ingester when initiating and then just call that?

@codesome codesome force-pushed the flush-only-ingester branch from feac537 to 7c362db Compare December 4, 2019 12:06
@codesome
Copy link
Contributor Author

codesome commented Dec 5, 2019

I am curious how can we scrape the metrics here even if we have the server running. Reason: The flushing happens in cortex.New when the ingester is being created. The server is stopped right after it was started, leaving little chance for a scrape to gather any metrics.

@codesome
Copy link
Contributor Author

codesome commented Dec 5, 2019

I tested this with #1103 and seems to be working fine.

@gouthamve gouthamve mentioned this pull request Dec 5, 2019
3 tasks
This is my attempt to simplify things compared to cortexproject#1747

/cc @jtlisi @codesome

Signed-off-by: Goutham Veeramachaneni <[email protected]>
@jtlisi
Copy link
Contributor

jtlisi commented Dec 9, 2019

@codesome Ideally we use something like a push gateway for something like this. However, that may take a significant bit of refactoring. Otherwise we could add a wait period before shutting down.

@codesome codesome force-pushed the flush-only-ingester branch from d5efd1d to 212ebd3 Compare December 10, 2019 09:48
@codesome
Copy link
Contributor Author

I have now rebased this PR on top of #1885 with a small fix

@codesome
Copy link
Contributor Author

Tested it. Worked fine.

@bboreham
Copy link
Contributor

Sorry to come up with another suggestion after so many other discussions, but it seemed cleaner to me: how about doing it from a different -target ? That way Cortex.Run() doesn't need to change.

gouthamve added a commit to gouthamve/cortex that referenced this pull request Jan 3, 2020
This is my attempt to simplify things compared to cortexproject#1747

/cc @jtlisi @codesome

Signed-off-by: Goutham Veeramachaneni <[email protected]>
gouthamve
gouthamve previously approved these changes Jan 3, 2020
Copy link
Contributor

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the most ideal but should do the job, @codesome could you test it again?

@gouthamve gouthamve dismissed their stale review January 3, 2020 13:52

Didn't see the conflicts

@codesome codesome mentioned this pull request Feb 4, 2020
3 tasks
@codesome
Copy link
Contributor Author

codesome commented Feb 4, 2020

Superseded by #2075

@codesome codesome closed this Feb 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants