Skip to content

Support Ruler to query Query Frontend #6151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

SungJin1212
Copy link
Member

@SungJin1212 SungJin1212 commented Aug 12, 2024

What this PR does:
Add ruler.frontend-address to allow query to query frontends instead of ingesters. If -ruler.frontend-address is set, rulers query to query frontends for evaluating rules. It can support query frontend features like vertical query sharding.

Which issue(s) this PR fixes:
Fixes #5105

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@SungJin1212 SungJin1212 marked this pull request as draft August 12, 2024 11:51
@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 7 times, most recently from cec96af to 4d42fcf Compare August 12, 2024 12:42
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of making Query Frontend a queryable, I think I prefer changing the engine query function https://github.com/cortexproject/cortex/blob/master/pkg/ruler/compat.go#L206.

Instead of using a local engine, we use a remote query frontend to evaluate the query results.

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 3 times, most recently from 7ff9818 to 84d16b4 Compare August 13, 2024 12:42
@SungJin1212
Copy link
Member Author

@yeya24
I remake the pr. Could you please take a look?

@@ -226,6 +226,33 @@ func EngineQueryFunc(engine promql.QueryEngine, q storage.Queryable, overrides R
}
}

func QueryFrontendQueryFunc(promClient *promClient, overrides RulesLimits, userID string, lookbackDelta time.Duration) rules.QueryFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse the existing EngineQueryFunc function?

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 2 times, most recently from d579b4b to 062885e Compare August 14, 2024 02:41
@SungJin1212
Copy link
Member Author

@yeya24
I have reflected the review you left. Thank you for the review!

@SungJin1212 SungJin1212 changed the title Add query frontend queryable Support Ruler to query Query Frontend Aug 14, 2024
@SungJin1212 SungJin1212 marked this pull request as ready for review August 14, 2024 04:13
@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch from 062885e to aba8591 Compare August 16, 2024 08:13
@SungJin1212
Copy link
Member Author

@yeya24
I have fixed the pr. Could you please take a look?

@danielblando
Copy link
Contributor

We are in the process of releasing 1.18
Please rebase PR and change the changelog to master

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 2 times, most recently from c78a066 to 35f8879 Compare August 17, 2024 05:31
@SungJin1212
Copy link
Member Author

SungJin1212 commented Aug 20, 2024

@yeya24
I think we need to quey to QFE via gRPC.
I'm thinking the step as follows:

  1. Define query request, response (including floating histogram) proto in cortex.pb
  2. Ruler has grpc client and query to QFE via the client.

@yeya24
Copy link
Contributor

yeya24 commented Aug 20, 2024

Make sense. I am thinking about the same thing. Query frontend already supports grpc as long as you wrap your request with httpgrpc. We don't need to define a new rpc in cortex.pb

For the protobuf change, we don't need to do it in this PR.

@SungJin1212
Copy link
Member Author

@yeya24
Thanks, I try implementing it.

@yeya24
Copy link
Contributor

yeya24 commented Aug 20, 2024

@SungJin1212 This is how query frontend converts a http request to httpgrpc request https://github.com/cortexproject/cortex/blob/master/pkg/frontend/transport/roundtripper.go#L39.

We can do something similar. But from Ruler to Query Frontend.

@SungJin1212 SungJin1212 requested a review from yeya24 August 28, 2024 10:45
@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 4 times, most recently from 248c350 to f1e2a3d Compare August 31, 2024 06:05
@SungJin1212
Copy link
Member Author

@yeya24
Could you please take a look?

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch 2 times, most recently from 006b710 to 0324695 Compare September 2, 2024 06:54
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SungJin1212 Thanks for the great work! I think this feature looks really solid now.
We can give it a try.
I think we also lack of some documentation but it is fine to do it in another PR.

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch from 0324695 to bb300a5 Compare September 3, 2024 06:33
Copy link
Member

@friedrichg friedrichg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no concerns. Only small nits:

  1. Mark it as experimental on:
    https://cortexmetrics.io/docs/configuration/v1guarantees/#experimental-features

  2. Architecture diagram needs update
    https://cortexmetrics.io/docs/architecture/

  3. Why is it better? I can see frontend caching is one is reason.
    You could add some a guide page similar to
    https://cortexmetrics.io/docs/guides/ruler-sharding/

@yeya24
Copy link
Contributor

yeya24 commented Sep 4, 2024

@SungJin1212 As @friedrichg said, let's mark this feature as experimental in cortexmetrics.io/docs/configuration/v1guarantees#experimental-features.

Architecture diagram needs update
cortexmetrics.io/docs/architecture

I think we can skip this in this pr. We need to update it anyway.

Why is it better? I can see frontend caching is one is reason.
You could add some a guide page similar to
cortexmetrics.io/docs/guides/ruler-sharding

We will add a doc but will create another issue to track that. This item doesn't need to be owned by @SungJin1212

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch from bb300a5 to 2d88c97 Compare September 5, 2024 02:36
@SungJin1212
Copy link
Member Author

SungJin1212 commented Sep 5, 2024

@yeya24 @friedrichg
I have added experimental flags and got the rebase. I think caching and using vertical sharding for evaluating rules is a main contribution of this pr.


# Timeout for downstream rulers.
# CLI flag: -ruler.frontendClient.remote-timeout
[remote_timeout: <duration> | default = 2m]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this flag for? This seems not used. I wonder if we should consolidate this flag with frontend timeout as they seem to do the same thing.
However, the flag description seems hardcoded to Timeout for downstream rulers..

type frontendPool struct {
timeout time.Duration
prometheusHTTPPrefix string
grpcConfig ClientConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should use grpcclient.Config instead?
ClientConfig seems used for Ruler to talk to each other only

@SungJin1212
Copy link
Member Author

@yeya24
There were some mistakes when rebasing. I have fixed it!

@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch from 2d88c97 to 3a74a24 Compare September 5, 2024 05:45
@SungJin1212 SungJin1212 force-pushed the add-query-frontend-queryable branch from 3a74a24 to 83db99d Compare September 5, 2024 05:52
@SungJin1212 SungJin1212 requested a review from yeya24 September 6, 2024 01:10
@SungJin1212
Copy link
Member Author

@friedrichg
Could you please take a look?

Copy link
Member

@friedrichg friedrichg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@yeya24
Copy link
Contributor

yeya24 commented Sep 7, 2024

Thanks for the contribution!

@yeya24 yeya24 merged commit 1523080 into cortexproject:master Sep 7, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switch Cortex Ruler to query Query Frontend
4 participants