Federated ruler draft #4520
…crease multitenant query max concurrency Signed-off-by: Rees Dooley <[email protected]>
* Push ruler metrics to a random subtenant if rule is multitenant
* random seed is set based off labelset hash to ensure series always pushed to same subtenant
* add changelog message about multitenant ruler support
Signed-off-by: Rees Dooley <[email protected]>

* expand federated tenant alertmanager test case
* Add tenant-federation.max-concurrency config opt (docs need regenerating later)
* switch to hash mod instead of random for federated ruler subtenant selection
Signed-off-by: Rees Dooley <[email protected]>

* implements the rest of the proposal cortexproject#4477
* bring in vendored prometheus rules and rulefmt code
* add optional src and dest tenant fields
* Block the creation of these federated rules behind a feature flag in ruler api
* Original issue cortexproject#4403
* a large amount of documentation surrounding this expanded feature set is needed
Signed-off-by: Rees Dooley <[email protected]>
@@ -72,7 +72,9 @@ func (a *PusherAppender) Commit() error {

// Since a.pusher is distributor, client.ReuseSlice will be called in a.pusher.Push.
// We shouldn't call client.ReuseSlice here.
_, err := a.pusher.Push(user.InjectOrgID(a.ctx, a.userID), cortexpb.ToWriteRequest(a.labels, a.samples, nil, cortexpb.RULE))
Re-injecting the PusherAppender's own tenant ID into the context here prevented the destination tenant context from being set correctly during rule group evaluation.
srcTenants := rule.GetSrcTenants()
if srcTenants != "" {
	queryCtx = user.InjectOrgID(ctx, srcTenants)
}
This is one of the key additions here: if srcTenants is set, the query context is given those tenants as its org ID.
	i := int(rule.Labels().Copy().Hash()) % len(tenantIDs)
	tenantID := tenantIDs[i]
	appCtx = user.InjectOrgID(ctx, tenantID)
}
This is the other key addition: setting a destination context based on the dest tenant if one is set and, if the dest tenant (or owning tenant) is composite, consistently choosing a subtenant.
pkg/ruler/rules/manager.go
Outdated
tenantIDs, err := tenant.TenantIDs(appCtx)
if err == nil && len(tenantIDs) > 1 {
	// Mod the hash of the series so same series always goes to same subtenant
	i := int(rule.Labels().Copy().Hash()) % len(tenantIDs)
Need to handle the case where the hash, once converted to int, is negative, which produces a negative i.
This probably also needs a test to verify the expected behavior
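One possible way to address the negative-index concern is to take the modulus while the hash is still unsigned and convert to `int` only afterwards. The sketch below is an illustration under stated assumptions, not the fix adopted in the PR: `pickSubtenant` and `seriesHash` are hypothetical helpers, and FNV-1a is used only as a stand-in for the `uint64` returned by the Prometheus `Labels.Hash()` method.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// seriesHash is a hypothetical stand-in for the uint64 label-set hash.
func seriesHash(labels string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(labels)) // Write on an fnv hash never returns an error
	return h.Sum64()
}

// pickSubtenant maps a hash onto one of the tenant IDs. The modulus is taken
// on the unsigned value, so the resulting index is always in
// [0, len(tenantIDs)). It returns false when there are no tenants.
func pickSubtenant(hash uint64, tenantIDs []string) (string, bool) {
	if len(tenantIDs) == 0 {
		return "", false
	}
	i := int(hash % uint64(len(tenantIDs)))
	return tenantIDs[i], true
}

func main() {
	tenants := []string{"0", "1", "2"}

	// A hash with the top bit set: int(hash) is negative on 64-bit
	// platforms, so int(hash) % len(tenants) could not be used as an index.
	high := uint64(1) << 63
	id, _ := pickSubtenant(high, tenants)
	fmt.Println("high-bit hash goes to tenant", id)

	// The same series always maps to the same subtenant.
	a, _ := pickSubtenant(seriesHash(`{__name__="record_rule_eg"}`), tenants)
	b, _ := pickSubtenant(seriesHash(`{__name__="record_rule_eg"}`), tenants)
	fmt.Println(a == b)
}
```

The assertions a unit test for this behavior would need are roughly: the index stays in range for hashes with the high bit set, the mapping is deterministic, and an empty tenant list is handled without panicking.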
Signed-off-by: Rees Dooley <[email protected]>
This is going to have to change as the proposal was updated closing
What this PR does:
This PR introduces the features discussed in #4477, allowing the creation of federated rules with explicitly assigned source and destination tenants, rather than inferring both from rule ownership.
A quick example of this being used in a sandbox environment
In a multitenant sandbox where tenants `0`, `1`, through `19` were fed data via avalanche, I created two recording rules owned by tenant `15`.
Then, viewing this in a Grafana instance configured to query the composite tenant `0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19`, we can see that the `record_rule_eg` and `no_dest` metrics match the expected value of `sum(avalanche_metric_mmmmm_1_0)` while being stored in tenants `17` and `15` respectively.
As this is a new feature, there is extensive documentation needed to go alongside this code change. Because of that, I'm leaving this PR in draft for now, to be viewed alongside the proposal.
Which issue(s) this PR fixes:
Fixes #4403
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`