Skip to content

Federated ruler proposal #4477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jan 13, 2022

Conversation

rdooley
Copy link

@rdooley rdooley commented Sep 14, 2021

What this PR does:
Proposes a solution to allow federated rules, rules which query data from multiple tenants

Which issue(s) this PR fixes:
Proposes a solution for #4403

Checklist

  • Tests updated NA
  • Documentation added NA
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX] NA for proposal

@rdooley rdooley force-pushed the proposal-federated-ruler branch from beef98d to 0b3557c Compare September 14, 2021 17:44
@rdooley
Copy link
Author

rdooley commented Sep 14, 2021

cc @bboreham and @edma2

@rdooley rdooley mentioned this pull request Sep 14, 2021
3 tasks
Copy link
Contributor

@edma2 edma2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this proposal would address my use-cases for federated rule queries, but let's hear from others too

@rdooley
Copy link
Author

rdooley commented Oct 5, 2021

@edma2 Can you check that this still meets your use case?


#### Proposal

Since the current default is that a tenant should only be able to write rules against itself, we suggest a config option `ruler.allowed-federated-tenants`, a string slice of OrgIDs like `infra` or `0|1|2|3|4` which are allowed to write rules against all tenants. If a tenant `bar` attempts to create a federated rule, an error should be returned by the ruler api. Similarly an option `ruler.disallowed-federated-tenants` explicitly states a list of tenants for which federated rules are not allowed. Combining these in a `util.AllowedTenants` should allow us to quickly determine if federation is enabled or disabled for a given tenant at rule creation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing both disallowed and allowed flags at once would be ambiguous, so Cortex should only accept one or the other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current proposal is similar to the enabled/disabled tenant logic for ruler seen here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, I guess the allowlist effectively takes priority over the disallow list.

rdooley pushed a commit to rdooley/cortex that referenced this pull request Oct 7, 2021
* implements the rest of the proposal cortexproject#4477
* bring in vendored prometheus rules and rulefmt code
* add optional src and dest tenant fields
* Block the creation of these federated rules behind a feature flag in
  ruler api
* Original issue cortexproject#4403
* a large amount of documentation surrounding this expanded feature set
  is needed

Signed-off-by: Rees Dooley <[email protected]>
@rdooley rdooley mentioned this pull request Oct 7, 2021
3 tasks
@rdooley
Copy link
Author

rdooley commented Oct 7, 2021

I have an initial draft of the implementation here and will be updating the proposal to links with this PR instead of the old one

@rdooley rdooley requested a review from pracucci October 19, 2021 22:46
@rdooley
Copy link
Author

rdooley commented Nov 11, 2021

Feedback from cortex community call, cc @pracucci
An explicit use case for destination tenant is required. I will work on an example use case for destination tenant. If there isn't one, I will remove the explicit dest_tenant field from the proposal and corresponding draft PR, reverting to the current behavior of dest_tenant inheriting from rule ownership.

@rdooley rdooley force-pushed the proposal-federated-ruler branch from 063540f to a7c771d Compare November 15, 2021 18:13

#### Proposal

For composite tenants a random but consistent subtenant of the multiple tenants owning the rule is chosen using a hashmod of the series label to determine the subtenant and for single tenants owning a federated rule the resulting series is saved in the tenant which owns the rule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you're trying to actually do with this whole work (scale out a single big tenant logically splitting it into many smaller ones) but I think what you describe in this line may not fit for the general use case of federated rules.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make a separate proposal for composite tenant support for ruler or is expanding this proposal to support federated rules and composite tenants in ruler ok? Composite tenant support shares a common element with federated rules (rules querying from multiple tenants), but is distinct.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going with dropping composite tenant support here

@rdooley rdooley force-pushed the proposal-federated-ruler branch from a7c771d to 39248ef Compare November 17, 2021 16:28

## Reasoning

There are two primary use cases for allowing federated rules which query data from multiple tenants; administration of cortex and composite tenants.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdooley have you considered putting the src_tenants on a rule group level instead of rule level?

On one hand,

  • having it on group level reduces repetition between rules. I'd imagine some of the rules will likely read from or write to the same tenants, so it will make them less prone to bugs (and bugs like that can cause gaps in the rules data)
  • currently in prometheus rules in a group are evaluated sequentially. So rules listed earlier in a group can be used by the rules later in the group. If one of the earlier rules has a dest_tenant, then this invariant will be violated. This will slightly change behaviour.

On the other hand,

  • having it on rule level allows for more flexibility. Otherwise, federated and non-federated rules cannot be mixed in the same group.
  • it requires less changes to existing rule groups. Existing rule groups that users want to turn into federated rules can be amended simply by adding a line to a rule. In the case of group-level tenants, rules will likely need to be split up into multiple federated & non-federated groups.

I'm curious as to why you opted for having them per rule.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted for making src_tenants a rule instead of group based field just for the most flexibility.
Since dest_tenant has been removed from this proposal the invariant of earlier rules' data being available for later rules still holds.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the case for having src_tenants at the rule level and the group level, and I had a weakly held opinion prioritizing flexibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about allowing src_tenants on either rule or rule group? src_tenants on a rule would by default inherit the value from the rule group, but can be overwritten per rule.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since my opinion is weakly held, what if we just go with on the rule group instead of per rule as that seems the preference from Dimitar and Andrea?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine with me. This was can always be expanded later to per-rule as well if/when needed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll get that updated today and also remove the composite tenant bits to scope this down further. Composite tenant support can be worked on separately later if we need it.

@agardiman
Copy link

Hi @rdooley, thanks for this proposal! Federated rules is something we would really like to see in Cortex.
We are particularly interested in defining rules that pertain to the tenant where they are defined but that query from multiple tenants of choice, so your src_tenant option (we haven't encountered a need for the composite tenants yet) with a slightly preference on having that defined per rule group instead of per rule, to not repeat it on every single rule we have (hundreds), but not a super big deal.
Do you know what's the state of the proposal and if there is any help needed to push it forward?

@rdooley
Copy link
Author

rdooley commented Dec 16, 2021

Hi @rdooley, thanks for this proposal! Federated rules is something we would really like to see in Cortex. We are particularly interested in defining rules that pertain to the tenant where they are defined but that query from multiple tenants of choice, so your src_tenant option (we haven't encountered a need for the composite tenants yet) with a slightly preference on having that defined per rule group instead of per rule, to not repeat it on every single rule we have (hundreds), but not a super big deal. Do you know what's the state of the proposal and if there is any help needed to push it forward?

I think the only thing I'm waiting for is feedback from @pracucci around composite tenant support, and of course associated reviews

@rdooley rdooley force-pushed the proposal-federated-ruler branch from 9b84f05 to 8950adf Compare January 4, 2022 18:05
@rdooley
Copy link
Author

rdooley commented Jan 4, 2022

I've removed composite tenant support from this proposal in the interests of expediency and simplicity.

@bboreham
Copy link
Contributor

bboreham commented Jan 6, 2022

Linter is objecting to your spelling:

docs/proposals/federated-ruler.md:14:139: "refered" is a misspelling of "referred"

@rdooley rdooley force-pushed the proposal-federated-ruler branch from ded9da2 to f10830f Compare January 6, 2022 16:15
@rdooley
Copy link
Author

rdooley commented Jan 6, 2022

Looks like its failing build as well, fixing
EDIT
Bad slug my bad

@rdooley rdooley force-pushed the proposal-federated-ruler branch from f10830f to f788b7a Compare January 6, 2022 16:41
Rees Dooley and others added 9 commits January 6, 2022 10:53
* unsure on the status here
* cortexproject#4403

Signed-off-by: Rees Dooley <[email protected]>
* switch from `_` to `-` for feature flag consistency
* Consitency: use only `src_tenants` to match draft implementation

Signed-off-by: Rees Dooley <[email protected]>
Signed-off-by: Rees Dooley <[email protected]>
@rdooley rdooley force-pushed the proposal-federated-ruler branch from f788b7a to 63bdfd5 Compare January 6, 2022 19:43
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!


As some use cases will demand that a specific federated rule, querying tenant B and C, is stored in the owning teams tenant A, an option to allow explicit assignment of source tenants for a federated rule is needed.

To support this we suggest an additional field `src_tenants` on the rule group containing an OrgID string e.g. `t0|t1|...|ti` which when present determines which tenants to query for the given rule. Rule group is chosen as it reduces repetition between rules.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a pipe separated string, you may consider having src_tenants to be an array. Looks a more robust solution to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

@bboreham bboreham enabled auto-merge (squash) January 12, 2022 18:12
@rdooley
Copy link
Author

rdooley commented Jan 12, 2022

The failing integration test seems unrelated to me

FAIL: TestAlertmanagerClustering/bucket_alertstore (1.71s)

@bboreham bboreham merged commit 08279f0 into cortexproject:master Jan 13, 2022
@rdooley rdooley deleted the proposal-federated-ruler branch January 13, 2022 17:24
alvinlin123 pushed a commit to ac1214/cortex that referenced this pull request Jan 14, 2022
* Federated ruler proposal

Signed-off-by: Rees Dooley <[email protected]>

Co-authored-by: Rees Dooley <[email protected]>
Signed-off-by: Alvin Lin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants