From b9fa9c229b61e5eefb8b00817fa065cd6e38beb3 Mon Sep 17 00:00:00 2001
From: Anand Rajagopal
Date: Tue, 16 Apr 2024 16:26:56 +0000
Subject: [PATCH 1/3] Ruler HA - Proposal

Signed-off-by: Anand Rajagopal
---
 docs/proposals/ruler-ha-new.md | 52 ++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 docs/proposals/ruler-ha-new.md

diff --git a/docs/proposals/ruler-ha-new.md b/docs/proposals/ruler-ha-new.md
new file mode 100644
index 00000000000..fe0c1c86358
--- /dev/null
+++ b/docs/proposals/ruler-ha-new.md
@@ -0,0 +1,52 @@
+---
+title: "Ruler HA"
+linkTitle: "Ruler HA"
+weight: 1
+slug: ruler-ha
+---
+
+- Author: [Anand Rajagopal](https://github.com/rajagopalanand)
+- Date: April 2024
+- Status: Proposed
+---
+
+## Problem
+
+Rulers in Cortex currently run with a replication factor of 1, wherein each RuleGroup is assigned to exactly 1 ruler. This lack of redundancy creates the following risks:
+
+- Rule group evaluation
+  - Missed evaluations due to a ruler outage, possibly caused by a deployment, noisy neighbour, hardware failure, etc.
+  - Missed evaluations due to a ruler brownout due to other tenant rule groups sharing the same ruler (noisy neighbour)
+- API
+  - inconsistent API results during resharding (e.g. due to a deployment) when rulers are in a transition state loading rule groups
+
+This proposal attempts to mitigate the above risks by enabling a ruler replication factor of greater than 1, allowing multiple rulers to evaluate the same rule group.
+
+## Proposal
+
+### Make ReplicationFactor configurable
+
+ReplicationFactor in Ruler is currently hardcoded to 1. Making this a configurable parameter is the first step to enabling HA in ruler, and would also be the mechanism for the user to turn the feature on. The parameter value will be 1 by default, equating to the feature being turned off by default.
+
+A replication factor greater than 1 will result in multiple rulers loading the same rule groups but only one ruler evaluating the rule group. The replicas are in "passive" state until it is necessary for them to become active
+
+This redundancy will allow for missed rule evaluations from single ruler outages to be covered by other instances evaluating the same rule groups.
+
+To avoid inconsistent rule group state, which is maintained by Prometheus, the author proposes making a change in Prometheus rule group evaluation logic as described below
+
+### Prometheus change
+
+The author proposes making a change to Prometheus to allow for pausing and resuming (or activating and deactivating) a rule group as described [here](https://github.com/prometheus/prometheus/issues/13630)
+
+If the proposal is not accepted by Prometheus community, the proposal is to maintain a fork of Prometheus for Cortex with modified rule group evaluation behavior. This [draft PR](https://github.com/prometheus/prometheus/pull/13858)
+shows the changes required in Prometheus to support pausing and resuming a rule group evaluation
+
+### API HA
+
+An interim solution is addressed in this [#5773](https://github.com/cortexproject/cortex/issues/5773) PR. This will be modified such that the replicas will return both active and passive rule groups and the API handler will continue to de-duplicate the results.
+The difference is that after Ruler HA, the replicas could potentially return proper rule group state if those replicas evaluated the rule group
+
+PRs:
+
+* Prometheus PR [#13858](https://github.com/prometheus/prometheus/pull/13858) [draft]
+* For API HA [#5773](https://github.com/cortexproject/cortex/issues/5773)

From 6508c4e4ca5292d10892aa0a9b5d9b9b6fd20cb0 Mon Sep 17 00:00:00 2001
From: Anand Rajagopal
Date: Tue, 20 Aug 2024 21:19:15 +0000
Subject: [PATCH 2/3] Updated Ruler HA proposal

Signed-off-by: Anand Rajagopal
---
 docs/proposals/ruler-ha-new.md | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/docs/proposals/ruler-ha-new.md b/docs/proposals/ruler-ha-new.md
index fe0c1c86358..a08ee45c02c 100644
--- a/docs/proposals/ruler-ha-new.md
+++ b/docs/proposals/ruler-ha-new.md
@@ -6,7 +6,7 @@ slug: ruler-ha
 ---
 
 - Author: [Anand Rajagopal](https://github.com/rajagopalanand)
-- Date: April 2024
+- Date: Aug 2024
 - Status: Proposed
 ---
 
@@ -26,20 +26,21 @@ This proposal attempts to mitigate the above risks by enabling a ruler replicati
 
 ### Make ReplicationFactor configurable
 
-ReplicationFactor in Ruler is currently hardcoded to 1. Making this a configurable parameter is the first step to enabling HA in ruler, and would also be the mechanism for the user to turn the feature on. The parameter value will be 1 by default, equating to the feature being turned off by default.
+ReplicationFactor in Ruler is currently hardcoded to 1. Making this a configurable parameter is the first step to enabling HA in ruler. The parameter value will be 1 by default. To enable Ruler HA for rule group evaluation, a new flag will be created
 
-A replication factor greater than 1 will result in multiple rulers loading the same rule groups but only one ruler evaluating the rule group. The replicas are in "passive" state until it is necessary for them to become active
+A replication factor greater than 1 will result in the following
 
-This redundancy will allow for missed rule evaluations from single ruler outages to be covered by other instances evaluating the same rule groups.
+ - Ring will pick R rulers for a rule group where R=RF
+ - The primary ruler (R1), when active, will take ownership of the rule group
+ - Non-primary ruler R2 will check if R1 is active. If R1 is not active, R2 will take ownership of the rule group
+ - Non-primary ruler R3 (if RF=3) will check if R1 and R2 are active. If they are both inactive/unhealthy, then R3 will take ownership of the rule group
+ - Non-primary rulers will drop their ownership when R1 becomes active after an outage
 
-To avoid inconsistent rule group state, which is maintained by Prometheus, the author proposes making a change in Prometheus rule group evaluation logic as described below
+With this redundancy, the maximum duration of missed evaluations will be limited to the sync interval of the rule groups, reducing the impact of primary Ruler unavailability.
 
 ### Prometheus change
 
-The author proposes making a change to Prometheus to allow for pausing and resuming (or activating and deactivating) a rule group as described [here](https://github.com/prometheus/prometheus/issues/13630)
-
-If the proposal is not accepted by Prometheus community, the proposal is to maintain a fork of Prometheus for Cortex with modified rule group evaluation behavior. This [draft PR](https://github.com/prometheus/prometheus/pull/13858)
-shows the changes required in Prometheus to support pausing and resuming a rule group evaluation
+No Prometheus change is required for this proposal
 
 ### API HA
 
@@ -48,5 +49,5 @@ The difference is that after Ruler HA, the replicas could potentially return pro
 
 PRs:
 
-* Prometheus PR [#13858](https://github.com/prometheus/prometheus/pull/13858) [draft]
+* For Rule evaluation [#6129](https://github.com/cortexproject/cortex/pull/6129)
 * For API HA [#5773](https://github.com/cortexproject/cortex/issues/5773)

From 14e6ddf10c45d27c16963312306cb5425ad8a528 Mon Sep 17 00:00:00 2001
From: Anand Rajagopal
Date: Tue, 3 Sep 2024 20:53:48 +0000
Subject: [PATCH 3/3] Marked old ruler HA proposal as deprecated and changed the title for the new proposal

Signed-off-by: Anand Rajagopal
---
 docs/proposals/ruler-ha-new.md | 8 ++++----
 docs/proposals/ruler-ha.md | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/proposals/ruler-ha-new.md b/docs/proposals/ruler-ha-new.md
index a08ee45c02c..bb8d0a4b874 100644
--- a/docs/proposals/ruler-ha-new.md
+++ b/docs/proposals/ruler-ha-new.md
@@ -1,8 +1,8 @@
 ---
-title: "Ruler HA"
-linkTitle: "Ruler HA"
+title: "Ruler High Availability"
+linkTitle: "Ruler High Availability"
 weight: 1
-slug: ruler-ha
+slug: ruler-high-availability
 ---
 
 - Author: [Anand Rajagopal](https://github.com/rajagopalanand)
@@ -18,7 +18,7 @@ Rulers in Cortex currently run with a replication factor of 1, wherein each Rule
   - Missed evaluations due to a ruler outage, possibly caused by a deployment, noisy neighbour, hardware failure, etc.
   - Missed evaluations due to a ruler brownout due to other tenant rule groups sharing the same ruler (noisy neighbour)
 - API
-  - inconsistent API results during resharding (e.g. due to a deployment) when rulers are in a transition state loading rule groups
+  - Inconsistent API results during resharding (e.g. due to a deployment) when rulers are in a transition state loading rule groups
 
 This proposal attempts to mitigate the above risks by enabling a ruler replication factor of greater than 1, allowing multiple rulers to evaluate the same rule group.
 
diff --git a/docs/proposals/ruler-ha.md b/docs/proposals/ruler-ha.md
index 5e874e1ce0f..87d9d1b5828 100644
--- a/docs/proposals/ruler-ha.md
+++ b/docs/proposals/ruler-ha.md
@@ -7,11 +7,13 @@ slug: ruler-ha
 
 - Author: [Soon-Ping Phang](https://github.com/soonping-amzn)
 - Date: June 2022
-- Status: Proposed
+- Status: Deprecated
 ---
 
 ## Introduction
 
+_This proposal is deprecated in favor of the new [proposal](./ruler-ha-new.md)_
+
 This proposal consolidates multiple existing PRs from the AWS team working on this feature, as well as future work needed to complete support. The hope is that a more holistic view will make for more productive discussion and review of the individual changes, as well as provide better tracking of overall progress.
 
 The original issue is [#4435](https://github.com/cortexproject/cortex/issues/4435).
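
Note on the ownership rules that patch 2/3 adds under "Make ReplicationFactor configurable": they amount to "a ruler evaluates a rule group only when every replica ahead of it in ring order is inactive". The sketch below is a minimal illustration of that rule, not the Cortex implementation. The function name `should_evaluate`, the `healthy` map, and the choice to treat a ruler with unknown health as inactive are all assumptions made for the example.

```python
from typing import Dict, List


def should_evaluate(me: str, replicas: List[str], healthy: Dict[str, bool]) -> bool:
    """Decide whether ruler `me` should own (evaluate) a rule group.

    `replicas` is the ordered replica set the ring picked for the group,
    primary (R1) first; its length equals the replication factor.
    `healthy` maps ruler name -> True if that ruler is currently active.
    Assumption: a ruler missing from `healthy` is treated as inactive.
    """
    if me not in replicas:
        # This ruler was not selected for the rule group at all.
        return False
    position = replicas.index(me)
    # Own the group only if no replica ahead of us in ring order is active.
    return not any(healthy.get(r, False) for r in replicas[:position])


if __name__ == "__main__":
    ring_order = ["r1", "r2", "r3"]  # RF=3, hypothetical ruler names
    # R1 is down, so R2 takes over while R3 stays passive.
    status = {"r1": False, "r2": True}
    for ruler in ring_order:
        print(ruler, should_evaluate(ruler, ring_order, status))
```

If a check like this is re-run on each rule sync, a non-primary ruler picks up a group within one sync interval of the primary failing and drops it again once the primary is back, which is consistent with the bound on missed evaluations stated in the proposal.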