Skip to content

Ruler: Prevent counting 2xx responses as failed writes #6785

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 3, 2025

Conversation

rajagopalanand
Copy link
Contributor

@rajagopalanand rajagopalanand commented Jun 3, 2025

What this PR does:

In some cases, the Distributor returns a 2xx with error. On the ruler side, this causes failedWrites to be incremented. This PR prevents failedWrites counter from being incremented in cases where the distributor returns a 2xx with an error

@rajagopalanand rajagopalanand marked this pull request as ready for review June 3, 2025 18:49
@dosubot dosubot bot added the component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. label Jun 3, 2025
@yeya24
Copy link
Contributor

yeya24 commented Jun 3, 2025

In some cases, the Distributor returns a 2xx with error

What would be the error in this case? If it returns an error then it sounds indeed an error. Maybe there is something wrong when returning the error and the error is wrapped with a 2XX not 5XX

@rajagopalanand
Copy link
Contributor Author

In some cases, the Distributor returns a 2xx with error

What would be the error in this case? If it returns an error then it sounds indeed an error. Maybe there is something wrong when returning the error and the error is wrapped with a 2XX not 5XX

Just for posterity, the error is rpc error: code = Code(202) desc = replicas did not mach, rejecting sample ...

@rajagopalanand
Copy link
Contributor Author

rajagopalanand commented Jun 3, 2025

Distributor returns a 202 + error like I mentioned. We still should not increment this as a write failure on the ruler because distributor returned a 202. Hence this change. If there are other/better ways to fix this, let's discuss

Copy link
Contributor

@danielblando danielblando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense as we already ignore 4xx as failure

@yeya24 yeya24 changed the title Prevent counting 2xx responses as failed writes Ruler: Prevent counting 2xx responses as failed writes Jun 3, 2025
rajagopalanand and others added 2 commits June 3, 2025 22:30
@yeya24 yeya24 enabled auto-merge (squash) June 3, 2025 23:01
@yeya24 yeya24 merged commit c9fa217 into cortexproject:master Jun 3, 2025
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants