Add alert for target down #11745

Merged
merged 1 commit into from
Jul 29, 2022
ArthurSens
Contributor

Signed-off-by: ArthurSens [email protected]

Description

Add an alert that fires when Prometheus fails to scrape metrics from its targets. Workspace containers are excluded (the expression filters on container!="workspace").

Related Issue(s)

Fixes #

How to test

Release Notes

NONE

Documentation

Werft options:

  • /werft with-preview

summary: 'Prometheus failed to scrape {{ $labels.job }}',
description: 'Prometheus couldn\'t scrape {{ printf "%.4g" $value }}% of the {{ $labels.job }} targets. Components could be unavailable or we have some scraping misconfiguration.',
},
expr: '100 * (count(up{container!="workspace"} == 0) BY (job) / count(up{container!="workspace"}) BY (job)) > 10',
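To make the expression above easier to follow, here is a minimal Python sketch of what it computes: the percentage of targets per job whose `up` sample is 0, ignoring series labelled container="workspace". The function name `percent_down_by_job`, the job names, and the sample data are hypothetical, chosen only for illustration; this is not how Prometheus evaluates the query internally.

```python
from collections import defaultdict


def percent_down_by_job(samples, threshold=10.0):
    """Mimic: 100 * (count(up{container!="workspace"} == 0) BY (job)
                     / count(up{container!="workspace"}) BY (job)) > threshold

    `samples` is an iterable of (labels, value) pairs for the `up` metric.
    """
    total = defaultdict(int)
    down = defaultdict(int)
    for labels, value in samples:
        # container!="workspace" also matches series without a container label.
        if labels.get("container") == "workspace":
            continue
        job = labels.get("job", "")
        total[job] += 1
        if value == 0:
            down[job] += 1

    # Jobs with no failing target never appear in `down`, just as the
    # left-hand count() returns no series for them in PromQL.
    result = {}
    for job, failed in down.items():
        percent = 100 * failed / total[job]
        if percent > threshold:
            result[job] = percent
    return result


# Hypothetical scrape results: "server" has 1 of 4 targets down,
# "ws-manager" has 1 of 20 down, and one workspace container is ignored.
samples = (
    [({"job": "server"}, 1)] * 3
    + [({"job": "server"}, 0)]
    + [({"job": "server", "container": "workspace"}, 0)]
    + [({"job": "ws-manager"}, 1)] * 19
    + [({"job": "ws-manager"}, 0)]
)

print(percent_down_by_job(samples, threshold=10))
print(percent_down_by_job(samples, threshold=0))
```

With a threshold of 10, only the job with a quarter of its targets down is reported; with a threshold of 0, any job with at least one failing target is reported.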
Contributor

Suggested change
- expr: '100 * (count(up{container!="workspace"} == 0) BY (job) / count(up{container!="workspace"}) BY (job)) > 10',
+ expr: '(count(up{container!="workspace"} == 0) BY (job) / count(up{container!="workspace"}) BY (job)) > 0',

Contributor Author

@ArthurSens ArthurSens Jul 29, 2022

Ah, the 0 threshold makes sense, but the 100 * is there so that the message in Slack reads as a percentage 😛
Prometheus couldn't scrape {{ printf "%.4g" $value }}%....
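As a rough illustration of the point about the message, the Python sketch below renders the description with and without the 100 * factor. It assumes Python's C-style "%.4g" formatting is close enough to the Go template printf used in the annotation for these values; `render_description` and the job name are hypothetical.

```python
def render_description(value, job):
    """Approximate the annotation: '... scrape {{ printf "%.4g" $value }}% of the {{ $labels.job }} targets.'"""
    return "Prometheus couldn't scrape %s%% of the %s targets." % ("%.4g" % value, job)


ratio = 1 / 4  # one of four targets is down (hypothetical)

# With the 100 * factor, $value is already a percentage.
print(render_description(100 * ratio, "server"))

# Without it, the same template would show the raw ratio followed by '%'.
print(render_description(ratio, "server"))
```

The first call produces "25%", while the second produces "0.25%", which would understate the failure rate by a factor of 100.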

Signed-off-by: ArthurSens <[email protected]>
@ArthurSens ArthurSens force-pushed the as/target-down-alert branch from 5ff81f4 to 5042cab Compare July 29, 2022 15:20
@roboquat roboquat merged commit 1041c76 into main Jul 29, 2022
@roboquat roboquat deleted the as/target-down-alert branch July 29, 2022 20:28
3 participants