This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit ad5089d

Merge pull request #88 from grafana/improve-compactor-alerts-and-dashboard
Improved Cortex blocks compactor alerts and dashboard
2 parents 59dc335 + ac9cbdc commit ad5089d

2 files changed, 11 additions and 28 deletions

cortex-mixin/alerts/compactor.libsonnet

Lines changed: 9 additions & 9 deletions
@@ -4,33 +4,33 @@
         name: 'cortex_compactor_alerts',
         rules: [
           {
-            // Alert if the compactor has not successfully completed a run in the last 24h.
-            alert: 'CortexCompactorHasNotSuccessfullyRun',
+            // Alert if the compactor has not successfully cleaned up blocks in the last 24h.
+            alert: 'CortexCompactorHasNotSuccessfullyCleanedUpBlocks',
             'for': '15m',
             expr: |||
-              (time() - cortex_compactor_last_successful_run_timestamp_seconds{%s} > 60 * 60 * 24)
+              (time() - cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds{%s} > 60 * 60 * 24)
               and
-              (cortex_compactor_last_successful_run_timestamp_seconds{%s} > 0)
+              (cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds{%s} > 0)
             ||| % [$.namespace_matcher(''), $.namespace_matcher('')],
             labels: {
               severity: 'critical',
             },
             annotations: {
-              message: 'Cortex Compactor {{ $labels.namespace }}/{{ $labels.instance }} has not successfully completed a run in the last 24 hours.',
+              message: 'Cortex Compactor {{ $labels.namespace }}/{{ $labels.instance }} has not successfully cleaned up blocks in the last 24 hours.',
             },
           },
           {
-            // Alert if the compactor has not successfully completed a run since its start.
-            alert: 'CortexCompactorHasNotSuccessfullyRunSinceStart',
+            // Alert if the compactor has not successfully cleaned up blocks since its start.
+            alert: 'CortexCompactorHasNotSuccessfullyCleanedUpBlocksSinceStart',
             'for': '24h',
             expr: |||
-              cortex_compactor_last_successful_run_timestamp_seconds{%s} == 0
+              cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds{%s} == 0
             ||| % $.namespace_matcher(''),
             labels: {
               severity: 'critical',
             },
             annotations: {
-              message: 'Cortex Compactor {{ $labels.namespace }}/{{ $labels.instance }} has not successfully completed a run in the last 24 hours.',
+              message: 'Cortex Compactor {{ $labels.namespace }}/{{ $labels.instance }} has not successfully cleaned up blocks in the last 24 hours.',
             },
           },
           {
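
The first alert's expression in the hunk above joins two clauses with `and`. The first clause fires when the last successful block cleanup is more than 24 hours old. The second clause appears intended to exclude compactors whose timestamp is still 0, a case the `SinceStart` alert covers separately. Both `%s` placeholders are filled from a two-element array. The following is a minimal standalone sketch of that templating. `namespace_matcher` here is a hypothetical stand-in for the mixin's `$.namespace_matcher` helper, and the `cortex-.*` pattern is an assumption made only for illustration.

```jsonnet
// Hypothetical stand-in for $.namespace_matcher; the real helper is defined
// elsewhere in the mixin and may build a different label matcher.
local namespace_matcher(prefix='') = prefix + 'namespace=~"cortex-.*"';

{
  alert: 'CortexCompactorHasNotSuccessfullyCleanedUpBlocks',
  'for': '15m',
  // Each %s is replaced, in order, by one element of the array after the
  // % operator, so both selectors receive the same matcher.
  expr: |||
    (time() - cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds{%s} > 60 * 60 * 24)
    and
    (cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds{%s} > 0)
  ||| % [namespace_matcher(''), namespace_matcher('')],
}
```

Evaluating the sketch with the `jsonnet` command-line tool should print the rule as JSON with the matcher substituted into both selectors.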

cortex-mixin/dashboards/compactor.libsonnet

Lines changed: 2 additions & 19 deletions
@@ -9,7 +9,8 @@ local utils = import 'mixin-utils/utils.libsonnet';
       .addPanel(
         $.textPanel('', |||
           - **Per-instance runs**: number of times a compactor instance triggers a compaction across all tenants its shard manage.
-          - **Per-tenant runs**: number of times a compactor instance triggers the compaction for a single tenant's blocks.
+          - **Compacted blocks**: number of blocks generated as a result of a compaction operation.
+          - **Per-block compaction duration**: time taken to generate a single compacted block.
         |||),
       )
       .addPanel(
@@ -22,24 +23,6 @@ local utils = import 'mixin-utils/utils.libsonnet';
         $.bars +
         { yaxes: $.yaxes('ops') },
       )
-      .addPanel(
-        $.successFailurePanel(
-          'Per-tenant runs / sec',
-          'sum(rate(cortex_compactor_group_compactions_total{%s}[$__interval])) - sum(rate(cortex_compactor_group_compactions_failures_total{%s}[$__interval]))' % [$.jobMatcher('compactor'), $.jobMatcher('compactor')],
-          'sum(rate(cortex_compactor_group_compactions_failures_total{%s}[$__interval]))' % $.jobMatcher('compactor'),
-        ) +
-        $.bars +
-        { yaxes: $.yaxes('ops') },
-      )
-    )
-    .addRow(
-      $.row('')
-      .addPanel(
-        $.textPanel('', |||
-          - **Compacted blocks**: number of blocks generated as a result of a compaction operation.
-          - **Per-block compaction duration**: time taken to generate a single compacted block.
-        |||),
-      )
       .addPanel(
         $.panel('Compacted blocks / sec') +
         $.queryPanel('sum(rate(prometheus_tsdb_compactions_total{%s}[$__interval]))' % $.jobMatcher('compactor'), 'blocks') +
