1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

## master
* [ENHANCEMENT] Add bigger tenants and configure default compactor tenant shards
* [ENHANCEMENT] Add alert `CortexCompactorWriteVisitMarkerIsFailing` to monitor compactors

## 1.17.1 / 2024-10-23
* [CHANGE] Use cortex v1.17.1
16 changes: 16 additions & 0 deletions cortex-mixin/alerts/compactor.libsonnet
@@ -102,6 +102,22 @@
||| % $._config,
},
},
{
// Alert if compactors are not able to update the visit marker.
alert: 'CortexCompactorWriteVisitMarkerIsFailing',
'for': '2h',
expr: |||
sum(increase(cortex_compactor_block_visit_marker_write_failed{job=~".+/%(compactor)s"}[2h]))>0
||| % $._config.job_names,
labels: {
severity: 'critical',
},
annotations: {
message: |||
Cortex compactors are unable to update the visit marker. Check the compactor logs for the cause.
|||,
},
},
],
},
],
11 changes: 11 additions & 0 deletions cortex-mixin/docs/playbooks.md
@@ -379,6 +379,17 @@ How to **investigate**:
- Ensure ingesters are successfully shipping blocks to the storage
- Look for any error in the compactor logs

### CortexCompactorWriteVisitMarkerIsFailing

This alert only applies to compactors running with shuffle sharding enabled.
It fires when compactors keep failing to update the visit marker; failures are counted across all compactors and tenants.
The visit marker is a very small JSON file, so updates to it are not expected to fail.

How to **investigate**:
- Check the compactor logs; they should show the exact reason for the failure
- If you see `context canceled` or other timeout errors in the logs,
consider increasing `-compactor.compaction-visit-marker-timeout` and `-compactor.compaction-visit-marker-file-update-interval`.
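
As a rough illustration of raising the two flags above, the following Jsonnet sketch overrides them in a deployment. It assumes the compactor flags are set through a `compactor_args` field, as in the cortex-jsonnet library; the values are placeholders, not recommendations.

```jsonnet
// Hypothetical override; assumes the compactor flags are exposed through
// a `compactor_args` field in your Jsonnet environment.
{
  compactor_args+:: {
    // Placeholder values: tune them to the latency of your object storage.
    'compactor.compaction-visit-marker-timeout': '10m',
    'compactor.compaction-visit-marker-file-update-interval': '2m',
  },
}
```

Presumably the update interval should stay well below the timeout, so that a marker is refreshed before it is considered expired.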

### CortexCompactorHasNotSuccessfullyRunCompaction

This alert fires if the compactor is not able to successfully compact all discovered compactable blocks (across all tenants).