-
Notifications
You must be signed in to change notification settings - Fork 832
metrics for monitoring delete requests #2445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics for monitoring delete requests #2445
Conversation
Signed-off-by: Sandeep Sukhani <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you mention the new metrics in the CHANGELOG
, please?
pkg/chunk/purger/purger.go
Outdated
}, []string{"user"}) | ||
m.deleteRequestsProcessingFailures = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
Namespace: "cortex", | ||
Name: "purger_delete_requests_processing_failure", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think to add an s
to failure
?
Name: "purger_delete_requests_processing_failure", | |
Name: "purger_delete_requests_processing_failures", |
pkg/chunk/purger/purger.go
Outdated
func newPurgerMetrics(r prometheus.Registerer) *purgerMetrics { | ||
m := purgerMetrics{} | ||
|
||
m.deleteRequestsProcessedTotal = prometheus.NewCounterVec(prometheus.CounterOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling r.MustRegister()
below, could you create the metrics with promauto.With(r).NewCounterVec()
(and same for NewGaugeVec()
)? It's the pattern we're heading to.
pkg/chunk/purger/request_handler.go
Outdated
func newDeleteRequestHandlerMetrics(r prometheus.Registerer) *deleteRequestHandlerMetrics { | ||
m := deleteRequestHandlerMetrics{} | ||
|
||
m.deleteRequestsReceivedTotal = prometheus.NewCounterVec(prometheus.CounterOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promauto.With(r)
here as well, please.
pkg/chunk/purger/purger.go
Outdated
Name: "purger_delete_requests_chunks_selected_total", | ||
Help: "Number of chunks selected while building delete plans per user", | ||
}, []string{"user"}) | ||
m.deleteRequestsProcessingFailures = prometheus.NewGaugeVec(prometheus.GaugeOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, this should be a counter.
Signed-off-by: Sandeep Sukhani <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with some nitpicking comments.
"strings" | ||
|
||
"github.com/prometheus/client_golang/prometheus" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove new line please, and merge these two import groups.
pkg/chunk/purger/purger.go
Outdated
}, []string{"user"}) | ||
m.deleteRequestsProcessingFailures = promauto.With(r).NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: "cortex", | ||
Name: "purger_delete_requests_processing_failures", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing _total
suffix for counter. I don't like it, but that's what the convention is.
// groups chunks together by unique label sets i.e all the chunks with same labels would be stored in a group | ||
// chunk details are stored in groups for each unique label set to avoid storing them repetitively for each chunk | ||
func groupChunks(chunks []chunk.Chunk, deleteFrom, deleteThrough model.Time) []ChunksGroup { | ||
func groupChunks(chunks []chunk.Chunk, deleteFrom, deleteThrough model.Time, includedChunkIDs map[string]struct{}) ([]ChunksGroup, map[string]struct{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it more like processedChunkIDs
?
Also there is no need to return it back from the function – modifications are already visible to the caller. Function comment should however mention this fact (that passed map is modified).
Signed-off-by: Sandeep Sukhani <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sandeepsukhani ! Let's fix the CHANGELOG and then we can merge it 🎉
CHANGELOG.md
Outdated
- `purger_delete_requests_received_total`: Number of delete requests received per user. | ||
- `purger_delete_requests_processed_total`: Number of delete requests processed per user. | ||
- `purger_delete_requests_chunks_selected_total`: Number of chunks selected while building delete plans per user. | ||
- `purger_delete_requests_processing_failures_total`: Number of delete requests processing failures per user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `purger_delete_requests_received_total`: Number of delete requests received per user. | |
- `purger_delete_requests_processed_total`: Number of delete requests processed per user. | |
- `purger_delete_requests_chunks_selected_total`: Number of chunks selected while building delete plans per user. | |
- `purger_delete_requests_processing_failures_total`: Number of delete requests processing failures per user. | |
- `cortex_purger_delete_requests_received_total`: Number of delete requests received per user. | |
- `cortex_purger_delete_requests_processed_total`: Number of delete requests processed per user. | |
- `cortex_purger_delete_requests_chunks_selected_total`: Number of chunks selected while building delete plans per user. | |
- `cortex_purger_delete_requests_processing_failures_total`: Number of delete requests processing failures per user. |
Signed-off-by: Sandeep Sukhani <[email protected]>
4603c91
to
f1a4dce
Compare
What this PR does:
Adds some metrics for monitoring delete requests. It also eliminates duplicate entries of chunks which overlap multiple plans/days.
purger_delete_requests_received_total
: Number of delete requests received per user.purger_delete_requests_processed_total
: Number of delete requests processed per user.purger_delete_requests_chunks_selected_total
: Number of chunks selected while building delete plans per user.purger_delete_requests_processing_failure
: Delete requests processing failure for each user.Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]