-
Notifications
You must be signed in to change notification settings - Fork 169
wip: support native histograms along classic histograms in panels #1121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
91c5fc3
wip: support native histograms along classic histograms in panels
krajorama da3e596
Merge branch 'master' into krajo/native-histogram-mixin
krajorama ebefc98
Use named parameters in query template for clarity
krajorama dc6e242
Merge branch 'master' into krajo/native-histogram-mixin
krajorama 0394f96
add latencyRecordingRulePanelNativeHistogram to utils
krajorama 6f2903c
Merge branch 'master' into krajo/native-histogram-mixin
krajorama 1bc39b5
make fmt
krajorama 9042b90
Merge branch 'master' into krajo/native-histogram-mixin
krajorama 99a6d5a
Add low level definitions for getting count, quantile, sum
krajorama e2abbf4
Fix wording in function comment
krajorama c5e2af8
Remove undefined vars
krajorama 190a5ae
Remove stray curly braces
krajorama e5a8012
Fix groupping and average calc
krajorama 134bffd
Remove tray closing brace and make percentile lowercase
krajorama 256e498
Fix parenthesis in qps label
krajorama 197e35b
Don't require curly braces around selectors
krajorama File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,38 @@ | ||
local g = import 'grafana-builder/grafana.libsonnet'; | ||
|
||
{ | ||
// The classicNativeHistogramQuantile function is used to calculate histogram quantiles from native histograms or classic histograms. | ||
// Metric name should be provided without _bucket suffix. | ||
nativeClassicHistogramQuantile(percentile, metric, selector, sum_by=[], rate_interval='$__rate_interval'):: | ||
local classicSumBy = if std.length(sum_by) > 0 then ' by (%(lbls)s) ' % { lbls: std.join(',', ['le'] + sum_by) } else ' by (le) '; | ||
local nativeSumBy = if std.length(sum_by) > 0 then ' by (%(lbls)s) ' % { lbls: std.join(',', sum_by) } else ' '; | ||
'histogram_quantile(%(percentile)s, sum%(nativeSumBy)s(rate(%(metric)s{%(selector)s}[%(rateInterval)s]))) or histogram_quantile(%(percentile)s, sum%(classicSumBy)s(rate(%(metric)s_bucket{%(selector)s}[%(rateInterval)s])))' % { | ||
classicSumBy: classicSumBy, | ||
metric: metric, | ||
nativeSumBy: nativeSumBy, | ||
percentile: percentile, | ||
rateInterval: rate_interval, | ||
selector: selector, | ||
}, | ||
|
||
// The classicNativeHistogramSumRate function is used to calculate the histogram sum of rate from native histograms or classic histograms. | ||
// Metric name should be provided without _sum suffix. | ||
nativeClassicHistogramSumRate(metric, selector, rate_interval='$__rate_interval'):: | ||
'histogram_sum(rate(%(metric)s{%(selector)s}[%(rateInterval)s])) or rate(%(metric)s_sum{%(selector)s}[%(rateInterval)s])' % { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. for reviewers e.g., in context of a sum:
|
||
metric: metric, | ||
rateInterval: rate_interval, | ||
selector: selector, | ||
}, | ||
|
||
// The classicNativeHistogramCountRate function is used to calculate the histogram count of rate from native histograms or classic histograms. | ||
// Metric name should be provided without _count suffix. | ||
nativeClassicHistogramCountRate(metric, selector, rate_interval='$__rate_interval'):: | ||
'histogram_count(rate(%(metric)s{%(selector)s}[%(rateInterval)s])) or rate(%(metric)s_count{%(selector)s}[%(rateInterval)s])' % { | ||
metric: metric, | ||
rateInterval: rate_interval, | ||
selector: selector, | ||
}, | ||
|
||
histogramRules(metric, labels, interval='1m'):: | ||
local vars = { | ||
metric: metric, | ||
|
@@ -96,6 +128,75 @@ local g = import 'grafana-builder/grafana.libsonnet'; | |
], | ||
}, | ||
|
||
// not in use yet | ||
// latencyRecordingRulePanelNativeHistogram(metric, selectors, extra_selectors=[], multiplier='1e3', sum_by=[]):: | ||
// local labels = std.join('_', [matcher.label for matcher in selectors]); | ||
// local selectorStr = $.toPrometheusSelector(selectors + extra_selectors); | ||
// local sb = ['le']; | ||
// local legend = std.join('', ['{{ %(lb)s }} ' % lb for lb in sum_by]); | ||
// // sumBy is used in the averge calculation and also for native histograms where 'le' is not used | ||
// local sumBy = if std.length(sum_by) > 0 then ' by (%(lbls)s) ' % { lbls: std.join(',', sum_by) } else ''; | ||
// local sumByHisto = std.join(',', sb + sum_by); | ||
// { | ||
// nullPointMode: 'null as zero', | ||
// yaxes: g.yaxes('ms'), | ||
// targets: [ | ||
// { | ||
// expr: | ||
// ||| | ||
// (histogram_quantile(0.99, sum by (%(sumBy)s) (%(labels)s:%(metric)s:sum_rate%(selector)s)) or | ||
// histogram_quantile(0.99, sum by (%(sumByHisto)s) (%(labels)s:%(metric)s_bucket:sum_rate%(selector)s))) * %(multiplier)s | ||
// ||| % { | ||
// labels: labels, | ||
// metric: metric, | ||
// selector: selectorStr, | ||
// multiplier: multiplier, | ||
// sumBy: sumBy, | ||
// sumByHisto: sumByHisto, | ||
// }, | ||
// format: 'time_series', | ||
// legendFormat: '%(legend)s99th percentile' % legend, | ||
// refId: 'A', | ||
// step: 10, | ||
// }, | ||
// { | ||
// expr: | ||
// ||| | ||
// (histogram_quantile(0.50, sum by (%(sumBy)s) (%(labels)s:%(metric)s:sum_rate%(selector)s)) or | ||
// histogram_quantile(0.50, sum by (%(sumByHisto)s) (%(labels)s:%(metric)s_bucket:sum_rate%(selector)s))) * %(multiplier)s | ||
// ||| % { | ||
// labels: labels, | ||
// metric: metric, | ||
// selector: selectorStr, | ||
// multiplier: multiplier, | ||
// sumBy: sumBy, | ||
// sumByHisto: sumByHisto, | ||
// }, | ||
// format: 'time_series', | ||
// legendFormat: '%(legend)s50th percentile' % legend, | ||
// refId: 'B', | ||
// step: 10, | ||
// }, | ||
// { | ||
// expr: | ||
// ||| | ||
// %(multiplier)s * (histogram_sum(sum(%(labels)s:%(metric)s:sum_rate%(selector)s)%(sumBy)s) or sum(%(labels)s:%(metric)s_sum:sum_rate%(selector)s)%(sumBy)s) / | ||
// (histogram_count(sum(%(labels)s:%(metric)s:sum_rate%(selector)s)%(sumBy)s) or sum(%(labels)s:%(metric)s_count:sum_rate%(selector)s)%(sumBy)s) | ||
// ||| % { | ||
// labels: labels, | ||
// metric: metric, | ||
// selector: selectorStr, | ||
// multiplier: multiplier, | ||
// sumBy: sumBy, | ||
// }, | ||
// format: 'time_series', | ||
// legendFormat: '%(legend)sAverage' % legend, | ||
// refId: 'C', | ||
// step: 10, | ||
// }, | ||
// ], | ||
// }, | ||
|
||
selector:: { | ||
eq(label, value):: { label: label, op: '=', value: value }, | ||
neq(label, value):: { label: label, op: '!=', value: value }, | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for reviewers: e.g.:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be improved.
Imagine you have a fairly long
rate_interval
(the problem exists for all intervals, but the longer, the more serious it gets), e.g. multiple days or so. While migrating to native histograms, you ingest classic and native histograms in parallel. Very soon, the first leg of the query above will yield a result, but it will be based just on a few samples of native histograms, or the last few minutes of the multi-day range.So what you want is to use the leg of the query that has a complete coverage of the
rate_interval
(and in doubt prefer the native histogram). I'll try to come up with a way to do that in the next comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could work:
This assumes the migration is already going from classic to native histograms. First, the query checks if there has been a native histogram 1w ago. If so, it goes for native histograms. Second, it checks if there is a classic histogram now, and uses them. As the third and final option, it uses native histograms (which is relevant for a new metric that never had classic histograms).
Of course, the stupid PromQL engine will calculate all three legs, even if only one will ever be used.