-
Notifications
You must be signed in to change notification settings - Fork 816
Update prometheus to v2.16.0 #2088
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update prometheus to v2.16.0 #2088
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Just update the logic for the temp directory in the ruler_test so it's called after the test is completed not the newTestRuler
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job @pstibrany ! Changes look good to me. I left few comments I would be glad if you could take a look.
I've rebased and addressed your feedback. Now instead of creating temp dir for Active Query Tracker, we disable it, if no dir is configured. Dir defaults to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pstibrany for addressing my comments. I left a last comment in the code changes to fix the YAML field name for the query tracker.
I went through the vendor/github.com/prometheus
changes and a think catched my attention. The diff in vendor/github.com/prometheus/prometheus/web/api/v1/api.go
shows that two new fields have been added to the exported rules:
EvaluationTime float64 `json:"evaluationTime"`
LastEvaluation time.Time `json:"lastEvaluation"`
To keep Prometheus compatibility, I think we should add the support for these two new fields to the ruler API endpoints. @jtlisi ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Peter! Changes LGTM.
I went through the vendor/github.com/prometheus changes and a think catched my attention. The diff in vendor/github.com/prometheus/prometheus/web/api/v1/api.go shows that two new fields have been added to the exported rules.
About this comment, to me we can address it in a separate PR. It doesn't break the current logic, it's just a new "feature".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Withdrawing my approval because we found out that the read path is failing with the error missing port in address
.
There is now workaround for this, basically we fake the address in I've also sent PR to Prometheus to be less strict about missing or invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pstibrany! I'm OK keeping the temporarily fakeRemoteAddr()
until the fix won't be upstream.
….0-rc.0) Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
That change is currently incompatible with our serialization code, so until that is fixed, I've copied previous state of rulefmt from Prometheus (github.com/prometheus/prometheus/pkg/rulefmt). It's a small package, and I've included tests as well. Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Also explains why we use temp dir, if it's not configured. Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
…fe654cc) Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Disable AQT if no directory is set. Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Since our Select methods already return sorted results, it makes sense to call them SelectSorted, and let Select simply call SelectSorted, rather then the other way around. Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
…Tracker Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Prometheus requires non-empty RemoteAddr, but only uses it when Engine actually has QueryLogger set. Which it doesn't, in Cortex. Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pstibrany for addressing the rebase! The changes after the rebase LGTM.
Signed-off-by: Peter Štibraný <[email protected]>
What this PR does: This PR updates Prometheus dependency to v2.16.0 (b90be6f32a) (was -rc.0 and -rc.1 before). This is required to finish work on issue #2003.
Recent Prometheus changes introduced some incompatibilities:
promql.EngineOpts
struct had itsMaxConcurrent
field removed, and functionality replaced with active query tracker. Cortex' querier therefore enables the tracker to keep-querier.max-concurrent
flag still working. Since tracker needs a directory to write files to, Cortex uses./active-query-tracker
by default. Update: If directory is explicitly set to empty string by using-querier.active-query-tracker-dir=""
, Active Query Tracker is disabled. (Previously, random temp directory was created for empty dir)Changes in rulefmt package modified
RuleNode
type, which replaced previously usedRule
. Unfortunately, while this provides better error reporting when doing deserialization, it makes it impossible to serialize this structure back to YAML because of problem in yaml.v3 package. Until we find a better solution, I've made a copy of rulefmt package before the change into Cortex (as "legacy_rulefmt"). It's a very tiny package, but admittedly, this is not a good solution.Prometheus also added
SelectSorted
method tostorage.Querier
interface, but fortunately all our queriers already sorted their output inSelect
method anyway, so all implementations just call that.promql.ParseErr
has lost itsLine
andPos
fields and now only hasPositionRange
with start and end location. We use this in/api/prom/validate_expr
API endpoint to report query syntax problems. This endpoint doesn't seem to be documented anywhere and was added by request from Grafana Observability team. It doesn't seem to be used as of now. If it's not used anywhere else, it might be possible to remove it completely (it's not provided by Prometheus). For now I've changed endpoint to returnstart
andend
position instead ofline
andcol
fields. /cc @davkal. Update: this functionality has been completely removed by Remove unused /validate_expr endpoint. #2152.Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]