Skip to content

Conversation

jtlisi
Copy link
Contributor

@jtlisi jtlisi commented Apr 7, 2020

What this PR does:

This PR gives the operator the option to add an evaluation delay interval to the Ruler. This can be configured to ensure the ruler does not evaluate rules before the data required for the evaluation of those rules is written to Cortex.

Which issue(s) this PR fixes:
Fixes #1723

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Jacob Lisi <[email protected]>
@jtlisi jtlisi changed the title ensure rules are evaluated on a delayed interval Configurable delay on rule evaluation Apr 7, 2020
cfg.ExternalURL.URL, _ = url.Parse("") // Must be non-nil
f.Var(&cfg.ExternalURL, "ruler.external.url", "URL of alerts return path.")
f.DurationVar(&cfg.EvaluationInterval, "ruler.evaluation-interval", 1*time.Minute, "How frequently to evaluate rules")
f.DurationVar(&cfg.EvaluationDelay, "ruler.evaluation-delay", 0, "Set interval to delay the evaluation of rules to ensure they underlying metrics have been pushed to cortex. Default ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f.DurationVar(&cfg.EvaluationDelay, "ruler.evaluation-delay", 0, "Set interval to delay the evaluation of rules to ensure they underlying metrics have been pushed to cortex. Default ")
f.DurationVar(&cfg.EvaluationDelay, "ruler.evaluation-delay", 0, "Set interval to delay the evaluation of rules to ensure they underlying metrics have been pushed to cortex. ")

// the given engine, after subtracting the provided delay from the instant query timestamp.
// It converts scalar into vector results.
// Based on https://github.com/prometheus/prometheus/blob/ecda6013edf58bf645c6661b9f78ccce03b1f315/rules/manager.go#L162-L187
func engineQueryFunc(engine *promql.Engine, q storage.Queryable, delay time.Duration) rules.QueryFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole function is a copy of promRules.EngineQueryFunc() just to add t = t.Add(-delay) as first thing. Why don't we simply wrap promRules.EngineQueryFunc() instead of duplicating it? Something like (pseudo-code):

func engineQueryFunc(engine *promql.Engine, q storage.Queryable, delay time.Duration) rules.QueryFunc {
orig := promRules.EngineQueryFunc(engine, queryable)

return func(ctx context.Context, qs string, t time.Time) (promql.Vector, error) {
  return orig(ctx, qs, t.Add(-delay))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

# Set interval to delay the evaluation of rules to ensure they underlying
# metrics have been pushed to cortex. Default
# CLI flag: -ruler.evaluation-delay
[evaluation_delay: <duration> | default = 0s]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluation_delay_time/evaluation_delay_duration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluation_delay_duration makes sense

// the given engine, after subtracting the provided delay from the instant query timestamp.
// It converts scalar into vector results.
// Based on https://github.com/prometheus/prometheus/blob/ecda6013edf58bf645c6661b9f78ccce03b1f315/rules/manager.go#L162-L187
func engineQueryFunc(engine *promql.Engine, q storage.Queryable, delay time.Duration) rules.QueryFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Signed-off-by: Jacob Lisi <[email protected]>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my feedback. LGTM (remember make doc)

Signed-off-by: Jacob Lisi <[email protected]>
@jtlisi jtlisi merged commit 226adc6 into cortexproject:master Apr 8, 2020
@jtlisi jtlisi deleted the 20200407_ruler_eval_delay branch April 8, 2020 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ruler doesn't take account of delays in incoming data

3 participants