Skip to content

Add log message for partial failure on push #4853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

danielblando
Copy link
Contributor

@danielblando danielblando commented Sep 7, 2022

What this PR does:
Add partial failure logs to ingester push. We currently have logs on grpc_logging which just log the first error. This gives us more visibility on what is occurring in the ingester.

New log:

level=warn ts=2022-09-07T06:59:57.34613661Z caller=ingester.go:1009 org_id=XXXXXXX msg="partial failure to push" err="cannot add series: ingesters's max series limit reached"

Existent log:

level=warn ts=2022-09-07T06:59:58.084669264Z caller=grpc_logging.go:38 method=/cortex.Ingester/Push duration=83.84µs err="user=xxxxxxxx: cannot add series: ingesters's max series limit reached" msg="gRPC\n"

Which issue(s) this PR fixes:

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@danielblando danielblando force-pushed the improve-logging-ingester branch from 36aea64 to 7097003 Compare September 7, 2022 16:27
@danielblando danielblando marked this pull request as ready for review September 7, 2022 17:04
@@ -1006,6 +1006,7 @@ func (i *Ingester) Push(ctx context.Context, req *cortexpb.WriteRequest) (*corte
// of it, so that we can return it back to the distributor, which will return a
// 400 error to the client. The client (Prometheus) will not retry on 400, and
// we actually ingested all samples which haven't failed.
level.Warn(logutil.WithContext(ctx, i.logger)).Log("msg", "partial failure to push", "err", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious on if this log would be too noisy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't expect it to be. This is just in failures cases to append to tsdb, because of limit or storage issue. The idea behind is to use this with the new header to log feature (#4803). That would allow us to have a tracker between distributor and ingester issue.
Currently ingester does not log any message on this type of failures, and the log in the server middleware does not implement the header logs feature from cortex, which seems correct as it is a generic logging.

@alvinlin123 alvinlin123 merged commit 1cf71ba into cortexproject:master Sep 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants