Skip to content

Conversation

gotjosh
Copy link
Contributor

@gotjosh gotjosh commented Mar 26, 2020

What this PR does:

Now that Prometheus remote write supports sending of Metadata, this change allows Cortex to receive the metadata and shard it accordingly.

Which issue(s) this PR fixes:

This partially implements the proposal accepted as part of #2311

Checklist

  • e2e test added
  • Unit tests for the distributors
  • TODOs resolved
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

I've decided to split this across two PRs to manage the changes made and make it easy on the reviewer. There are still some lingering questions that I'm hoping to resolve with the help of any maintainer.

Signed-off-by: gotjosh [email protected]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only here temporary, I'd like to some feedback on what I've done before I open a PR with the rest of the change.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job @gotjosh (and sorry for the delay in the initial review). The overall design LGTM. I'm a bit dubious about the increased complexity of Distributor.Push() but I would like to discuss it in a second pass. My main feedback is that I've the feeling PreallocMetricMetadata is unncessary and just makes things more complicated.

@gotjosh gotjosh force-pushed the distributors-metadata branch 2 times, most recently from f5a9ac0 to 9743ba9 Compare April 1, 2020 20:09
@gotjosh gotjosh requested a review from pracucci April 2, 2020 09:45
@gotjosh gotjosh force-pushed the distributors-metadata branch from 9743ba9 to 073aa36 Compare April 2, 2020 10:49
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good! I just left few small comments. A part from this, I would be glad to see the Push() logic covered in a unit test if it's not too complicated (given the ingester logic has not been built yet), otherwise you can do it in the next PR (up to you).

Comment on lines 526 to 528
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's true that a push request can only contain series or metadata, why don't we return error? I would cost us nearly zero effort to make it not a best-effort from the Cortex side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's true that a push request can only contain series or metadata, why don't we return error?

It is for now, but more importantly, it is something that we think will change in the future (given other Prometheus integrations e.g. Stackdriver) require metadata to be propagated with the series. The key takeaways here are a) We don't want to couple that implementation with Cortex and b) We want to prioritise samples over metadata.

I think we can this conversation again when I submit the refactor to send series + samples together to ingesters as the problem will be pushed there, but for now, I would feel more confident to just log the error for now and make sure we ingester the samples. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code above there's the following comment. I think it's misleading. Either we assume a request can contain both or not. I'm OK if we do, but we should simply reflect that comment just saying A WriteRequest can include samples and metadata.

	// A WriteRequest can now include samples and metadata. However, at the moment of writing,
	// we can only receive samples and no metadata or metadata and no samples.

About whether returning error or not, I'm OK keeping it as is (just logging error). We can always change it in the future to more stricter policy if required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code above there's the following comment. I think it's misleading. Either we assume a request can contain both or not.

It's a bit tricky, but I see your point. I'll change to state:

A WriteRequest can only contain series or metadata but not both. This might change in the future.

@gotjosh gotjosh force-pushed the distributors-metadata branch from 9f96443 to df60bc9 Compare April 2, 2020 21:30
@gotjosh gotjosh requested a review from pracucci April 3, 2020 01:34
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gotjosh for your hard work! You did a great job and this first PR LGTM 👏

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't finished my review yet, but I don't want to sit on feedback I have so far.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has changed significantly, and mixes handling of samples and metadata. Would it make sense to split Push into two, to handle these two cases separately? (Push calling pushMetadata and pushSamples or something similar).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to add another change that merges the two. Since ingesters receive a WriteRequest, they can just go together. I just wanted to avoid a big refactor upfront.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok. Push is quite big at the moment and could use some trimming.

gotjosh and others added 10 commits April 3, 2020 13:11
Now that Prometheus remote write supports sending of Metadata, this change allows Cortex to receive the metadata and shard it acordingly.

I've decided to split this across two PRs to manage the changes made and make it easy on the reviewer.

Signed-off-by: gotjosh <[email protected]>
Co-Authored-By: Marco Pracucci <[email protected]>
Signed-off-by: gotjosh <[email protected]>
Co-Authored-By: Marco Pracucci <[email protected]>
Signed-off-by: gotjosh <[email protected]>
Co-Authored-By: Marco Pracucci <[email protected]>
Signed-off-by: gotjosh <[email protected]>
Co-Authored-By: Marco Pracucci <[email protected]>
Signed-off-by: gotjosh <[email protected]>
Signed-off-by: gotjosh <[email protected]>
- Continue validating the rest of the metadata on error.
- Move the subRing initialisationg after the null check.
- Supply a "msg" key to the metadata failure log line.
- Remove TODO about the rate limits

Signed-off-by: gotjosh <[email protected]>
gotjosh added 2 commits April 3, 2020 13:28
- Removed the checks around metadata/samples iteration. Don't need them
  anymore.
- Changed the comment wrt WriteRequest samples/metadata handling.

Signed-off-by: gotjosh <[email protected]>
Signed-off-by: gotjosh <[email protected]>
@gotjosh gotjosh force-pushed the distributors-metadata branch from 2ea3164 to ca5e188 Compare April 3, 2020 12:47
Signed-off-by: gotjosh <[email protected]>
@gotjosh gotjosh requested a review from pstibrany April 3, 2020 13:27
@pull-request-size pull-request-size bot added size/L and removed size/XL labels Apr 3, 2020
Comment on lines +357 to +360
func (d *Distributor) validateMetadata(m *ingester_client.MetricMetadata, userID string) error {
return validation.ValidateMetadata(d.limits, userID, m)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this method brings anything at the moment. Should we get rid of it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you're going to work on this, let's move it to the next PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok. Push is quite big at the moment and could use some trimming.

@pstibrany
Copy link
Contributor

pull-request-size bot added size/L and removed size/XL labels 3 hours ago

haha, good bot :)

@pstibrany
Copy link
Contributor

@gotjosh Nice job, looks good to me. Can we fix CHANGELOG entry formatting? :)

Signed-off-by: Peter Štibraný <[email protected]>
@pracucci pracucci merged commit 8fe97dc into cortexproject:master Apr 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants