Skip to content

[Experimental] Memberlist singleton and support for multiple codecs #2016

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Feb 12, 2020
Merged

[Experimental] Memberlist singleton and support for multiple codecs #2016

merged 29 commits into from
Feb 12, 2020

Conversation

pstibrany
Copy link
Contributor

@pstibrany pstibrany commented Jan 21, 2020

This PR changes experimental memberlist implementation of KV store in following ways:

  • memberlist.Client is now only a small wrapper around memberlist.KV. KV is a top-level cortex component (but cannot be run on its own), and is a singleton. KV is only initialized if any component actually wants to use memberlist as its KV store.
  • Instead of using custom binary protocol, memberlist KV store now exchanges protobuf messages defined in kv.proto
  • memberlist.KV now supports multiple codecs. For now, this is only used in tests. Codecs must be registered during KV initialization.

These changes allow Cortex in single-binary mode to use memberlist as KV store. Previously, such attempts would fail with panic, since all components (distributor/ingester/querier) tried to create a memberlist instance, and it failed with metrics registration. Even if it didn't fail there, it wouldn't work because each instance would try to open same port for listening.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pstibrany pstibrany changed the title Memberlist multicodec [Experimental] Memberlist singleton and support for multiple codecs Jan 23, 2020
@pstibrany pstibrany marked this pull request as ready for review January 23, 2020 07:34
@tomwilkie tomwilkie requested a review from jtlisi February 11, 2020 09:07
Copy link
Contributor

@jtlisi jtlisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides some small nits

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job @pstibrany! I left few minor comments.

Would you be up to add a integration test with single-binary Cortex and gossip to (hopefully) freeze the current implementation from future regressions?

@pstibrany
Copy link
Contributor Author

Would you be up to add a integration test with single-binary Cortex and gossip to (hopefully) freeze the current implementation from future regressions?

Added integration test that verifies that Cortex in single-binary mode works using memberlist setup.

This allows multiple clients to use the same memberlist.KV instance.

This is another step towards supporting multiple codecs per KV.

Signed-off-by: Peter Štibraný <[email protected]>
To support multiple codecs, we need to add a new field. As we're
going to break the format anyway, we may as well switch to protobuf.

Signed-off-by: Peter Štibraný <[email protected]>
Codec is now stored in the client, while KV has a registry of codecs that are used
when receiving values.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
It was introduced for memberlist Client, but it will not be
needed anymore as memberlist.KV will be a top-level component
with its own lifecycle, separated from the client.

Signed-off-by: Peter Štibraný <[email protected]>
It is initialized if any component actually uses it, but not sooner.
If initialized, Stop method is called when server shuts down.

Signed-off-by: Peter Štibraný <[email protected]>
Client now only verifies that KV knows about coded that
client wants to use, but doesn't register it anymore.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
and all distributors can see all ingesters and their tokens.

Signed-off-by: Peter Štibraný <[email protected]>
Added note on how to upgrade.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I've found out this test is quite slow due to the table-manager stop. Look at the timestamps (I've enabled info log level):

cortex-1: level=info ts=2020-02-12T15:50:19.5615697Z caller=cortex.go:281 msg=stopping module=table-manager
cortex-1: level=info ts=2020-02-12T15:50:36.8263139Z caller=cortex.go:281 msg=stopping module=runtime-config

This is caused by:

time.Sleep(time.Duration(rand.Int63n(int64(m.cfg.DynamoDBPollInterval))))

Would you mind fixing it in a separate PR, please? I should be just a matter of changing the table manager initial time.Sleep() into a select which returns if m.done is closed.

@pracucci pracucci merged commit 08a5e87 into cortexproject:master Feb 12, 2020
@pstibrany pstibrany deleted the memberlist_multicodec branch February 13, 2020 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants