Ingester to wait for some time before joining gossip based ring #1903

Closed
@codesome

Description

@codesome

This issue is particularly relevant when ingester names stay the same across rollouts/restarts, which will be the case when using the WAL with StatefulSets.

In my tests I have found that, because the gossip ring takes some time to propagate its changes across all pods, an ingester that terminates and restarts quickly enough can pick up its old LEAVING state from the ring and cause issues (it just gets stuck).

One possible (hacky) solution is to wait for the ring to settle before contacting it, which is what I have in place in my tests for now.
