Stop early-return after majority success in distributor writes #730

Closed

bboreham opened this issue Mar 2, 2018 · 10 comments

@bboreham
Contributor

bboreham commented Mar 2, 2018

Suppose we have a replication factor of 3, then the writing process is:

  • The distributor replicates the data three times and fires up three goroutines to deliver it.
  • Once two of the calls have returned from ingesters, the distributor returns success.
  • The context is cancelled (in the distributor, because the call ended; in the ruler, to avoid leaking timers).
  • The third call may or may not have updated its ingester.

I think this already causes #670; the idea that each replica may be missing some samples makes me even more queasy.

In a typical Push() call containing 100 samples, the distributor will be writing to all ingesters (in different combinations for each metric), which will make the missing samples issue rare. And until #673 there was no timeout on ruler updates, so that context was never cancelled.

Let's just stop it - wait for all calls to return, or a timeout.
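
For concreteness, here is a minimal Go sketch of the two behaviours. This is not the actual Cortex code: the `sample` and `ingester` types, the hard-coded 2-of-3 quorum, and the `push` helper are toy stand-ins.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Toy stand-ins for the real distributor types and the gRPC Push call.
type sample struct{ value float64 }

type ingester struct {
	name    string
	latency time.Duration
}

// push fails if the distributor cancels the context before this ingester
// has finished applying the write.
func (i ingester) push(ctx context.Context, _ []sample) error {
	select {
	case <-time.After(i.latency):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// replicate sends the samples to all ingesters. With waitForAll=false it
// mimics the current behaviour: return (and cancel) as soon as 2 of 3 calls
// succeed. With waitForAll=true it mimics the proposal: only return once
// every call has finished or the caller's own timeout fires.
func replicate(ctx context.Context, ings []ingester, samples []sample, waitForAll bool) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // in the early-return case this aborts the slowest call

	results := make(chan error, len(ings))
	for _, ing := range ings {
		go func(ing ingester) { results <- ing.push(ctx, samples) }(ing)
	}

	succeeded := 0
	for range ings {
		if err := <-results; err == nil {
			succeeded++
		}
		if !waitForAll && succeeded >= 2 {
			return nil // majority reached: early return, third call gets cancelled
		}
	}
	if succeeded >= 2 {
		return nil
	}
	return errors.New("quorum not reached")
}

func main() {
	ings := []ingester{
		{"A", 5 * time.Millisecond},
		{"B", 5 * time.Millisecond},
		{"C", 50 * time.Millisecond}, // the slightly slower replica
	}
	ctx, cancel := context.WithTimeout(context.Background(), 2500*time.Millisecond)
	defer cancel()
	fmt.Println(replicate(ctx, ings, []sample{{1}}, false)) // C is usually cancelled
	fmt.Println(replicate(ctx, ings, []sample{{1}}, true))  // C gets to finish
}
```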

@rade

rade commented Mar 2, 2018

Let's just stop it - wait for all calls to return, or a timeout.

Doesn't that mean one misbehaving ingester - being really slow or never responding - would slow down everything?

@bboreham
Contributor Author

bboreham commented Mar 2, 2018

Doesn't that mean one misbehaving ingester - being really slow or never responding - would slow down everything?

Probably. The default distributor timeout is 2.5 seconds.

But, thinking about it some more, in the current code one ingester being a bit slower than all the others would result in it having most of its calls cancelled, hence it will be missing lots of samples. Now it looks like a replica but isn't really.

bboreham changed the title from "Stop early-return after majority success in distributor" to "Stop early-return after majority success in distributor writes" on Mar 2, 2018
@csmarchbanks
Contributor

We were curious about this, so we made a PR and are currently testing it out in a staging instance: #732

@limscoder

@bboreham Did you see the PR for this? #732

@bboreham
Contributor Author

bboreham commented Mar 6, 2018

Sorry, been doing other things for a few days.

@bboreham
Contributor Author

bboreham commented Mar 7, 2018

To clarify my last point, suppose you have ingesters A, B, C. The distributor sends a sample X to A and B, then early-returns and cancels the call to C before C receives sample X.

Shortly after, B crashes and restarts.

Now a querier requests data from A, B, C; receives results from B and C and returns them. The result is missing sample X because C never got it and B crashed, losing its memory.

This demonstrates that we cannot withstand a single failure (B).
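
A self-contained toy of that sequence, with in-memory maps standing in for ingesters (illustration only, not Cortex code):

```go
package main

import "fmt"

func main() {
	// Three "ingesters" holding samples in memory, keyed by sample name.
	a, b, c := map[string]bool{}, map[string]bool{}, map[string]bool{}

	// The distributor sends sample X to A and B; the early return cancels
	// the call to C before C receives it.
	a["X"], b["X"] = true, true

	// B crashes and restarts, losing its in-memory state.
	b = map[string]bool{}

	// A querier asks A, B and C and returns once two of them have answered;
	// here the two that answer are B and C.
	merged := false
	for _, resp := range []map[string]bool{b, c} {
		merged = merged || resp["X"]
	}
	fmt.Println("sample X visible to the query:", merged) // false: lost after a single failure
}
```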

@bboreham
Contributor Author

bboreham commented Mar 9, 2018

#100 talks about only writing to two ingesters

@csmarchbanks
Contributor

I saw #100 a couple of days ago, but I think it would exacerbate the use case you describe above. And it is very confusing (and concerning to me) that despite a replication factor of 3, any piece of data will only ever live on 2 ingesters.

@bboreham
Contributor Author

bboreham commented Mar 9, 2018

Having read that part of the Jeff Dean paper, I don't think it is talking about the case where the data never goes further than the memory of the servers you send it to.

Also, on the write side, why do we care that the sending Prometheus sees a 200ms response instead of 100ms? It will create "shards" to do multiple calls in parallel and get the same throughput.

cc @tomwilkie for interest

@bboreham
Contributor Author

bboreham commented Mar 9, 2018

We went with letting the calls complete in the background. Fixed by #736
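
For reference, a rough sketch of what "complete in the background" can look like (an assumed shape, not the actual #736 diff; it reuses the toy `ingester` and `sample` types from the sketch in the opening comment): the caller still gets its answer as soon as the quorum succeeds, but the remaining call keeps running on a context that is only cancelled once everything has finished or timed out.

```go
// Sketch only: return to the caller after a 2-of-3 quorum, but let the
// remaining call finish in the background instead of cancelling it.
func replicateInBackground(ings []ingester, samples []sample) error {
	// Detached from the incoming request's context, but bounded by its own
	// timeout, so a dead ingester cannot hold resources forever.
	ctx, cancel := context.WithTimeout(context.Background(), 2500*time.Millisecond)

	results := make(chan error, len(ings))
	for _, ing := range ings {
		go func(ing ingester) { results <- ing.push(ctx, samples) }(ing)
	}

	quorum := make(chan error, 1)
	go func() {
		defer cancel() // only cancel once every call has returned or timed out
		succeeded := 0
		for range ings {
			if err := <-results; err == nil {
				succeeded++
				if succeeded == 2 {
					quorum <- nil // majority reached: unblock the caller
				}
			}
		}
		if succeeded < 2 {
			quorum <- errors.New("quorum not reached")
		}
	}()

	// The caller returns as soon as the quorum is reached; the goroutine
	// above keeps the slowest call alive until it completes on its own.
	return <-quorum
}
```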

bboreham closed this as completed Mar 9, 2018