Description
Describe the bug
With extend-writes enabled together with AZ-aware replication on ingesters, remote_write can fail when multiple ingesters in the same AZ fail.
Consider a cluster with 4 ingesters: ingester-A (az-1), ingester-B (az-1), ingester-C (az-2), and ingester-D (az-3). Ingester-A is in the leaving state, while ingester-B is in the unhealthy state due to an unclean shutdown (an OOM, for example). Ingester-C and ingester-D are healthy.
In https://github.com/cortexproject/cortex/blame/84f240e058eaa0e50889252f60ce72643b5a62c8/pkg/ring/ring.go#L387 we'll select all ingesters, even though ingesters A and B are in the same AZ, because ingester-A is not in a healthy state.
In https://github.com/cortexproject/cortex/blame/84f240e058eaa0e50889252f60ce72643b5a62c8/pkg/ring/replication_strategy.go#L36, since we pass in 4 ingesters, minSuccess is now (4/2) + 1 = 3. However, we only have 2 healthy instances, because both ingesters in az-1 are in a degraded state. This triggers the check at https://github.com/cortexproject/cortex/blame/84f240e058eaa0e50889252f60ce72643b5a62c8/pkg/ring/replication_strategy.go#L54, which fails the write immediately.
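The arithmetic above can be reproduced with a small standalone model. This is only a sketch of the behavior described in this issue, not the Cortex code itself: the `instance` type, the `filter` function, and the replication factor of 3 are assumptions made for illustration.

```go
package main

import "fmt"

// instance is a hypothetical, simplified stand-in for a ring entry.
type instance struct {
	name    string
	zone    string
	healthy bool // whether the instance counts as healthy for writes
}

// filter models the quorum check described in this issue: minSuccess is
// computed from the full (extended) replica set, unhealthy instances are
// dropped afterwards, and the write is rejected up front if fewer than
// minSuccess healthy instances remain.
func filter(instances []instance, replicationFactor int) ([]instance, int, error) {
	if len(instances) > replicationFactor {
		replicationFactor = len(instances)
	}
	minSuccess := replicationFactor/2 + 1

	var healthy []instance
	for _, in := range instances {
		if in.healthy {
			healthy = append(healthy, in)
		}
	}
	if len(healthy) < minSuccess {
		return nil, 0, fmt.Errorf("need %d healthy replicas, found %d", minSuccess, len(healthy))
	}
	return healthy, len(healthy) - minSuccess, nil
}

func main() {
	set := []instance{
		{"ingester-A", "az-1", false}, // leaving
		{"ingester-B", "az-1", false}, // unhealthy after OOM
		{"ingester-C", "az-2", true},
		{"ingester-D", "az-3", true},
	}
	// Replication factor 3 is an assumption; the issue does not state it.
	_, _, err := filter(set, 3)
	fmt.Println(err)
}
```

With 4 instances passed in, minSuccess is 3, only 2 instances are healthy, and `filter` returns an error before any write is attempted.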
There is another, similar issue when ingester-A is in the leaving state while ingester-B is still in the active state after an unclean shutdown and has not yet reached the heartbeat timeout.
In https://github.com/cortexproject/cortex/blame/84f240e058eaa0e50889252f60ce72643b5a62c8/pkg/ring/replication_strategy.go#L70, the distributor will require 3 successful ingester writes because minSuccess is 3 and the number of instances is also 3, leaving no room for failures. The distributor will attempt to write to ingester-B, ingester-C, and ingester-D, and the write will fail since ingester-B is actually unavailable.
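The second scenario can be modeled the same way. The sketch below is illustrative only: `quorum` and `writeSucceeds` are hypothetical helpers, and the replication factor of 3 is an assumption. It shows that with 4 selected instances and 3 that pass the health check, the number of tolerated failures is zero, so a single failed write to ingester-B fails the whole request.

```go
package main

import "fmt"

// quorum models the arithmetic described in this issue: minSuccess is
// derived from the larger of the replication factor and the number of
// selected instances, and maxFailures is what remains after subtracting
// minSuccess from the instances that passed the health check.
func quorum(selected, healthy, replicationFactor int) (minSuccess, maxFailures int) {
	if selected > replicationFactor {
		replicationFactor = selected
	}
	minSuccess = replicationFactor/2 + 1
	return minSuccess, healthy - minSuccess
}

// writeSucceeds reports whether a write is accepted, given the outcome of
// the write to each instance and the number of tolerated failures.
func writeSucceeds(results map[string]bool, maxFailures int) bool {
	failures := 0
	for _, ok := range results {
		if !ok {
			failures++
		}
	}
	return failures <= maxFailures
}

func main() {
	// 4 instances selected (A, B, C, D); A is leaving and is filtered out,
	// B still looks active because its heartbeat has not timed out yet.
	minSuccess, maxFailures := quorum(4, 3, 3)

	results := map[string]bool{
		"ingester-B": false, // actually down after the unclean shutdown
		"ingester-C": true,
		"ingester-D": true,
	}
	fmt.Println(minSuccess, maxFailures, writeSucceeds(results, maxFailures))
}
```

Two of three writes succeed, which would satisfy a quorum of the original replication factor, but the extended replica set raises minSuccess to 3 and the write is rejected.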
To Reproduce
Steps to reproduce the behavior:
- Start Cortex (SHA or version)
- Perform Write Operations
- Trigger an unclean shutdown for 1 ingester, and start shutting down another ingester in the same AZ
Expected behavior
I expect extend-writes to work with just a quorum of available ingesters, since extend-writes should be best-effort.
Environment:
- Infrastructure: kubernetes
- Deployment tool: helm
Storage Engine
- Blocks
- Chunks
Additional Context