Description
Describe the bug
This problem happens because memberlist ignores heartbeat timestamps that are older than the timestamp in memory.
This logic is implemented in the Merge function (lines 189 to 201 in 23aed55).
The logic itself is correct, but when an ingester has clock skew and sends a heartbeat with a timestamp in the future (e.g. 1 day ahead), that entry stays valid until that timestamp is reached. The "Forget" action in the "Cortex Ring Status" UI does not work either, because the other members ignore the update.
The ingester also cannot rejoin the ring until that timestamp has passed. If we remove the ingester, fix its clock and add it back, the ring does not change because the new information is ignored.
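To illustrate the behavior described above, here is a minimal sketch of a "newest timestamp wins" merge. It is a simplified model, not the actual Cortex code: the `entry` type, the `merge` function and the `ingester-1` ID are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"time"
)

// entry is a hypothetical, simplified stand-in for a ring entry.
type entry struct {
	Timestamp int64 // heartbeat time, Unix seconds
	Left      bool  // true once the instance has been forgotten
}

// merge applies an incoming update only if it is strictly newer than
// the entry already in memory, and reports whether it was applied.
func merge(ring map[string]entry, id string, in entry) bool {
	cur, ok := ring[id]
	if ok && in.Timestamp <= cur.Timestamp {
		return false // looks stale from the receiver's point of view
	}
	ring[id] = in
	return true
}

func main() {
	now := time.Date(2021, 1, 1, 0, 0, 0, 0, time.UTC)
	ring := map[string]entry{}

	// Heartbeat from an ingester whose clock is one day ahead.
	skewed := now.Add(24 * time.Hour).Unix()
	fmt.Println(merge(ring, "ingester-1", entry{Timestamp: skewed}))

	// A "Forget" issued with the correct time is ignored.
	fmt.Println(merge(ring, "ingester-1", entry{Timestamp: now.Unix(), Left: true}))

	// So is a heartbeat sent after the clock has been fixed.
	fmt.Println(merge(ring, "ingester-1", entry{Timestamp: now.Add(time.Minute).Unix()}))
}
```

In this model only the first update is applied. Every later update with a correct timestamp is dropped until the wall clock passes the skewed value.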
To Reproduce
Steps to reproduce the behavior:
- Start Cortex (master@25726a168a9b5a9394c9e7c7ff67cdffed66a347)
- Start a new ingester on a new instance whose clock is in the future (complicated to reproduce)
- Check your Cortex ring status to confirm the new ingester entry has a timestamp in the future
- Kill the new ingester
- Check that the new ingester remains in the ring
- Even trying to forget it does not work, because the existing timestamp is newer
Expected behavior
I would expect the ring to handle clock skew. It could either accept the information and provide another way to forcefully remove the bad ingester, or reject information whose timestamp is too far in the future.
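As a rough sketch of the second option, the receiver could compare each incoming heartbeat with its own clock and drop anything too far ahead. The `withinSkew` helper and the `maxSkew` tolerance below are assumptions made for illustration; they are not existing Cortex functions or options.

```go
package main

import "fmt"

// withinSkew reports whether a heartbeat timestamp (Unix seconds) is at
// most maxSkew seconds ahead of the receiver's clock. Both the function
// and the tolerance are hypothetical.
func withinSkew(now, heartbeat, maxSkew int64) bool {
	return heartbeat-now <= maxSkew
}

func main() {
	const now = int64(1609459200) // receiver's clock, Unix seconds
	const maxSkew = int64(60)     // tolerate up to one minute of drift

	fmt.Println(withinSkew(now, now+10, maxSkew))       // small drift
	fmt.Println(withinSkew(now, now+24*60*60, maxSkew)) // one day ahead
}
```

A check like this would run before the merge, so a heartbeat from one day in the future would never enter the ring. Picking the tolerance is a trade-off: too small and normal drift between nodes causes valid heartbeats to be rejected.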
Environment:
- Infrastructure: Kubernetes
- Deployment tool: helm
Storage Engine
- Blocks
- Chunks
Additional Context