Description
Issue tracker is used for reporting bugs and discussing new features. Please use
stackoverflow for supporting issues.
After updating from v9.3.0 to v9.3.1, we started seeing issues where a pipeline to create a Redis Stream is no longer properly setting up the Stream. This issue persists on v9.4.0
Expected Behavior
Redis Ring should be able to run a pipeline to create a stream, then read from the same stream.
Current Behavior
Redis Ring sporadically fails. We run 80+ servers and a small percentage (1-5%) of them seem to write/read from mismatched shards.
Steps to Reproduce
- Set up a Ring shard
- Run an
xgroup
command and notice thefirstKeyPosition
is 1 instead of 2
Context (Environment)
We are running Redis Ring with 3 shards, with 80+ servers reading from them.
Detailed Description
I went through the v9.3.1 release one commit at a time to see which produced the issue, and I landed on 8c69548 as the source of the issue (cc @ofekshenawa as this was from your PR). Once I pinned that down, I focused in on some specific commands we are having issues with:
- Pipeline (These are used to create a Stream and set an expiration on it. This is a bit of a hack to get an empty stream created.)
- xgroup create
- xgroup destroy
- expire
- exists (this checks to verify the stream exists and isn't expired)
After the changes from 8c69548, these commands execute without errors, but the exists
check returns false on a small percentage of our servers. To me, the most likely reason is that the stream was created on a different shard than it's being read from.
I did a little digging locally and noticed that the changes to cmdFirstKeyPos
do indeed change the first key position for the xgroup
command. When it checked from CommandInfo
, it got a key position of 2
, but with the removal of CommandInfo
, it now receives a key position of 1
(the default return value).
Possible Implementation
In terms of a fix, we could either bring back the CommandInfo
being passed in to cmdFirstKeyPos
, or we could add another case to the switch within cmdFirstKeyPos
to explicitly handle the xgroup
command and return 2