Discussion: Blocking operations #1961

Open · NickCraver opened this issue Jan 24, 2022 · 5 comments

@NickCraver
Collaborator

We've gotten asks for this in things like #1474, #1593, and #1675.

The general ask is support for blocking operations like WAIT, but also others like XREAD. The general problem is that on a multiplexed connection, every command queued behind the blocking command waits on it for an indefinite time period, so this is a very easy footgun to wield.

Proposals

One idea has been something like an IBlockingDatabase or similar, where a user would have to explicitly cast to an interface to use it, but the more I think about it, maybe that's not the correct abstraction. For one thing: these operations are per-client, per-database, per-stream, and so on - so there would be multiple interfaces. But overall, the effect is per connection (or, specifically in our internal nomenclature, per bridge).

What if instead this were an option on ConfigurationOptions? We could have one bool there that allows that multiplexer to perform blocking operations. Any attempt to call a blocking operation without it enabled would throw an informative error. As these would be new commands or overloads not in play today, there wouldn't be any API break. The error message could point to our GitHub Pages help explaining what this means and how to be careful.
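
For shape only, a minimal sketch of what the opt-in could look like; AllowBlockingCommands is hypothetical and nothing below exists in the library today:

// Hypothetical: the AllowBlockingCommands flag is the proposed addition, not current API.
var options = ConfigurationOptions.Parse("localhost:6379");
options.AllowBlockingCommands = true; // "yes, I accept the disclaimer"
var muxer = await ConnectionMultiplexer.ConnectAsync(options);
var db = muxer.GetDatabase();

// Without the flag, a blocking overload would throw something along these lines:
// InvalidOperationException: "Blocking commands are not enabled on this multiplexer;
//   set AllowBlockingCommands=true - see the docs for what this means for head-of-line blocking."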

I have worries about connection pooling with respect to ordering, but I believe these use cases are completely orthogonal to ordering, so if we get pooling, this same options approach could be used to spin up n bridges or whatever approach we take there. Today we wouldn't want to do that by default because many users have connection limits and don't want the subscriber connection either (which isn't something we can disable today).

I'm curious what others think of this: basically, an option that says "yes, I accept the disclaimer" to allow blocking commands?

Downstream problems

Of note, though: there is a downstream delineation of impact here because we're on sockets that have TTLs. For something like XREAD we could have an IAsyncEnumerable API with stream results streaming out, but internally we could get X items every N ms (e.g. XREAD COUNT 50 BLOCK 200), only block for that long, then try again in a while loop - basically asynchronously blocking that multiplexer. This is fairly straightforward if the N ms is under 1,000.
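
As a sketch only - the method name and result parsing here are mine, not library API, and each short XREAD ... BLOCK 200 call still pauses this multiplexer's pipe for up to 200 ms - the loop could look roughly like this using today's ExecuteAsync escape hatch:

using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using StackExchange.Redis;

// Illustrative only: emulate a long-running stream read with short blocking reads in a loop.
static async IAsyncEnumerable<RedisResult> ReadStreamAsync(
    IDatabase db,
    string stream,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    var lastId = "$"; // start from "new entries only"
    while (!cancellationToken.IsCancellationRequested)
    {
        // XREAD COUNT 50 BLOCK 200 STREAMS <stream> <lastId>
        // Each call blocks the connection for at most 200 ms, staying under the heartbeat window.
        var reply = await db.ExecuteAsync(
            "XREAD", "COUNT", "50", "BLOCK", "200", "STREAMS", stream, lastId);

        if (reply.IsNull) continue; // nothing arrived in this window; go block again

        var streams = (RedisResult[])reply;               // one element per stream requested
        var streamAndEntries = (RedisResult[])streams[0]; // [stream name, entries]
        foreach (RedisResult entry in (RedisResult[])streamAndEntries[1])
        {
            var idAndFields = (RedisResult[])entry;       // [entry id, [field, value, ...]]
            lastId = (string)idAndFields[0];              // resume after what we've yielded
            yield return entry;
        }
    }
}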

Beyond that 1,000 ms, we run into pile-up problems with the heartbeats that keep the connection alive. The problem set splits into < 1,000 ms and > 1,000 ms, with the latter being "up to indefinite". Indefinite is a problem I don't have a solution for, because we know that many environments, including paid hosting, will drop the socket if no data is sent for too long a period. System socket TTLs are not something we can readily work around, to my knowledge. We could have the client send noise to the Redis server that piles up for execution behind the WAIT, but nothing is ensured to come back within any defined time period, as the timeout is indefinite and the pile-up itself is also unbounded.

We could back off if we know the pending/current command is a block (with a flag?) and only send a KeepAlive once a minute or so to lower the load, but again, that only solves socket death in one direction. For limited blocks that take a millisecond duration specifier, like XREAD, this isn't something we have to solve in order to unlock them. Even without connection pooling or unbounded numbers of bridges, users could create a dedicated multiplexer with this option to read a stream using the proposal above. There could be 10 other problems with this; I just wanted to get the discussion into a single issue!

cc @mgravell @philon-msft @davidfowl

@JesperTreetop

(Rando alert: Very sporadic StackExchange.Redis user who has been watching Redis Streams with interest.)

It seems to me that the safest and easiest-to-explain thing is to put all the troublesome operations that would ruin/block an entire connection in a completely different part of the API space. So, something like:

var db = /* IDatabase */;

// fork off an equivalent, new connection to the same place
var blocking = db.GetBlockingOperations(); 

await foreach (var xReadItems in blocking.XRead(...)) {
    // ...
}

If what you want is blocking XREAD and you're handling it in the intended way, you're already resigned to spinning up a dedicated connection for it. There's no reason the library shouldn't do that in the best way it knows how, and fence that stuff off. Potentially, it could also have some sort of collision detection - "hey, you started a new blocking XRead when one is already going and hasn't ended or been cancelled" - which would be extremely surprising in the regular API, but would guide you towards the pit of success for the blocking use case.

@mgravell
Collaborator

Not wanting to derail, but I'd also invite thoughts on a secondary use-case here.

To implement blocking operation support, we'd (as alluded to) need to implement a pool under the bonnet; this also enables two other performance-related improvements. At the moment I'm seeing a lot of evidence of head-of-line blocking due to the single reader, especially in non-cluster scenarios (since all the load is effectively forced through one reader). If we have multiple connections, we also gain the ability to shard traffic between multiple backends.

To avoid semantic changes (in particular for non-awaited operations), we could effectively have multiple calls to GetDatabase() return tokens that default to different underlying connections (but as a weak binding - if one becomes unavailable: meh). Or, if we accept the semantic change (re unawaited operations), we could allow different operations on the same IDatabase to go to different underlying connections, which gives much richer load sharding, but with a huge caveat along the lines of "if you don't await operations (or use a batch/transaction), order is not guaranteed".
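
Purely as an illustration of the "weak binding" flavour (not a proposal for the actual API), something of roughly that shape can already be approximated by round-robining GetDatabase() across several multiplexers; pool sizing, subscriber connections and failure handling are all hand-waved here:

using System.Threading;
using System.Threading.Tasks;
using StackExchange.Redis;

// Illustrative sketch: multiple GetDatabase() calls bound to different underlying connections.
public sealed class MultiplexerPool : IAsyncDisposable
{
    private readonly ConnectionMultiplexer[] _connections;
    private int _next = -1;

    private MultiplexerPool(ConnectionMultiplexer[] connections) => _connections = connections;

    public static async Task<MultiplexerPool> ConnectAsync(string configuration, int size)
    {
        var connections = new ConnectionMultiplexer[size];
        for (var i = 0; i < size; i++)
        {
            connections[i] = await ConnectionMultiplexer.ConnectAsync(configuration);
        }
        return new MultiplexerPool(connections);
    }

    // Weak binding: each call hands back a database that defaults to a different connection.
    public IDatabase GetDatabase(int db = -1)
    {
        var index = (uint)Interlocked.Increment(ref _next) % (uint)_connections.Length;
        return _connections[index].GetDatabase(db);
    }

    public async ValueTask DisposeAsync()
    {
        foreach (var connection in _connections)
        {
            await connection.CloseAsync();
        }
    }
}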

@wj8400684

Is it supported now?

@mgravell
Collaborator

mgravell commented Mar 27, 2023 via email

@josh-endries

Dang, just ran into this. Off to find another lib. :(
