
Allow closing connection pools and responses #4

Closed

Conversation

pquentin
Member

@pquentin commented Jan 2, 2018

No description provided.

    with self.lock:
        # Copy pointers to all values, then wipe the mapping
        values = list(itervalues(self._container))
        self._container.clear()

    if self.dispose_func:
        for value in values:
-           self.dispose_func(value)
+           await self.dispose_func(value)
Member

Unfortunately I think RecentlyUsedContainer will need a larger rewrite, because dispose_func also gets called in other places... and in particular from __setitem__ and __delitem__, which can't be marked async. Fortunately, it seems to be a private interface, so I guess we can give it a new interface and update the callers. (I'm guessing urllib3 doesn't actually use the full MutableMapping interface here...)
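
For context, here's a simplified sketch (not the exact urllib3 source) of the kind of thing RecentlyUsedContainer.__setitem__ has to do, which shows why it can't simply await an async dispose_func:

def __setitem__(self, key, value):
    evicted = None
    with self.lock:
        # Evict the least recently used entry if we're at capacity
        if key not in self._container and len(self._container) >= self._maxsize:
            _key, evicted = self._container.popitem(last=False)
        self._container[key] = value
    if evicted is not None and self.dispose_func:
        # Plain synchronous call -- there's nowhere to `await` in here
        self.dispose_func(evicted)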

Member

Well, the attribute PoolManager.pools I guess is nominally public (no underscore). I think our options are to either add an underscore, or else go back and make closing things synchronous after all. It doesn't seem like it's intended to be a public interface, so maybe adding an underscore makes sense, dunno. Since we're still in proof-of-concept mode we should probably pick whichever is easiest for now.

@njsmith
Member

njsmith commented Jan 2, 2018

Oh, here's another complication of asynchronous close: it can raise an exception. Trio's convention is that this means that the close still happened, and this is really the only workable convention (see), so we don't have to worry about this causing problems for that particular connection. But the caller still needs to do something sensible with that exception.

The tricky thing here is that PoolManager.aclose ends up doing a loop repeatedly calling ConnectionPool.aclose, and ConnectionPool.aclose ends up doing a loop repeatedly calling aclose on individual connections. If one of these raises an exception, like Cancelled, then we still need to close the rest of them. This requires some care and cleverness. There are a few places in trio where I use a little recursion like:

async def _aclose_all(resources):
    if not resources:
        return
    resource = resources.pop()
    try:
        await resource.aclose()
    finally:
        await _aclose_all(resources)

The nice thing about this is that it invokes Python's implicit exception chaining, so if multiple things raise exceptions you get some sensible result. The downside is that because it uses recursion, it's possible to run out of stack, so it's not really suitable when you're potentially dealing with a lot of items.

It is possible to do an iterative version of this, but it's really ugly.
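
Just to give the flavor, an iterative version might look something like this (a rough sketch, not tested; the manual __context__ bookkeeping stands in for the implicit chaining the recursive version gets for free):

async def _aclose_all_iter(resources):
    pending_exc = None
    for resource in resources:
        try:
            await resource.aclose()
        except BaseException as exc:
            # Chain the previous failure onto this one, mimicking the
            # implicit chaining in the recursive version, then keep the newest
            if pending_exc is not None and exc.__context__ is None:
                exc.__context__ = pending_exc
            pending_exc = exc
    if pending_exc is not None:
        raise pending_exc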

If we only cared about trio, the aclose methods could use cancel shielding to avoid getting Cancelled exceptions. We could add some with backend.cancel_shielded(): ... API, that on non-trio backends is a no-op. Or maybe it wouldn't be a no-op -- twisted, asyncio, etc. also have cancellation. On asyncio there's asyncio.shield (somewhat awkward because the shielded code has to be a separate function). I guess Twisted probably has something similar, though I don't know what.
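
On trio a shielded block is just a cancel scope with shield=True, so the trio-backend side of that hypothetical hook could be roughly (a sketch; cancel_shielded is the made-up name from above, not an existing API):

from contextlib import contextmanager

import trio

@contextmanager
def cancel_shielded():
    # Shield the enclosed block from cancellation arriving from outside
    with trio.CancelScope(shield=True):
        yield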

We could go back to synchronous close methods. Earlier I came down in favor of async close, mainly because in both Twisted and asyncio you can't actually close anything synchronously; the best you can do is arrange for it to be closed on the next loop iteration, and aesthetically I feel like aclose ought not to return until the socket is actually, y'know, closed. This is not the aesthetic that Twisted and asyncio follow, though, and in this specific case it's pretty harmless: even if we don't wait for the closure, it still happens deterministically and ASAP; it's not one of those cases where something might or might not happen at some indefinite future point. Hacking a synchronous close into the TrioBackend is also pretty easy; I think it's just:

def close(self):
    s = self._stream
    # Unwind any layers of SSLStream (in practice it's either 0 or 1, but why not future-proof things...)
    while hasattr(s, "transport_stream"):
        s = s.transport_stream
    # Now that we have a SocketStream, close the underlying socket
    s.socket.close()

(See TrioBackend.is_readable for an almost-identical hack.)

Now that we've run into the issues with closing multiple connections at once and the need to rewrite RecentlyUsedContainer, I'm currently leaning towards the synchronous approach being less hassle overall, so the opposite of what I said earlier. (Sorry! Though I guess this kind of learning is the point of doing this kind of proof-of-concept exploration...)

I'm going to bed now. Maybe things will look different in the morning :-)

@pquentin
Member Author

pquentin commented Jan 2, 2018

Sounds good to me! Based on your explanations, I believe a synchronous close makes more sense here.

Another small argument against the recursive _aclose_all with exception chaining is that exception chaining isn't supported natively in Python 2. And a synchronous close will look slightly less strange when backporting to Python 2.

njsmith referenced this pull request Jan 3, 2018
There are arguments either way, but the arguments for async close are
largely aesthetic, and the arguments against are that it creates some
tricky problems with exception handling, and extra churn in urllib3
interfaces. For more details see:
  https://github.com/njsmith/urllib3/pull/4#issuecomment-354729966
@njsmith closed this Jan 3, 2018