-
-
Notifications
You must be signed in to change notification settings - Fork 32k
Implement asyncio.timeout() context manager #90927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Now that TaskGroup is merged (see bpo-46752) we might consider adding some form of cancel scopes (another Trio idea). There's a sensible implementation we could use as a starting point in @asvetlov's async-timeout package (https://github.com/aio-libs/async-timeout). |
I've essentially forked The differences are:
Let me know if this sounds interesting. |
Oh, and Trio's |
Sure, we should create the best possible solution. We have no CI in the stdlib that checks type annotations, so those should probably be moved to a stub file in typeshed. (Ditto for asyncio taskgroups.py.) Using the new .cancelling()/.uncancel() API added to Task you might be able to avoid hacks using the cancel msg (check how it's used in the new TaskGroup). |
A brief explanation of cancel scopes for the uninitiated: A cancel scope can enclose just a part of a coroutine function, or an entire group of tasks. They can be nested within each other (by using them as context managers), and marked as shielded from cancellation, which means cancellation won't be propagated (i.e. raised in the coroutine function) from a cancelled outer scope until either the inner scope's shielding is disabled or the inner scope is exited or cancelled directly. The fundamental problem in implementing these on top of asyncio is that native task cancellation can throw a wrench in these gears. Since a cancel scope works by catching a cancellation error and then (potentially) allowing the coroutine to proceed, it would have to know, when catching a cancellation error, if the cancellation error was targeted at a cancel scope or the task itself. A workaround for this, made possible in Python 3.9, is to (ab)use cancellation messages to include the ID of the target cancel scope. This only solves half of the problem, however. If the task is already pending a cancellation targeted at a cancel scope, the task itself cannot be cancelled anymore since calling cancel() again on the task is a no-op. This would be solved by updating the cancel message on the second call. The docs don't say anything about the behavior on the second call, so it's not strictly speaking a change in documented behavior. Then, on the subject of level cancellation: level cancellation builds upon cancel scopes and changes cancellation behavior so that whenever a task yields while a cancelled cancel scope is in effect, it gets hit with a CancelledError every time, as opposed to just once in asyncio's "edge" style cancellation. Another very important difference is that with level cancellation, even a task that starts within a cancelled scope gets to run up until the first yield point. This gives it an opportunity to clean up any resources it was given ownership of (a connected socket in a socket server is a common, practical example of this). This is what the asyncio documentation states about Task.cancel(): "This arranges for a CancelledError exception to be thrown into the wrapped coroutine on the next cycle of the event loop. The coroutine then has a chance to clean up or even deny the request by suppressing the exception with a try … … except CancelledError … finally block. Therefore, unlike Future.cancel(), Task.cancel() does not guarantee that the Task will be cancelled, although suppressing cancellation completely is not common and is actively discouraged." This is, however, only true for a task that has started running. A Task that gets cancelled before even entering the coroutine is silently dropped. As asyncio does not allow for custom task instances without overriding the entire task factory, it leaves libraries like AnyIO some less desirable options for implementing level cancellation:
Having low level machinery for injecting a custom Task instance to the event loop would probably solve this problem. |
Alex, the goal here is not to replicate every Trio feature or behavior. For example I don't think that asyncio is likely to get level cancellation in 3.11, but we can certainly see if we can do something about some of the edge cases you mention, like the case of a task that is cancelled before it has started running, where you say that the task should be allowed to run until its first await. It would be nice to have a native asyncio example that demonstrates this, so we have a concrete goal. I am thinking it is something like this: async def send_from_open_file(f, s):
data = f.read()
f.close()
await s.send(data)
async def send_filename(filename, s):
f = open(filename)
t = asyncio.create_task(send_from_open_file(f, s))
t.cancel()
await asyncio.sleep(1) This is an interesting edge case and I can see why you'd rather see this run until the I don't even know if this would be easy to change if we decided it was a good change. Thoughts? (Possibly this discussion belongs in a new issue, since it's not directly related to adding cancel scopes.) |
I'm not trying to argue that asyncio should be changed to have level cancellation or even cancel scopes as built-in (at this point), but expanding the low level API to make implementing these features possible in third party libraries without the awkward hacks we have now. As for async-timeout, it suffers from the same problem as AnyIO and Quattro: that cancellations of the entire task can be inadvertently swallowed by the async context manager in edge cases. I hadn't even thought of the possibility of this happening until one of AnyIO's users reported just such a problem: agronholm/anyio#374 I just couldn't think of any way to correctly support such things without at least _some_ changes to the task cancellation behavior, and allowing .cancel() to update the cancel message seemed like the least invasive option. I'm all ears if someone has a better solution. |
OK.
|
Thanks, I will take a look at .uncancel() and .cancelling(). I saw that work happening in my email feed but couldn't figure out right away how it helped, but I will definitely look into the new TaskGroup code to see how it's used there and will get back to you after that. |
Oh, wait. The new Task.cancelling() API helps tell the difference between the *parent* task being cancelled "from outside" vs. the task group itself cancelling the parent task so as to break out of an await like the following: async with TaskGroup() as g:
g.create_task(...)
await <something> when the await is cancelled, __aexit__() is called with a CancelledError, and we need to tell whether it was cancelled from the outside or by the completion callback on one of the tasks managed by the task group. The EdgeDB TaskGroup monkey-patched the parent task's cancel() method, and the new asyncio.TaskGroup instead checks parent.cancelled(). However, AFAICT when *any* of the task managed by the TaskGroup exits with CancelledError, this is *ignored* (in both the EdgeDB version and in asyncio.TaskGroup). The assumption here seems to be that the only reason a managed task raises CancelledError is because it was cancelled by the TaskGroup. A fix for that would be to separately keep track (maybe in a separate weak dict, or some other data structure -- maybe a flag on the task itself?) of which tasks are successfully cancelled by the TaskGroup. We can then treat a CancelledError bubbling out of a managed task that we *didn't* cancel as any other exception, causing it to abort the task group (i.e., cancel all other tasks). Is that what you are looking for? (But I think this could be solved even in 3.9 without resorting to cancel messages.) |
The use of the cancel message is a hack, yeah. But it's what I had to work with. We could introduce a new, proper cancellation context to Tasks instead, which could be attached to the CancelError when it's raised. On the topic of multiple cancellations being applied to a task before it gets to run: could we just treat the cancellation context as a stack, and when the task gets to run, we throw them all in, one by one? Other options would involve somehow figuring out what the highest priority cancellation context is, or combining all the cancellation contexts into the CancelError somehow. On the topic of a task getting to run at least once before being cancelled: sure, I guess. I've personally never needed this but I can see how it'd be useful. Either have a flag on the Task instance or build that logic into the cancellation context handling? |
On the topic of TaskGroups needing to know whether to swallow a child CancelledError or no: just use the same nonce/cancellation context trick? I remember asking Nathaniel about why in Trio nurseries and cancel scopes were linked (nurseries contain a cancel scope there), whereas in Quattro they are completely separate, and not really understanding the answer. I think I'm getting it now. |
I hesitate to add yet another stack at this fundamental level. The Task.cancel() method returns a bool which indicates whether any state changed. When multiple cancellations happen concurrently, all but the first will return False, and anything that would like to cancel a task but finds that t.cancel() returns False can know that the task was either already cancelled or has already terminated. (To tell the difference, check t.done() first.) What would be the use case of wanting to cancel multiple times and having each cancellation be delivered separately? I know of one use case, where a task somehow decides to catch and *ignore* CancelledError (this should not be done lightly but it is supported -- like shielding in Trio). An impatient user or task manager might want to cancel such a thread a second time. This is what .uncancel() is for -- the thread must call .uncancel() to signal that it has truly ignored the cancellation (as opposed to being busy with cleanup that it deems uncancellable). But in this case the second cancellation (if it is to have any effect) should be delivered *after* .uncancel() is called. Your proposal of a cancel context or stack seems to be suggesting that there's a use case for mutliple *concurrent* cancellations. But I find it difficult to imagine such a use case, so I need your help. Even if we ignore the stack idea, could you provide some code showing how the cancel context would be used? I just learn better from code examples. |
Ok, first the happy path. We have a task running this: async def a():
with move_on_after(5):
await a_long_op()
await something_else()
So now 5 seconds pass, the callback cancels the task, and Sad path. Same scenario, except the event loop is kinda busy since we're running in production. Turns out this task was spawned by a web server, and there's a 5 second timeout (or the client disconnected, or something else). So now we have 2 callbacks that want to cancel this task: the one from The one from the web server is more important, since it's a higher level cancellation. But the callback from Line 206 in 6f1efd1
So now the task actually gets to run, |
I just tried to write a snippet to demonstrate the issue in TaskGroup, but that doesn't seem possible since TaskGroup never swallows a CancelledError. It always raises/propagates _some_ exception out of __aexit__() unless of course all the child tasks run to completion successfully. I was also surprised to notice that TaskGroup doesn't have a .cancel() method, so in cases where one would launch multiple tasks and cancel the rest when one succeeds, one would have to store all the child tasks separately and then iterate over them and cancel one by one. The Happy Eyeballs algorithm is one such use case (also used in AnyIO this way). |
@Tin The sad path is just a race, right? If the web server had disconnected just a tad later, __aexit__() would already have returned and await something_else() would already be running. So you can't make any promises if you write your code that way anyway. @alex For "happy eyeballs" you could also raise an exception or cancel the parent task once you've saved the winning result somewhere. Maybe you could show example code written using different paradigms so we can compare. |
@guido Imagine something_else() takes a long time and a lot of server resources (like a heavy query). If the web server disconnected a tad later and avoided the race condition, the task would have gotten cancelled very soon after the start of something_else() and stopped running it. But since the race did happen, the query avoids cancellation (potentially ~forever). |
Hm, I see. So the problem is that in the interval between move_on's calls to t.cancel() and t.uncancel(), if the web server calls t.cancel() that will just return False. So the web server would have to implement some other mechanism for cancelling operations. That's indeed unfortunate. Maybe we should just roll back that aspect of the TaskGroup PR -- in particular, remove these two lines: if self._cancel_requested:
return False from Task.cancel(). These lines don't matter for TaskGroup (it works without them), and they weren't there before yesterday, so the fallout would be very localized. @asvetlov What do you think? |
FWIW it looks like this part of taskgroups is vulnerable to a similar race: cpython/Lib/asyncio/taskgroups.py Lines 212 to 232 in 6f1efd1
Deleting the two lines I mentioned won't fix it here; a hack using the cancel message might be more appropriate. (I note that there is no documented way to retrieve the cancel message; you're supposed to access the protected |
Oh, it's passed to the CancelledError() constructor. But that's not documented either (I had to find the original issue that introduced this to figure it out -- bpo-31033). |
Does retrieving it from the CancelledError that is bubbling up suffice? Or do you need to be able to obtain it from the future object? |
I'm not sure yet (if anything I'd need it for a task, not a future). But it's also not documented that it gets passed to the exception (at least not in the Task docs -- I didn't check the Future docs). |
(By future, I also meant task, as task inherits from future.) For now, I think it would be safer to get the message from the CancelledError, if possible, since how it gets there truly is an implementation detail. It would be okay to document that the msg argument gets passed to the CancelledError via the constructor, as that was always the intent. See also bpo-45390 and the message I wrote there on how to make that API work better (given that the msg is only available from the leaf exception in the exception chain, and the current implementation creates intermediate exceptions, I believe unnecessarily): https://bugs.python.org/issue45390#msg403570 |
Instead of using the message for the nonce we could have a dedicated field for it. I have a proposal though. It's about putting a little logic into the cancellation nonce handling. Let's define the nonce as a float. If you just call Task.cancel(), the nonce defaults to 0.0. We add an argument to Task.cancel, so you can give it a nonce: task.cancel(nonce=nonce). The cancel() method has logic to keep track of the nonce with the smallest value. When it's time for the CancelledError to be thrown into the task, this nonce (default 0.0 if not set by anthing) is attached to the error. Then we change
How this fixes the sad path in my example: Both the web server and This also handles nested cancel scopes. I'm not sure how it works with the task catching the cancel to do a little cleanup. |
We still need docs for this. There were also some mutterings that perhaps it's not fully cooked. During the beta period I suppose it should be possible to still tweak the API and behavior if testing shows problems. But first and foremost, we need docs. (Writing docs gives you good karma. It can also help reveal weaknesses in the design.) (@agronholm @Tinche -- adding you back since the migration might have dropped you from the nosy list.) |
I'm totally down to write docs but after mid June, is that OK? |
Yeah, unless someone beats you to it, mid June will be give us plenty of time (the last beta is July 9, the final release Oct 3 -- see PEP 664). |
@gvanrossum yes sir. I have not written docs for CPython yet, are there any instructions on how do I get started or should I just figure it out? |
The devguide has a lot of information in this section: https://devguide.python.org/#contributing |
FWIW I assume that it should be added to the top of this section: https://docs.python.org/3.12/library/asyncio-task.html#timeouts |
Great, thanks. I got COVID so this is a little paused, but I'll get started ASAP |
Sorry to hear it. Get well soon! |
@pablogsal This is just lacking documentation. Does that qualify for it to be a (deferred) release blocker? The lack of documentation doesn't affect stability! |
Yes, there is no problem with this one. This can be left as a deferred blocker and will not affect any betas or RCs 👍 |
Though I may have missed critical timing, I'd like to suggest to add ability to customize Thanks to It would be nice if we have this in Python 3.12 if it's not possible to add this option in 3.11. |
Can’t happen in 3.11 due to the feature freeze at 3.11b1, and in fact we decided to deprecate the cancel message. There’s some discussion about it (maybe in this issue, maybe in a linked one). |
I totally understand now the 3.11 features are frozen. I just want to note that using cancel messages sometimes greatly helps me to debug and understand what's happening when writing asyncio-based libraries. Of course, it is also easy to abuse. 😉 |
You should probably start a new discussion if you really want to revert the deprecation of the message parameter. E.g. on discuss.python.org. |
I've created a thread in discuss.python.org! |
I've started on the docs. I'll be using the docs of the Where should I put these docs? I'm leaning towards a new section in We might want to spin out docs on cancellation to a separate page in the future, since I feel like very few people understand the implications of it, and it's (in my opinion) the second most important asyncio superpower. |
I agree it belongs in Eventually we may have an "explanation" about cancellation (using the Diataxis term), but that seems a separate doc change. |
How about this for a start: Tinche@63221c9 I have a very small section on task cancellation, and the timeout context managers below, but before I feel like a bunch of sections (including the new TaskGroup section) keep mentioning cancellation, so we should at least introduce it fairly quickly. |
And now that the docs have landed this issue can be laid to rest. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: