
uasyncio timeout functionality (wait_for) fiasco #215

Closed
pfalcon opened this issue Oct 8, 2017 · 23 comments
Labels
rfc Request for Comment

Comments


pfalcon commented Oct 8, 2017

Despite ideas about adding features to uasyncio that aren't present in upstream asyncio (e.g. micropython/micropython#2989), it has more mundane problems, like missing support for features which upstream does offer. One such feature is the ability to cancel coroutine execution on a timeout (asyncio.wait_for(), https://docs.python.org/3/library/asyncio-task.html#asyncio.wait_for). For uasyncio's original use case, writing web apps, it's not really needed, but it is of course required for generic applications, or even for UDP networking (e.g. implementing a DNS resolver).

Before continuing, I'd like to remind everyone that uasyncio's goal has always been to implement an async scheduler that is both runtime- and memory-efficient. One of the means to achieve memory efficiency was basing uasyncio solely on native coroutines, avoiding the intermediate objects which upstream has, like Future or Task.

So, let's consider how wait_for() can be implemented. First of all, we somehow need to track timeout expiration. There's little choice but to use the standard task queue for that, and actually, that's just the right choice - the task queue is intended to execute tasks at a specified time, and we don't want to invent/use another mechanism to track times specifically for timeouts. So, we'd need to schedule some task to occur at the timeout's time. The simplest such task would be a callback which cancels the target coro. And we would need to wrap the original coro, and cancel the timeout callback if the coro finishes earlier. So, depending on what happens first - the timeout or coro completion - it would cancel the other thing.
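
To make the shape concrete, here is a structural sketch in Python; call_later() returning a cancellable handle, handle.cancel() and cancel(coro) are hypothetical primitives assumed for illustration only (uasyncio provides none of them, which is exactly the problem discussed next):

    # Structural sketch only, not working uasyncio code: call_later() returning
    # a handle, handle.cancel() and cancel(coro) are hypothetical primitives.

    def wait_for(coro, timeout):
        def on_timeout():
            cancel(coro)                     # hypothetical: cancel the wrapped coro

        handle = call_later(timeout, on_timeout)   # hypothetical cancellable handle
        try:
            result = yield from coro         # run the wrapped coro to completion
        finally:
            # Whichever finished first wins: if the coro completed (or was
            # cancelled by the timeout), drop the pending timeout callback.
            handle.cancel()
        return result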

So far so good, and while this adds a bunch of overhead, it's apparently the most low-profile way to implement timeout support without ad hoc features. But there's a problem already - the processing above talks about cancelling tasks, but uasyncio doesn't actually support that. Upstream asyncio returns a handle from the functions which schedule a task for execution, but uasyncio doesn't. Suppose it did; then the operation of removing a task from the queue by handle would still be inefficient, requiring a scan through the queue, O(n).

But the problems only start there. What does it really mean to cancel a coroutine? Per the wait_for() description, it raises a TimeoutError exception on timeout, and a natural way to achieve that would be to inject TimeoutError into the coroutine, give it a chance to handle it, and then let it propagate upwards to wait_for() and its caller. There's a .throw() method on coroutines which does exactly that - injects an exception into a coro - but it doesn't work as required here. From the above, this injection would happen in a timeout callback. And .throw() works by injecting an exception and immediately resuming the coro. If the timeout callback calls .throw(), the TimeoutError exception bubbles straight back up into the callback, and the whole application terminates, because it's not handled there.
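
A tiny plain-generator demo (illustrative only, not uasyncio code) of what goes wrong: the exception raised by .throw() lands right back in whoever called it.

    # Plain Python generators: calling .throw() from a "timeout callback"
    # resumes the coro synchronously, and an unhandled exception propagates
    # right back into the callback/main loop.

    def target():
        while True:
            yield                   # pretend to be waiting for I/O or a sleep

    c = target()
    next(c)                         # start it; now suspended at the yield
    try:
        c.throw(TimeoutError())     # the coro doesn't handle it, so...
    except TimeoutError:
        print("TimeoutError bubbled straight up into the 'timeout callback'")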

What's needed is not calling .throw() on the coroutine right away, but recording the fact that the coroutine should receive TimeoutError and calling .throw() in the future, in the main loop context. And that "future" really should be "soon" (as the timeout has already expired), so the coro needs to be rescheduled to the top of the queue.

That "future" work should give a hint - the object which has the needed behavior is exactly called Future (and upstream wraps coros in Task, which is subclass of Future).

But uasyncio isn't going to acquire bloaty Future/Task wrappers. So the question is how to emulate that behavior with pure coroutines. One possible way would be to store the "overriding exception" in the task queue, and .throw() it into the coro (instead of .send()ing a normal value) when the main loop is about to execute it. That means adding a new field to each task queue entry, unused in the majority of cases. Another approach would be to mark the coro itself as "throw something on next run", i.e. move the Future functionality into it.
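
A minimal self-contained sketch of the first variant - an extra exception field on each queue entry. This is illustrative toy code, not uasyncio's actual queue:

    # Illustrative sketch: each run-queue entry carries an optional
    # "overriding exception"; the main loop throws it instead of sending
    # the scheduled value.

    import collections

    run_q = collections.deque()           # entries: (coro, value, exc)

    def schedule(coro, value=None, exc=None):
        run_q.append((coro, value, exc))

    def run_once():
        coro, value, exc = run_q.popleft()
        try:
            if exc is not None:
                coro.throw(exc)           # deliver the pending exception
            else:
                coro.send(value)          # normal resumption
        except StopIteration:
            pass                          # coroutine finished

    def worker():
        try:
            while True:
                yield                     # pretend to sleep/wait
        except TimeoutError:
            print("worker cancelled by timeout")

    w = worker()
    schedule(w)                           # first resumption starts the generator
    run_once()
    schedule(w, exc=TimeoutError())       # what a timeout callback would record
    run_once()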

And all this only covers cancelling a CPU-bound coroutine; it doesn't cover cancelling I/O-bound coroutines, which aren't even in the task queue, but in the I/O poll queue instead.

@pfalcon pfalcon added the rfc Request for Comment label Oct 8, 2017

pfalcon commented Oct 8, 2017

@dpgeorge, @peterhinch: similarly to #214, this is something I wanted to write down. Comments are welcome.

@peterhinch

From a user perspective the one major problem I've encountered using uasyncio is the lack of a mechanism whereby one CPU-bound coroutine can stop another. Fixing this would be a big improvement.

I've not yet encountered a need to stop an I/O-bound coroutine.

As for implementation I don't entirely follow your ideas. The problem seems to be one of avoiding the O(n) search required to locate a coroutine in the queue so that it can be removed from it. You discuss moving the coroutine to the top of the queue - wouldn't locating the coroutine to be moved involve a search?

An option might be to maintain a list of coroutines to be stopped. When a coroutine becomes due it is checked for presence on the list. If present it is discarded and the list amended. The list could be a fixed length, avoiding allocation.

If you have a mechanism for throwing an exception to a coroutine when the main loop is about to execute it, why not simply decline to execute it? Do you envisage a user benefit in throwing an exception?


pfalcon commented Oct 21, 2017

I've not yet encountered a need to stop an I/O-bound coroutine.

Well, it will be needed e.g. for UDP use cases.

As for implementation I don't entirely follow your ideas. The problem seems to be one of avoiding the O(n) search required to locate a coroutine in the queue so that it can be removed from it.

That's one part of the problem. Another is just finding a way to stop a coroutine in the way prescribed by wait_for().

An option might be to maintain a list of coroutines to be stopped. When a coroutine becomes due it is checked for presence on the list. If present it is discarded and the list amended.

So, that doesn't work. That's because if coro foo executes await bar, then from the Python side it's still foo (and only foo) that is running, so any checks for bar will fail. Delegation to bar happens on a lower level, thus the check for a pending exception should happen there too. That's how micropython/micropython#3380 was written.
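
A tiny plain-generator illustration of that delegation point (illustrative only): the scheduler only ever holds foo, yet an exception thrown into foo reaches bar via the yield from machinery.

    # While foo delegates to bar, the outer object is all the scheduler sees,
    # and .throw() on it is delivered to the innermost suspended frame.

    def bar():
        try:
            yield
        except TimeoutError:
            print("the exception was delivered inside bar, not foo")
            raise

    def foo():
        yield from bar()            # 'await bar()' at the native-coroutine level

    f = foo()
    next(f)                         # now suspended inside bar
    try:
        f.throw(TimeoutError())     # we only ever reference foo here
    except TimeoutError:
        pass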

If you have a mechanism for throwing an exception to a coroutine when the main loop is about to execute it, why not simply decline to execute it? Do you envisage a user benefit in throwing an exception?

  1. That's how asyncio and its wait_for() work. 2. Of course, exceptions are there to allow the user to handle them, e.g. to do cleanup. That may also be a way to implement cancellation of I/O-bound coroutines on top of generic cancellation capabilities.


pfalcon commented Oct 22, 2017

An uncleaned implementation of wait_for() using .pend_throw() has been posted as #221.

@peterhinch

Points taken.
If one coroutine were able to throw an exception to another, implementing timeouts would become a simple special case. (A timeout coroutine cancels the target coroutine, with the target cancelling the timeout if it completes first.)
A 'micro' solution to that general problem without the Future bloat would be very useful.


pfalcon commented Oct 24, 2017

Moar fiascos, moar dirty magic. (This will be a complete thought stream, written down so at least I can remember it tomorrow or the day after.)

So, the cute pend_throw patch actually isn't needed. The key is to track the top of the current coroutine chain, aka the currently executed coroutine as seen by the Python side. Whatever is thrown into it will reach the bottom, per the generator delegation (yield from) property. So, a timeout func would just need to .call_soon(current_coro, TimeoutError()), and the main loop would just need to .throw() that (instead of .send()ing it).

Right? Not so. Exceptions delivered that way will be asynchronous, and have a great chance of throwing off scheduling. Because, well, all previous schedulings are still there, and we just add another, asynchronous one. So the thing that waited for the next scheduled time will be cancelled, but the scheduling itself isn't, so it will still fire - and be delivered to something which doesn't expect it.

So, .call_soon() doesn't work - we really need to deliver the exception on the next scheduled call to the coro, i.e. feed an exception instead of whatever value was scheduled. That's doable in Python, but only inefficiently, so the cute pend_throw() patch is still needed, because it does exactly that, efficiently.

But now there's the problem of cancelling I/O-blocked coros. Such coros aren't in the scheduling queue at all. They're in the I/O queue, waiting for I/O to complete and schedule them onto the queue again. So, pend_throw on such a coro will never be noticed; for an I/O-blocked coro we actually need to schedule it explicitly on timeout (emulating an I/O completion, just that this completion is caused by the timeout - and leaving it to the coro to clean up the still-pending I/O queue registration afterwards). But we can't schedule CPU-blocked coros in such a way, as explained above (because they're already scheduled there). So, we need to separate I/O- vs CPU-scheduled coros - as usual, efficiently, without any expensive maps.

And here comes an idea for great abuse of .pend_throw() - let it return the previously pended value. Then set an I/O-scheduled coro to some implausible value, e.g. .pend_throw(Ellipsis). The timeout func will then be able to test for it and queue-schedule the coro if needed.
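
In sketch form (assuming the proposed MicroPython-only .pend_throw() with the return-previous-value semantics described above; this is illustrative pseudocode, not runnable on CPython and not uasyncio's actual code):

    # Sketch only: relies on the proposed .pend_throw() returning the previous
    # pended value; the names and structure here are illustrative assumptions.

    _IO_MARKER = Ellipsis      # implausible "pended value" tagging I/O-blocked coros

    def park_for_io(coro):
        # Before putting the coro into the I/O poll queue, tag it.
        coro.pend_throw(_IO_MARKER)
        # ... register the coro with the poller here ...

    def timeout_callback(loop, coro):
        prev = coro.pend_throw(TimeoutError())
        if prev is _IO_MARKER:
            # The coro waits for I/O, so nothing in the task queue will resume
            # it - schedule it explicitly so the exception gets delivered.
            loop.call_soon(coro)
        # Otherwise the coro is already in the task queue, and its next
        # scheduled resumption will .throw() the pended TimeoutError.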

The proof of concept for that seems to work, though who knows what's still lurking in darkness.


jpochyla commented Oct 26, 2017

Isn't it a problem if cancellation tasks are scheduled into the future and not actually removed from the queue upon completion of the cancellable? I've hit this problem earlier (when running perf tests actually).

Edit: what I'm trying to say is, I think some efficient way to remove items from utimeq will be needed in the end.


pfalcon commented Oct 26, 2017

@jpochyla : Which problem have you hit exactly?

To remove an arbitrary task, it first needs to be found in the queue. The heap data structure doesn't support an efficient way to do that, so that operation is O(n). The whole idea is to avoid that.

Otherwise, the reasoning is that cancellation by timeout (and cancellation in general) is an exceptional operation. Under normal circumstances, a task waits until its scheduled time and continues to run. A cancelled task will wait until its scheduled time just to be removed. That won't work too well if someone makes a point of cancelling a task before its completion and spawning 10 new copies of it instead, repeatedly. Besides not doing that, there's a simple solution: adjust the queue size accordingly - no operation on it is O(n), so it only costs you memory.

Note that maintaining a single scheduling queue is another point. For example, I personally feel sad that running a task with a timeout requires allocating twice as many objects: besides the task itself, also a timeout object. That prompts the thought "what if we optimize timeout handling?". But why, if we already have an efficient time-management queue? Having 2 queues is less efficient, because the 2nd queue must have the same size as the main one for the extreme case of every task being scheduled with a timeout, and will sit empty if timeouts aren't used, wasting memory.

But you're more than welcome to propose an alternative solution. (Then implement it, then assess real-world performance differences ;-)).


jpochyla commented Oct 26, 2017

@jpochyla : Which problem have you hit exactly?

It's not a problem with uasyncio per se, but with a similar architecture. Basically, creating a lot of short-lived tasks with timeouts.

To remove an arbitrary task, it first needs to be found in the queue. The heap data structure doesn't support efficient way to do that, so that operation is O(n). The whole idea is to avoid that.

Unfortunately, yes. I'm not trying to dispute that. But right now you also need to copy the queue, that's what I mean by inefficiency.

Otherwise, the reasoning is that cancellation by timeout (and cancellation in general) is an exceptional operation.

Implying that if tasks usually end before the timeout, scheduled cancellation tasks will stay in the queue, no?

Besides not doing that, there's a simple solution: adjust the queue size accordingly - no operations on it is O(n), so it costs you only a memory.

I think a good compromise is to unset task pointer in the queue entry, so the task can get at least garbage collected. Then, when you hit a queue limit on scheduling, first cleanup the queue from these dead entries, and see if any space is reclaimed.
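
A small self-contained sketch of that compromise, using a plain heapq list as a stand-in for utimeq; handing the queue entry back to the caller so that cancel() needs no lookup is an extra assumption of this sketch, not part of the proposal:

    # Sketch: cancelling only blanks the task slot (so the task object becomes
    # garbage-collectable right away); dead entries are compacted lazily when
    # the queue hits its size limit.

    import heapq
    import itertools

    QUEUE_LIMIT = 8
    _seq = itertools.count()
    queue = []                      # entries are mutable lists: [deadline, seq, task]

    def schedule(deadline, task):
        if len(queue) >= QUEUE_LIMIT:
            compact()               # see if any space can be reclaimed first
        entry = [deadline, next(_seq), task]
        heapq.heappush(queue, entry)
        return entry                # caller keeps this handle for cancel()

    def cancel(entry):
        entry[2] = None             # the task becomes collectable immediately

    def compact():
        # O(n), but only runs when the queue is full.
        queue[:] = [e for e in queue if e[2] is not None]
        heapq.heapify(queue)

    def pop_next():
        while queue:
            deadline, _, task = heapq.heappop(queue)
            if task is not None:
                return deadline, task
        return None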

But you're more than welcome to propose an alternative solution. (Then implement it, then assess real-world performance differences ;-)).

I don't have the above solution implemented yet; right now I just remove the entries. Depends on what you think about it :)


pfalcon commented Oct 26, 2017

But right now you also need to copy the queue, that's what I mean by inefficiency.

What do you mean by that?

@jpochyla

What do you mean by that?

This abomination:
https://github.com/trezor/trezor-core/blob/master/src/trezor/loop.py#L40-L52


pfalcon commented Oct 26, 2017

Implying that if tasks usually end before the timeout, scheduled cancellation tasks will stay in the queue, no?

Yes, that's why I wrote:

I personally do feel sad that running a task with timeout requires allocating 2 times more objects: besides the task itself, also a timeout object.

I think a good compromise is to unset task pointer in the queue entry, so the task can get at least garbage collected. Then, when you hit a queue limit on scheduling, first cleanup the queue from these dead entries, and see if any space is reclaimed.

So far I haven't had enough experience with queue overflow to have an opinion on that. uasyncio may end up acquiring the capability to actually remove task entries. So far, I just brag about how cunningly I keep avoiding that ;-).


pfalcon commented Oct 26, 2017

@jpochyla

I think a good compromise is to unset task pointer in the queue entry, so the task can get at least garbage collected. Then, when you hit a queue limit on scheduling, first cleanup the queue from these dead entries, and see if any space is reclaimed.

Disregard that, sorry :) I've just realized that looking up the entry would also be O(n), so it's probably better to have a direct remove() op.

@peterhinch

Viewing this from the point of view of a user, the ability to cancel a task (or, better, throw a pended exception to it) is arguably worth the expense of an O(n) search. Consider the alternative, which I recently encountered.

An exceptional condition occurs in an application, requiring it to revert to a known state, which involves cancelling several continuously running coroutines. So the main coroutine sets a quit flag. Each dependent coroutine polls the flag, quits if it is set, and signals to the main coroutine that it has in fact terminated. The main coroutine waits until all dependent coroutines have stopped: the application can then resume. To maintain any level of responsiveness, if a dependent coroutine needs to pause, it must do so by repeatedly awaiting a "short" duration while polling the flag.
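
For concreteness, a minimal rendering of that workaround, written against CPython's asyncio purely so it runs anywhere; the same shape applies under uasyncio:

    # Illustrative only: the quit-flag workaround described above.
    import asyncio

    quit_flag = False
    running = 0

    async def dependent():
        global running
        running += 1
        try:
            while not quit_flag:
                # Pauses must be chopped into short sleeps so the flag is
                # polled often enough - this is the ugly part.
                await asyncio.sleep(0.05)
        finally:
            running -= 1            # signal "I have actually terminated"

    async def main():
        global quit_flag
        for _ in range(3):
            asyncio.ensure_future(dependent())
        await asyncio.sleep(0.2)    # run normally for a while
        quit_flag = True            # exceptional condition: request shutdown
        while running:              # wait until every dependent has stopped
            await asyncio.sleep(0.05)
        print("all dependents stopped; application can resume")

    asyncio.run(main())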

This really is repulsive and makes an O(n) search look fast, slick and efficient.

To put this into context, many firmware applications have fewer than a dozen coroutines. I think few users would object if throwing an exception to a coroutine came with a government health warning about performance.


pfalcon commented Oct 29, 2017

Viewing this from the point of view of a user, the ability to cancel a task (or, better, throw a pended exception to it) is arguably worth the expense of an O(n) search.

As explained above, the ability to cancel a task doesn't depend on an O(n) search.

Otherwise, as a note, @peterhinch: in one place you complain about latency, up to calling for special ad hoc methods to address it, while in another you say that O(n) algorithms are acceptable. That's a polarized, contradictory position, because O(n) algos employed for primitive scheduling operations are a source of (random) latency. I find my approach to be more centered and consistent: a) there's no need to introduce special measures for "high priority" things - everything should already work well enough (in general, for ad hoc needs, ad hoc solutions are fine); b) there's no need to introduce anything which adds latency - not before the benefits provably outweigh the drawbacks.

(As a disclaimer, there are of course O(n) algos currently involved in uasyncio scheduling, but they're localized, and of course scheduled to be optimized out - with higher priority than adding more O(n) algos.)

@peterhinch

@pfalcon

uasyncio may end up acquiring capability to actually remove task entries. So far, I just brag how I cunningly keep avoiding doing that ;-).

I took this to mean that you were only implementing timeouts rather than a general capability for one coroutine to throw a pended exception to another. If you do intend to implement this, my above comments are irrelevant.

I agree that performance is vital and O(n) searches are to be avoided where possible. My point was that in the current uasyncio implementation, hacking a workaround at the application level is grossly inefficient. This is the one significant issue I've encountered with uasyncio's functionality in quite extensive use.


pfalcon commented Oct 29, 2017

Sure, with pend_throw() it's possible to implement general-purpose cancellation too. In CPython it's the Task.cancel() method, but as there's no Task here, it apparently should be an event loop method.

My point was that in the current uasyncio implementation, hacking a workaround at the application level is grossly inefficient.

Sure, that's why there's always an invitation to work towards a proper solution.


pfalcon commented Dec 2, 2017

Explaining to myself again why stuff based on:

+                    elif isinstance(args[0], Exception):
+                        print("Throwing %r into %r" % (args, cb))
+                        cb.throw(*args)

isn't going to work.

Suppose we have a coroutine sleeping, i.e. scheduled to run at some future time. Coros are scheduled via the top-level coro, aka the task. So now we call_soon(task, TimeoutError()) into it. Ok, the task gets woken up and forwards the exception into the target coro. Then perhaps the task doesn't die, but moves on, perhaps into another coro. But the scheduling we had for the original coro is still there! It will fire and wake up the task, which will deliver it into a completely different coro.

To deal with that, we'd need to remove the original scheduling from the queue. But we can't do that. So, again, we need to deliver the exception instead of the original scheduling, not in addition to it. And that's exactly what .pend_throw(), to be renamed to .cancel(), does.


pfalcon commented Dec 2, 2017

So, that could be worked around by storing a map of coros which should be .throw()n into instead of being next()'d, but that's not efficient, so it won't do.

Also, it's interesting whether a deque + deadline-queue design would alleviate that, but I don't see how it would; a cancel map would still be needed.


pfalcon commented Dec 2, 2017

.pend_throw(), to be renamed to .cancel()

Or maybe it won't be. Projects like https://github.com/python-trio/trio shape up the idea that following asyncio semantics in a sane async lib hardly makes sense, or is hardly even possible. But then following its API too closely is also questionable. While "consistency" and "familiarity" are definitely good things it can bring, it can also bring confusion, and extra legwork to follow that API closely. In this case, the choice is:

  1. Name the new generator method .cancel(), because that will allow using it as asyncio's Task.cancel() method - if a new, (u)asyncio-specific exception, CancelledError, is also added to the uPy core.
  2. Or name it for what it actually does, pend_throw(), and let people either use it directly (which is the recommended way for uPy) or via a wrapper. Neither is compatible with asyncio, but more and more libs appear which are proud of that.


pfalcon commented Dec 2, 2017

Name the new generator method .cancel(), because that will allow using it as asyncio's Task.cancel() method

Well, and yeah, that won't work with I/O-pending coroutines, so a wrapper will be needed anyway.


pfalcon commented Dec 15, 2017

wait_for() fully implemented in pfalcon/pycopy-lib@203cc48 and pfalcon/pycopy-lib@f6555ba
