Skip to content

Infinite loop in the async state machine when the zones async error handler throws an error #45616

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
natebosch opened this issue Apr 7, 2021 · 6 comments
Assignees
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. customer-google3 dart2js-ddc-discrepancy When dev and production compilations have different semantics library-async P1 A high priority bug; for example, a single project is unusable or has many test failures type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@natebosch
Copy link
Member

An async method gets compiled to a state machine. In Dart2js this uses an exception which is thrown and caught here for some state transitions.

In the case where the Future of an async is going to complete as an error, this falls through to _propagateToListeners as it tries to complete the future. There we try to handle the error in that future's error handling zone.

source._zone
.handleUncaughtError(asyncError.error, asyncError.stackTrace);

If this throws, the exception ends up in the catch block for _wrapJsFunctionForAsync which was only supposed to catch the exceptions thrown for the sake of moving through states.

This ends up in an infinite loop in dart2js. The state variables aren't updated before the exception is thrown like they would be for the intended exceptions - and we end up back in the same state trying to complete the _Future as an error again. This time it's already complete and we (depending on compile options) either will hit an assert about completing an already complete future, or a type error when listenerOrValue isn't a listener, or another more vague error stemming from trying to use an error value as if it were a future listener. In any case the error reaches the problematic catch again to keep the infinite loop going.

This can be demonstrated with the following. In both DDC and the VM this will not loop forever. With dart2js this will loop forever.

import 'dart:async';

void main() {
  runZoned(() {
    somethingAsync();
  }, zoneSpecification: ZoneSpecification(handleUncaughtError: handleError));
}

void handleError(
    Zone _, ZoneDelegate __, Zone ___, Object error, StackTrace stackTrace) {
  print('I see the error: $error');
  throw 'extra error';
}

void somethingAsync() async {
  await Future.delayed(const Duration(seconds: 1));
  throw 'sad';
}

cc @lrhn - Do you think there anything we should be looking at in our implementation of the zone error handling, or futures, to prevent this?

@natebosch natebosch added web-dart2js type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) area-web-js Issues related to JavaScript support for Dart Web, including DDC, dart2js, and JS interop. dart2js-ddc-discrepancy When dev and production compilations have different semantics labels Apr 7, 2021
@sigmundch sigmundch added the P1 A high priority bug; for example, a single project is unusable or has many test failures label Apr 7, 2021
@lrhn
Copy link
Member

lrhn commented Apr 8, 2021

As far as I can see, we are assuming zone functions are zone agnostic (they should use their selfZone, currentZone or parentDelegate arguments to do zone operations, not Zone.current), and that they do not throw.

Because of that, we do not change the zone they run in, we just run them in whichever zone is current at the time they are triggered, and there is no try/catch wrapper. So, if an uncaught error handler throws, it becomes an uncaught error in the same zone.

There are several options, all potentially breaking if someone depends on the current underspecified behavior:

  • Document that zone functions must not throw, and that they should only use zone functionality through self, parent and zone.
  • Run all zone functions in the parent zone of the self-zone (where the function was introduced). That is, switch the current zone to be the parent zone while running the function. This makes sense because the parent zone was the current zone at the time the new zone was created. Big impact, more consistent behavior, some overhead. The current zone can be any zone since you can call zone.handleUncaughtError from anywhere, so this would make it much more predictable which zone a registered zone function runs in.
  • Use try/catch for all zone function invocations and propagate errors to the parent zone's uncaught error handler.
  • Add a try/catch to the handleUncaughtError zone function invocation, and propagate uncaught errors of that to the parent zone. Uncaught errors in any other zone function goes to the current zone's uncaught error handler. This would be a "double-fault handler". When it reaches the root zone, that handler won't throw. This has minimal impact, and would only affect the behavior of a throwing uncaught error handler, which should be rare. Only introduces an overhead on error handling, not on all the other zone operations. Doesn't help with other handlers not being run in any consistent zone.

We can even do the usual trick of checking whether the thrown exception is the same as the incoming exception, and if so consider it a rethrow. This would be the only way for an async error to leave its error zone, by rethrowing in the error zone's error handler (because that handler should be running in the parent zone).

I think we should consider documentation and the double-fault handler, and not touch the rest unless necessary.

@natebosch
Copy link
Member Author

If the error handler causes an async error it causes an infinite cycle on all platforms. In this case there is at least user visible behavior to make it more clear what the bug is, instead of being in a cycle that doesn't involve user code.

I think the second option above to run all zone functions in the parent zone would also resolve this case and could be a bit more user friendly. To repro change the throw in the handler to Future.error.

import 'dart:async';

void main() {
  runZoned(() {
    somethingAsync();
  }, zoneSpecification: ZoneSpecification(handleUncaughtError: handleError));
}

void handleError(
    Zone _, ZoneDelegate __, Zone ___, Object error, StackTrace stackTrace) {
  print('I see the error: $error');
  Future.error('extra error');
}

void somethingAsync() async {
  await Future.delayed(const Duration(seconds: 1));
  throw 'sad';
}

@lrhn
Copy link
Member

lrhn commented Apr 13, 2021

Running all Zone functions in the parent zone is a better design, but it's also a bigger change (almost certainly breaking some code, people tend to make assumptions about which zone their code runs in far too often), and it introduces a bigger overhead. Changing zone before calling the function, then setting it back, would be something like:

T result;
var currentZone = Zone._current;
Zone._current = parentDelegate._zone;
try {
  result = zoneFunction.function(zoneFunction.zone, parentDelegate, currentZone, arg1, arg2);
} finally {
  Zone._current = currentZone;
}
return result;

instead of just

  return zoneFunction.function(zoneFunction.zone, parentDelegate, Zone._current, arg1, arg2);

This overhead would happen on each intercepted zone function. We'd probably add more tests to avoid the overhead for the root zone (but likely we already do that anyway):

if (identical(zoneFunction.zone, Zone._root)) {
  return rootZoneFunction(arg1, arg2);
}
T result;
var currentZone = Zone._current;
Zone._current = parentDelegate._zone;
try {
  result = zoneFunction.function(zoneFunction.zone, parentDelegate, currentZone, arg1, arg2);
} finally {
  Zone._current = currentZone;
}
return result;

Changing only the uncaught error handler reduces the breaking change risk, and still avoids the infinite loop risk.
It's also practically useful as a way to explicitly let an error leave an error zone, although you could always call parentDelegate.handleUncaughtError directly. (I really, really want to remove the concept of error zones!)

@sigmundch
Copy link
Member

Thanks for all the details Lasse and Nate!

As for next steps. It sounds like you are leaning towards option (4): to only fix the calls to handleUncaughtError. I believe this means to change the dart:async implementation directly ( in future_impl.dart within _propagateToListeners), right?

I will update this bug's label and assignment assuming that. However, if there there are changes in the dart2js lowering strategy that we need to revise to properly handle this, let me know so we can jump back on this.

@sigmundch sigmundch added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-async and removed area-web-js Issues related to JavaScript support for Dart Web, including DDC, dart2js, and JS interop. web-dart2js labels Apr 14, 2021
@sigmundch sigmundch assigned lrhn and natebosch and unassigned rakudrama Apr 14, 2021
@lrhn
Copy link
Member

lrhn commented Apr 15, 2021

I'll just fix it in the zone itself instead of finding all the call points.
https://dart-review.googlesource.com/c/sdk/+/194402

dart-bot pushed a commit that referenced this issue Apr 21, 2021
…r parent zone.

Avoids infinite recursion when the uncaught error is handled by the same,
potentially still failing, uncaught error handler.

Bug: #45616, #45617
Change-Id: I60ee0f1220b7345f4a41e1f1b323b8da47ed326e
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/194402
Reviewed-by: Nate Bosch <[email protected]>
Reviewed-by: Sigmund Cherem <[email protected]>
Auto-Submit: Lasse R.H. Nielsen <[email protected]>
Commit-Queue: Lasse R.H. Nielsen <[email protected]>
@sigmundch
Copy link
Member

Thanks again Lasse and Nate, I'll mark this issue as closed that that the change landed and relanded in 5776d57 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. customer-google3 dart2js-ddc-discrepancy When dev and production compilations have different semantics library-async P1 A high priority bug; for example, a single project is unusable or has many test failures type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)
Projects
None yet
Development

No branches or pull requests

4 participants