Interruption checks are very expensive, especially with RStudio #940
Comments
@krlmlr Is this something we should talk to RStudio about, in addition to trying to mitigate it on our side? Possible mitigations:
Overall, we are limited by R's poor design: it provides no fast way to check for interruption without risking a longjmp. Could we take this up with R core? |
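To make the long-jump concern concrete, here is a minimal sketch in plain C. It does not call R or igraph; `fake_check_interrupt`, `compute`, `run_and_count_leaks` and `live_buffers` are hypothetical names, and `longjmp` stands in for whatever non-local exit the host performs when an interrupt is pending.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the host environment; not part of R or igraph. */
static jmp_buf host_top_level;     /* where the host resumes after an interrupt */
static int live_buffers = 0;       /* buffers allocated but not yet freed */
static int interrupt_pending = 0;

/* Mimics a host-provided check that does not return if an interrupt is pending. */
static void fake_check_interrupt(void) {
    if (interrupt_pending) {
        longjmp(host_top_level, 1);
    }
}

/* A computation that owns a heap buffer while it polls for interrupts. */
static void compute(void) {
    double *work = malloc(1024 * sizeof *work);
    if (work == NULL) return;
    live_buffers++;

    fake_check_interrupt();        /* may jump straight past the cleanup below */

    free(work);
    live_buffers--;
}

/* Runs compute() under the fake host and returns the number of leaked buffers. */
int run_and_count_leaks(int interrupt) {
    live_buffers = 0;
    interrupt_pending = interrupt;
    if (setjmp(host_top_level) == 0) {
        compute();
    }
    return live_buffers;
}
```

Calling `run_and_count_leaks(0)` lets the cleanup run, while `run_and_count_leaks(1)` skips it and leaves one buffer unfreed. This is why C code generally needs a check that reports the interrupt as a return value, so that it can unwind on its own terms.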
I wonder if this can be solved in the C core instead, by calling the interrupt handler less frequently for |
Raising this upstream feels like an uphill battle. Might be worthwhile if it affected multiple other packages. |
That's not realistic. Every function would need to be considered separately, and updated in such a way that checks are frequent enough yet not too frequent. When there was a doubt about the frequency of particular interruption checks in the past, I did do benchmarking to tune it properly, and there were no performance issues with Mathematica's and Python's interrupt checkers. It's possible that the issue with R is coming from two sources:
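As an illustration of what tuning the check frequency can look like, the following is a minimal sketch of a counter-based throttle. It is not igraph's actual mechanism; `interrupt_throttle`, `counting_handler` and `handler_calls_for` are hypothetical names, and the right period would still have to be chosen per function by benchmarking.

```c
#include <stdbool.h>

/* Hypothetical helper: forwards only every `period`-th poll to the handler. */
typedef struct {
    unsigned period;               /* one real check per this many polls */
    unsigned countdown;            /* polls left until the next real check */
    bool (*handler)(void *data);   /* expensive check; returns true on interrupt */
    void *data;
} interrupt_throttle;

static void interrupt_throttle_init(interrupt_throttle *t, unsigned period,
                                    bool (*handler)(void *), void *data) {
    t->period = period > 0 ? period : 1;
    t->countdown = t->period;
    t->handler = handler;
    t->data = data;
}

/* Cheap in the common case: one decrement and one branch. */
static bool interrupt_throttle_poll(interrupt_throttle *t) {
    if (--t->countdown > 0) return false;
    t->countdown = t->period;
    return t->handler(t->data);
}

/* Example handler that only counts how often it is invoked. */
static bool counting_handler(void *data) {
    ++*(unsigned *)data;
    return false;
}

/* Polls `iterations` times and reports how many real checks were made. */
unsigned handler_calls_for(unsigned iterations, unsigned period) {
    unsigned calls = 0;
    interrupt_throttle t;
    interrupt_throttle_init(&t, period, counting_handler, &calls);
    for (unsigned i = 0; i < iterations; i++) {
        if (interrupt_throttle_poll(&t)) break;   /* interrupted: stop early */
    }
    return calls;
}
```

With a period of 1024, one million polls reach the handler 976 times. The drawback is the one raised above: a fixed period is only appropriate if the cost of one loop iteration is roughly known, which differs from function to function.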
I just looked specifically at the sources of https://github.com/igraph/igraph/blob/master/src/connectivity/components.c#L523 For each interruption check, there's an |
I did a small benchmark using |
We have a workaround here. What's the proposed course of action? How much can the C core do to reduce the number of calls? |
I don't think this is feasible in the short term. The only good solution I can see is to consider functions case by case and try to make improvements. That's a slow, gradual process.
As we discussed, I tried to use timing-based solutions, but I couldn't make it work well: timing functions are non-standard and their performance varies widely. There is nothing we can consistently rely on, trust that it works on most systems, and trust that it won't introduce a much worse performance issue on some exotic system.
I still think it's worth complaining to RStudio given that this is mostly an issue only with that system. It does not affect the Python or Mathematica interfaces, as those have much more performant interrupt checkers. It does affect plain R a little bit, but interruption checks with RStudio are over an order of magnitude slower than with plain R.
tl;dr Improvements are slow but ongoing in the C core. Keep the workaround in R for now. Bring this up with RStudio at some point. |
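For readers unfamiliar with the timing-based idea mentioned in this comment, here is a minimal sketch of how such a throttle could be structured. It is not the code that was tried; `timed_throttle`, `cpu_clock_us`, `fake_now_us` and `simulate_polls` are hypothetical names. The clock is passed in as a function pointer because, as the comment notes, there is no portable clock that is reliably cheap. The standard `clock()` is used only as a placeholder, and it measures CPU time rather than wall-clock time.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: run the real check at most once per time interval. */
typedef struct {
    long long min_interval_us;     /* minimum time between real checks */
    long long last_check_us;       /* clock reading at the last real check */
    long long (*now_us)(void);     /* clock source, in microseconds */
    bool (*handler)(void *data);   /* expensive check; returns true on interrupt */
    void *data;
} timed_throttle;

/* Placeholder clock from standard C; reports CPU time, not wall-clock time. */
long long cpu_clock_us(void) {
    return (long long)clock() * 1000000LL / (long long)CLOCKS_PER_SEC;
}

static void timed_throttle_init(timed_throttle *t, long long min_interval_us,
                                long long (*now_us)(void),
                                bool (*handler)(void *), void *data) {
    t->min_interval_us = min_interval_us;
    t->now_us = now_us;
    t->last_check_us = now_us();
    t->handler = handler;
    t->data = data;
}

/* Note that every poll still pays for one clock read. */
static bool timed_throttle_poll(timed_throttle *t) {
    long long now = t->now_us();
    if (now - t->last_check_us < t->min_interval_us) return false;
    t->last_check_us = now;
    return t->handler(t->data);
}

/* Deterministic fake clock and counting handler, for demonstration only. */
static long long fake_now_value = 0;
static long long fake_now_us(void) { return fake_now_value; }
static bool counting_handler(void *data) { ++*(unsigned *)data; return false; }

/* Advances the fake clock by step_us per poll and counts the real checks. */
unsigned simulate_polls(unsigned polls, long long step_us, long long min_interval_us) {
    unsigned calls = 0;
    timed_throttle t;
    fake_now_value = 0;
    timed_throttle_init(&t, min_interval_us, fake_now_us, counting_handler, &calls);
    for (unsigned i = 0; i < polls; i++) {
        fake_now_value += step_us;
        (void)timed_throttle_poll(&t);
    }
    return calls;
}
```

With 1000 polls spaced 1 ms apart and a 100 ms interval, the handler is reached 10 times. The sketch also shows the weakness described above: the saving depends entirely on `now_us` being much cheaper than the handler, which cannot be assumed across platforms.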
To bring this up with Posit/RStudio, a reproducible example would be nice. |
Can we, as a rule, offload all computation to another thread? |
Describe the bug
Interruption checks are very expensive, especially with RStudio.
Note that "interruption checks" don't just check for interruption. They also hand back control to the GUI event loop. E.g. on Windows the R GUI would lock up during computations without interruption checks.
To reproduce
Do not use a timing mechanism that runs the command multiple times, as is_connected is cached on the 0.10 branch.
With interruption checks disabled completely, this runs in 1.419 seconds.
Otherwise I get 2.982 s on the command line and 24.634 s in RStudio. With the R GUI, I get 3.649 s. All of these are on macOS / arm64.
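The cost attributable to the checks themselves follows from subtracting the 1.419 s baseline from each measurement. The sketch below spells out that arithmetic; the function names are hypothetical, and it assumes that the whole difference from the baseline is due to interruption checks.

```c
/* Seconds attributable to interruption checks: total minus the no-check baseline. */
double check_overhead_s(double total_s, double baseline_s) {
    return total_s - baseline_s;
}

/* How many times more expensive the checks are in environment A than in B. */
double overhead_ratio(double total_a_s, double total_b_s, double baseline_s) {
    return check_overhead_s(total_a_s, baseline_s)
         / check_overhead_s(total_b_s, baseline_s);
}
```

With the numbers above, the checks cost about 1.56 s on the command line, 2.23 s in the R GUI and 23.2 s in RStudio. `overhead_ratio(24.634, 2.982, 1.419)` is about 14.9, so the checks are roughly 15 times more expensive in RStudio than on the command line.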
These measurements are with the 0.10 branch, but main is similar.
Version information
Issue present on both the main and igraph-0.10 branches.