Short repeated regex patterns can skip signal handling #109631
Comments
I don't think we guarantee exactly when signals are handled, as long as they are handled eventually (meaning, reasonably promptly in pure-Python code). For example, on recent Python versions, the following function will only ever return one value of x, no matter when the interrupt arrives:

    def f():
        try:
            while True:
                x = 0
                x = 1
                x = 2
                x = 3
        except:
            return x

Same idea in your code: even if the signal was delivered during the findall call, it may not be handled until a later statement.

So I'm not sure that this is a bug, but it does look like something about the way this pattern runs is bypassing the signal handling in the matching engine (if I had to guess, it's probably because we're repeating a single small operation instead of dispatching between multiple operations). That's probably worth improving.

@serhiy-storchaka, thoughts?
Ah, I think I misunderstood the cause. We're performing many short matches, and none of them actually increment the counter enough to trigger the signal handling branch in the matching engine?
Counting for signal checking now continues in new match from the point where it ended in the previous match instead of starting from 0.
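To make the counter behaviour described above concrete, here is a toy model in pure Python. It is only a sketch: `CHECK_INTERVAL`, `Matcher`, and `count_checks` are hypothetical names and the interval value is made up; none of this is taken from the actual C implementation of the regex engine. It compares resetting the step counter at the start of every match with carrying it over between matches.

```python
# Toy model only: CHECK_INTERVAL and all names here are hypothetical,
# not identifiers or values from CPython's regex engine.
CHECK_INTERVAL = 1000  # engine steps between signal checks (made-up value)


class Matcher:
    """Counts engine steps and records how often a signal check would run."""

    def __init__(self, reset_per_match):
        self.reset_per_match = reset_per_match
        self.counter = 0
        self.checks = 0

    def match(self, steps):
        """Simulate one match that takes `steps` engine steps."""
        if self.reset_per_match:
            self.counter = 0  # old behaviour: every match starts from 0
        for _ in range(steps):
            self.counter += 1
            if self.counter >= CHECK_INTERVAL:
                self.counter = 0
                self.checks += 1  # a real engine would poll for signals here


def count_checks(reset_per_match, matches=10_000, steps_per_match=3):
    """Run many short matches and return how many signal checks happened."""
    matcher = Matcher(reset_per_match)
    for _ in range(matches):
        matcher.match(steps_per_match)
    return matcher.checks


print(count_checks(reset_per_match=True))   # 0: no match ever reaches the interval
print(count_checks(reset_per_match=False))  # 30: 30,000 steps / 1,000 per check
```

In this model, many short matches with a per-match reset never reach the check at all, while a carried-over counter checks at a steady rate regardless of how short each match is.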
There are two things:
So the only issue is (1). #109867 fixes it. Thank you for your report @pan324. It was an interesting issue.
…H-109867) (GH-109885) Counting for signal checking now continues in new match from the point where it ended in the previous match instead of starting from 0. (cherry picked from commit 8ac2085) Co-authored-by: Serhiy Storchaka
Thanks! After posting I had gotten a bit anxious about whether it actually qualified as a bug. Thank you both for the good info on signals and interruptibility!
…H-109867) (#109886) gh-109631: Allow interruption of short repeated regex matches (GH-109867) Counting for signal checking now continues in new match from the point where it ended in the previous match instead of starting from 0. (cherry picked from commit 8ac2085) Co-authored-by: Serhiy Storchaka
Bug report
Bug description:
I mentioned regex but this probably applies to other modules as well. In the code below, if the # line right after the findall is commented out (as it is right now), the code raises the exception past the try-except block, on the print("Safely reached EOF.") statement. The exception happens one statement later than expected. NB: I tested all Python versions on Windows but only 3.11 on Linux.
CPython versions tested on:
3.9, 3.10, 3.11, 3.12, CPython main branch
Operating systems tested on:
Linux, Windows
Linked PRs