Fix handling of late-discovered atomic lazy loops in compiler / source generator #117629
…e generator

Lazy loops can be made automatically atomic in some situations by the optimizer, in which case their handling simplifies significantly, because a lazy atomic loop just becomes a repeater for the min iteration count. Most viable lazy loops are caught by the optimizer, but some aren't, yet are still determined to be treatable as atomic at emit time. EmitLazy was handling such cases incorrectly, resulting in a missing branch target and compilation failing. This fixes that in two ways:
1. The optimizer is improved, so the discovered test cases that were triggering this case no longer do.
2. The distinction is eliminated from EmitLazy, as the case is rare and it would take a lot more code to optimize for that case.
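The claim that a lazy atomic loop behaves like a repeater for the min iteration count can be illustrated with a small sketch. This is not the .NET engine or the code touched by this PR: it uses Python's `re` module as a stand-in, and the helper names `atomic`, `matches`, and `behaviour` are hypothetical. Atomicity is emulated with a lookahead plus backreference, on the assumption (true for Python's `re`) that a lookahead is not re-entered on backtracking.

```python
import re

def atomic(subpattern: str) -> str:
    # Emulate an atomic group: the lookahead captures the first way the
    # subpattern matches, and the backreference then consumes exactly that
    # text. Because the lookahead is never re-entered on backtracking, the
    # loop cannot later be asked for more iterations.
    # Assumes `subpattern` has no capture groups of its own.
    return rf"(?=({subpattern}))\1"

LAZY = r"a{2,5}?b"                      # lazy loop; may backtrack to take more 'a's
ATOMIC_LAZY = atomic(r"a{2,5}?") + "b"  # lazy loop that cannot be re-entered
REPEATER = r"a{2}b"                     # fixed repeat of the minimum count

INPUTS = ["ab", "aab", "aaab", "aaaaab"]

def matches(pattern: str, text: str) -> bool:
    # Anchor at both ends so the engine cannot simply shift the start position.
    return re.fullmatch(pattern, text) is not None

def behaviour(pattern: str) -> list:
    return [matches(pattern, text) for text in INPUTS]

if __name__ == "__main__":
    for name, pattern in [("lazy", LAZY), ("atomic lazy", ATOMIC_LAZY), ("repeater", REPEATER)]:
        print(f"{name:12} {behaviour(pattern)}")
```

On these inputs the plain lazy loop accepts any count from two to five, while the atomic lazy loop and the fixed repeater accept only the input with exactly two `a`s, which is why an emitter can treat the atomic case as a simple repeat.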
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Pull Request Overview
Fixes handling of late-discovered atomic lazy loops by improving the optimizer and removing special‐case logic in lazy emission.
- Adds new functional tests for various lookbehind and lazy-loop scenarios.
- Refactors RegexNode optimizations to consistently use rootNode.Options and apply RTL checks per case.
- Simplifies EmitLazy in both compiler and generated emitter by removing the isAtomic branch.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
Regex.Match.Tests.cs | Added tests covering edge cases in lookbehinds and lazy loops. |
RegexNode.cs | Updated FinalOptimize and EliminateEndingBacktracking guards and RTL logic. |
RegexCompiler.cs | Removed isAtomic checks and assertion in EmitLazy. |
RegexGenerator.Emitter.cs | Mirrored compiler changes in generated emitter, removed isAtomic branch. |
Comments suppressed due to low confidence (2)
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexNode.cs:414
- By only checking for NonBacktracking here, RTL loops will still enter backtracking elimination and may receive Oneloop/Setloop atomic optimizations that haven’t been vetted for RTL. Consider restoring the RTL guard or adding per-case RTL checks to prevent incorrect behavior in right-to-left mode.
(Options & RegexOptions.NonBacktracking) != 0)
/ba-g browser wasm timeouts
Best reviewed without whitespace (indentation changed on a few large sections).
Fixes #117601