-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-120317: Lock around global state in the tokenize module #120318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lysnikolaou
commented
Jun 10, 2024
•
edited by bedevere-app
bot
Loading
edited by bedevere-app
bot
- Issue: Using internal tokenize module's TokenizerIter in multiple threads crashes #120317
78df829
to
abf568a
Compare
This LGTM. Can you add the reproducer to |
@Fidget-Spinner I added more locking around global state. Reviewing each commit seprately might make it easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a refactor, so good so far.
Feedback addressed. @Fidget-Spinner I used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to ignore the following comments if this part of the tokenizer is not perf sensitive.
All feedback addressed. Thanks a lot for the reviews and the help @Fidget-Spinner and @erlend-aasland! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
|
||
@threading_helper.requires_working_threading() | ||
class TestTokenize(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested this locally: it crashes spectacularly without the fix :)
Python/Python-tokenize.c
Outdated
@@ -194,7 +252,13 @@ tokenizeriter_next(tokenizeriterobject *it) | |||
} | |||
if (it->done || type == ERRORTOKEN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to protect the read? Otherwise we may get weird partial reads no?
Actually, one question here: shouldn't we be protecting all reads of the |
For example, this read is not protected, no?:
|
20c7cb2
to
1c51daf
Compare
Co-authored-by: Pablo Galindo <[email protected]>
1c51daf
to
ca466fa
Compare
After some offline discussion with @pablogsal, we decided it makes more sense to just lock around all of |
try: | ||
r = next(it) | ||
tokens.append(tokenize.TokenInfo._make(r)) | ||
time.sleep(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this test faster? It's going to run on every CI job. I think we should aim for 0.1 seconds or less for these sorts of tests. (It's currently ~3 seconds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the amount of time each thread sleeps to 0.03 secs, which means that the whole test now takes about 0.1 seconds. Also verified that the test still fails without the fix and succeeds with it.
@pablogsal Can you do a final review here when you have some spare cycles? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the PR and the comments and it LGTM
I also ran the tests locally in free threaded mode to confirm
Thanks @lysnikolaou for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13. |
…honGH-120318) (cherry picked from commit 8549559) Co-authored-by: Lysandros Nikolaou <[email protected]> Co-authored-by: Pablo Galindo <[email protected]>
GH-121841 is a backport of this pull request to the 3.13 branch. |
…-120318) (#121841) (cherry picked from commit 8549559) Co-authored-by: Lysandros Nikolaou <[email protected]> Co-authored-by: Pablo Galindo <[email protected]>
…hon#120318) Co-authored-by: Pablo Galindo <[email protected]>