-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
config: avoid stat storm in _getconftestmodules #9532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# _getconftestmodules()'s call to _get_directory() causes a stat | ||
# storm when it's called potentially thousands of times in a test | ||
# session (#9478), often with the same path, so cache it. | ||
self._get_directory = lru_cache(256)(_get_directory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to remove the bounds here, using None
instead of 256?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just remembered I forgot to answer this.
I think it's always good to have limits on lru caches to avoid them growing too large. We can increase it though if it's too low for someone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Any opinions whether this should be in 7.0.0 (due to #9478) or 7.0.1? I'd like to avoid adding more to the pile, but given that this is part of solving a regression and seems simple enough I guess we can take it in. |
I think it should go into 7.0.0, as it is simple and fixes a regression, like you commented. 👍 |
Looks like the backport failed due to conflicts: https://github.com/pytest-dev/pytest/runs/4938147553?check_suite_focus=true |
config: avoid stat storm in _getconftestmodules (cherry picked from commit 5c69ece) Conflicts: src/_pytest/config/__init__.py
Fix #9478.
I'll also say Closes #9125 since it shouldn't be that slow anymore (also with next version of pluggy). I ran a profile on pandas collection with this PR and
gethookproxy
(which #9125 caches) comes out at ~1.5s out of ~30s.I've looked at various ways to speed this up, but they were either too ugly/intrusive or not as effective. In the end, the conftest data structure
_dirpath2confmods
is pretty efficient for what it does, and the API with theis_file
/parent
is mostly unavoidable given that. So a cache it is.Note that this cache is safe (assuming a path does not change from file <-> dir during the run, which is already assumed), unlike the previous caches which worked on the entire
_getconftestmodules
function.