Skip to content

Improve import performance of assertion rewrite. Fixes #3918. #3919

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 5, 2018

Conversation

fabioz
Copy link
Contributor

@fabioz fabioz commented Aug 31, 2018

Some notes:

  • On my use case there are many imports (around 3000 imports) and there are many entries in sys.path (67 to be more exact) -- this makes imp.find_import very slow (it does way too many calls to os.stat in the scenario) -- on my use case imports are around 30% faster with my patch in place in Py 2 and 40% faster on Py 3 on my measurements (granted my case is probably an edge case for imp.find_import).

  • I took a look at https://www.python.org/dev/peps/pep-0302/ and it says the following:

    The following set of methods may be implemented if support for (for example) Freeze-like tools is desirable. It consists of three additional methods which, to make it easier for the caller, each of which should be implemented, or none at all:

    loader.is_package(fullname)
    loader.get_code(fullname)
    loader.get_source(fullname)

    So, as the loader had is_package and not the others I simply removed is_package and things still seem to be working -- which is good because it was also calling imp.find_module (only on Python 3, it seems it was unused on Python 2 -- which also means imports were slower on Python 3 than on Python 2).

Fix #3918

Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @fabioz, thanks!

@fabioz fabioz force-pushed the master branch 2 times, most recently from f40c9d8 to d53e449 Compare August 31, 2018 15:23
Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the tests and added unittests for the early bailout.

It would be nice to have a second set of eyes from other maintainers over this change. 👍

@codecov
Copy link

codecov bot commented Sep 1, 2018

Codecov Report

Merging #3919 into master will decrease coverage by 1.24%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3919      +/-   ##
==========================================
- Coverage   96.24%   94.99%   -1.25%     
==========================================
  Files         108      108              
  Lines       23353    23437      +84     
  Branches        0     2332    +2332     
==========================================
- Hits        22475    22265     -210     
- Misses        878      881       +3     
- Partials        0      291     +291
Flag Coverage Δ
#doctesting 28.68% <53.4%> (-3.2%) ⬇️
#nobyte 90.05% <100%> (-3.47%) ⬇️
#numpy 28.32% <51.13%> (-3.11%) ⬇️
#pexpect 47.83% <72.34%> (-5.75%) ⬇️
#pluggymaster 93.74% <100%> (-1.33%) ⬇️
#py27 93.04% <100%> (-1.57%) ⬇️
#py34 92.28% <100%> (-1.6%) ⬇️
#py35 92.29% <100%> (-1.6%) ⬇️
#py36 93.29% <100%> (-1.54%) ⬇️
#py37 92.46% <100%> (-1.6%) ⬇️
#trial 31.25% <51.13%> (-2.85%) ⬇️
#xdist 93.22% <100%> (-1.42%) ⬇️
Impacted Files Coverage Δ
src/_pytest/assertion/rewrite.py 96% <100%> (-1.17%) ⬇️
testing/test_assertrewrite.py 83.78% <100%> (+0.02%) ⬆️
src/_pytest/main.py 96.19% <100%> (-2.03%) ⬇️
testing/freeze/runtests_script.py 0% <0%> (-25%) ⬇️
testing/freeze/tox_run.py 0% <0%> (-14.29%) ⬇️
testing/freeze/create_executable.py 0% <0%> (-12.5%) ⬇️
src/_pytest/assertion/__init__.py 86.36% <0%> (-10.61%) ⬇️
testing/test_modimport.py 80% <0%> (-10%) ⬇️
testing/test_argcomplete.py 65.62% <0%> (-7.82%) ⬇️
src/_pytest/_code/_py2traceback.py 85.71% <0%> (-7.15%) ⬇️
... and 61 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 019e33e...eec7081. Read the comment docs.

@nicoddemus nicoddemus changed the title Fix import performance on assertion rewrite. Fixes #3918. Improve import performance on assertion rewrite. Fixes #3918. Sep 1, 2018
@nicoddemus nicoddemus changed the title Improve import performance on assertion rewrite. Fixes #3918. Improve import performance of assertion rewrite. Fixes #3918. Sep 1, 2018
@coveralls
Copy link

coveralls commented Sep 1, 2018

Coverage Status

Coverage increased (+0.03%) to 94.081% when pulling eec7081 on fabioz:master into 019e33e on pytest-dev:master.

Copy link
Contributor Author

@fabioz fabioz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the def is_package(self, name): was added back, it should probably use the new self._imp_find_module instead of imp.find_module.

Copy link
Contributor Author

@fabioz fabioz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, imp.find_module is actually deprecated (seeing 3.6 docs) -- if it's only going to be used for imp.find_module, its contents can probably be extracted just for the PKG_DIRECTORY case (there's a pure-python version which may be seen in imp.find_module).

@nicoddemus
Copy link
Member

As the def is_package(self, name): was added back, it should probably use the new self._imp_find_module instead of imp.find_module.

I added it back because a test was failing without it. Good catch about is_package using imp.find_module, just changed it. 👍

@nicoddemus
Copy link
Member

Also, imp.find_module is actually deprecated (seeing 3.6 docs) -- if it's only going to be used for imp.find_module, its contents can probably be extracted just for the PKG_DIRECTORY case (there's a pure-python version which may be seen in imp.find_module).

Yeah, as we talked in person, it is nightmarish to keep this code working across all Python versions we support, so we are kind of forced to use the deprecated module. We will probably revisit the rewriting hook completely when we can drop Python 2 in a year or so. 😁

@fabioz
Copy link
Contributor Author

fabioz commented Sep 3, 2018

Also, imp.find_module is actually deprecated (seeing 3.6 docs) -- if it's only going to be used for imp.find_module, its contents can probably be extracted just for the PKG_DIRECTORY case (there's a pure-python version which may be seen in imp.find_module).

Yeah, as we talked in person, it is nightmarish to keep this code working across all Python versions we support, so we are kind of forced to use the deprecated module. We will probably revisit the rewriting hook completely when we can drop Python 2 in a year or so. 😁

Well, in this case I was actually talking about the is_package use case (not the other case) -- and in this situation it seems to be used only on Python 3 ;)

@nicoddemus
Copy link
Member

Well, in this case I was actually talking about the is_package use case (not the other case) -- and in this situation it seems to be used only on Python 3 ;)

Oh sorry, I did not catch the entirety of the message, my bad. Please feel free to experiment and push other commits here if you want. 👍

@fabioz
Copy link
Contributor Author

fabioz commented Sep 5, 2018

Well, in this case I was actually talking about the is_package use case (not the other case) -- and in this situation it seems to be used only on Python 3 ;)

Oh sorry, I did not catch the entirety of the message, my bad. Please feel free to experiment and push other commits here if you want. 👍

I'll do that later on (I'd like to get this one merged first and then experiment more on that idea afterwards as one improvement is not dependent on the other).

@nicoddemus
Copy link
Member

Sounds good!

@RonnyPfannschmidt RonnyPfannschmidt merged commit 410d576 into pytest-dev:master Sep 5, 2018
@RonnyPfannschmidt
Copy link
Member

@thedrow pytest is already internally using pathlib and keeps adding more uses (using the backports in pathlib2 for python <3.6) - so we can do this right now

@jeffwidman
Copy link
Contributor

@thedrow / @RonnyPfannschmidt was the pathlib optimization ever implemented?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants