E0401 (import-error) checks perform a lot of repeated stat calls #9310

@correctmost

Bug description

I run pylint on a repo that is mounted via SSHFS, which leads to slow I/O speeds.

While profiling a run, I noticed that the import-error checks perform a lot of repeated stat calls because they probe for files with suffixes such as .py, .pyc, .so, and .cpython-311-x86_64-linux-gnu.so.

Many of these presence checks are repeated, so I'm wondering if it would be possible to improve performance by eliminating repeated checks or caching the results of previous calls.
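To make the caching idea concrete, here is a minimal sketch that memoizes file-presence checks with `functools.lru_cache`. The names `cached_isfile` and `find_first_existing` are hypothetical and are not astroid's API; the sketch also assumes the files on disk do not change during a run, since a cached result would otherwise go stale.

```python
import functools
import os
from typing import Optional, Sequence


@functools.lru_cache(maxsize=None)
def cached_isfile(path: str) -> bool:
    """Return os.path.isfile(path), touching the filesystem once per path."""
    return os.path.isfile(path)


def find_first_existing(
    directory: str, module: str, suffixes: Sequence[str]
) -> Optional[str]:
    """Probe `directory` for `module` with each suffix (hypothetical helper).

    Repeated lookups of the same candidate path are answered from the cache
    instead of issuing another stat call.
    """
    for suffix in suffixes:
        candidate = os.path.join(directory, module + suffix)
        if cached_isfile(candidate):
            return candidate
    return None
```

`cached_isfile.cache_info()` reports hits and misses, which gives a rough measure of how many stat calls such a cache would save on a given run.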

I have prepared a repo that illustrates the issue. (The example repo contains ~60 files, whereas the repo I noticed the performance issue with contains ~2000 files.)

I noticed that pylint's performance can be improved by adding "missing" __init__.py files to the repo, but I'm hoping pylint itself can be tuned to increase performance even further.
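For anyone who wants to try the `__init__.py` workaround, the following sketch lists directories that contain `.py` files but no `__init__.py`. The function name `dirs_missing_init` is hypothetical, and the rule it applies (any directory holding Python files) is an assumption about which directories matter. Directories without `__init__.py` may be intentional namespace packages, so review the list before adding files.

```python
import os
from typing import Iterator


def dirs_missing_init(root: str) -> Iterator[str]:
    """Yield directories under `root` that hold .py files but no __init__.py."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories such as .git so they are not walked.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            yield dirpath
```

Calling `sorted(dirs_missing_init("src"))` from the repository root would print the candidates in a stable order.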

Configuration

[MAIN]
jobs=1

[MESSAGES CONTROL]
disable=all
enable=E0401

[REPORTS]
reports=no
score=no

Command used

Steps to reproduce

git clone --branch import-error-stats https://github.com/correctmost/pylint-corpus.git
cd pylint-corpus

python ./profile_pylint.py
head -n 20 profiler_stats

Analysis

Notice that one of the top results is for posix.stat:

--> 27668    0.185    0.000    0.187    0.000 {built-in method posix.stat}

posix.stat is called by isfile, which is called most often by find_module in astroid:

--> <frozen genericpath>:27(isfile) <-   15128    0.044    1.282  astroid/interpreter/_import/spec.py:129(find_module)
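A caller breakdown like the one above can be produced with the standard `cProfile` and `pstats` modules. The sketch below is a generic illustration, not the contents of `profile_pylint.py`; the `probe` workload and the `callers_report` helper are hypothetical stand-ins for a real pylint run.

```python
import cProfile
import io
import os
import pstats
from typing import Iterable


def probe(paths: Iterable[str]) -> int:
    """Stand-in workload: count how many paths are regular files."""
    found = 0
    for path in paths:
        if os.path.isfile(path):
            found += 1
    return found


def callers_report(profile: cProfile.Profile, pattern: str) -> str:
    """Return the pstats 'callers' table for functions matching `pattern`."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile, stream=buffer)
    # The restriction is a regular expression matched against function names.
    stats.sort_stats("calls").print_callers(pattern)
    return buffer.getvalue()


profiler = cProfile.Profile()
profiler.enable()
probe(["/nonexistent/a.py", "/nonexistent/a.pyc"] * 50)
profiler.disable()
report = callers_report(profiler, "isfile")
print(report)
```

The printed table lists each function that called `isfile` together with its call count, which is how `find_module` shows up as the dominant caller in the real profile.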

There is evidence of repeated stats from strace:

$ strace -e trace=%%stat python profile_pylint.py 2>&1 | sort | uniq -c | sort -nr | less

1314 newfstatat(AT_FDCWD, "pylint-corpus/src/__init__.py", {st_mode=S_IFREG|0644, st_size=0, ...}, 0) = 0
 904 newfstatat(AT_FDCWD, "pylint-corpus/src/resources/__init__.pyc", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 904 newfstatat(AT_FDCWD, "pylint-corpus/src/resources/__init__.py", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 811 newfstatat(AT_FDCWD, "pylint-corpus/src/resources", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
 710 newfstatat(AT_FDCWD, "pylint-corpus", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
 553 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
 552 newfstatat(AT_FDCWD, "pylint-corpus/src/__init__.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 552 newfstatat(AT_FDCWD, "pylint-corpus/src/__init__.cpython-311-x86_64-linux-gnu.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 552 newfstatat(AT_FDCWD, "pylint-corpus/src/__init__.abi3.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src.pyc", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src.py", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src/__init__.pyi", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src/__init__.pyc", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src/__init__.py", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src.cpython-311-x86_64-linux-gnu.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
 550 newfstatat(AT_FDCWD, "pylint-corpus/src/sites/hierarchy/cat1/subcat1/src.abi3.so", 0x7ffd4b370690, 0) = -1 ENOENT (No such file or directory)
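The same tally can be computed in Python from a saved strace log, which makes it easy to total the redundant calls. This is a sketch under stated assumptions: the regular expression only recognizes `newfstatat`, `stat`, and `lstat` lines in the format shown above, and the helper names are hypothetical. Unlike `sort | uniq -c`, it groups by path alone and ignores the return value.

```python
import re
from collections import Counter
from typing import Iterable

# Captures the first quoted argument (the path) of a stat-family syscall line.
STAT_LINE = re.compile(r'^(?:newfstatat|stat|lstat)\(.*?"([^"]*)"')


def count_stat_paths(lines: Iterable[str]) -> Counter:
    """Count how many times each path appears in stat-family strace lines."""
    counts: Counter = Counter()
    for line in lines:
        match = STAT_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def redundant_calls(counts: Counter) -> int:
    """Number of calls beyond the first for each path."""
    return sum(counts.values()) - len(counts)
```

Feeding it the lines of a log captured with `strace -o` and calling `counts.most_common(20)` gives a view comparable to the listing above.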

Pylint output

There is no output, just reduced performance

Expected behavior

Improved performance via caching or reduced file-presence checks

Pylint version

pylint 3.0.3
astroid 3.0.2
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801]

OS / Environment

Arch Linux

Additional dependencies

No response

Metadata

    Labels

    Enhancement ✨ (Improvement to a component), Needs PR (This issue is accepted, sufficiently specified and now needs an implementation), performance
