Exceptions slow in 3.11, depending on location #109181

pochmann · 2023-09-09T12:35:57Z

Bug report

Bug description:

Consider these two functions:

def short():
    try:
        if 0 == 1:
            unreached
        raise RuntimeError
    except RuntimeError:
        pass

def long():
    try:
        if 0 == 1:
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
        raise RuntimeError
    except RuntimeError:
        pass

The only difference is that long() has 100 unreached statements instead of just one. But it takes much longer in Python 3.11 (and a bit longer in Python 3.10). Times from @jamestwebber (here):

Python: 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:34:09) [GCC 12.3.0]
 176.5 ±  0.4 ns  short
 644.7 ±  0.6 ns  long

Python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
 150.7 ±  0.1 ns  short
 167.0 ±  0.2 ns  long

Why? Shouldn't it just jump over them all and be just as fast as short()?

Benchmark script

Attempt This Online!):

from timeit import timeit
from time import perf_counter as time
from statistics import mean, stdev
import sys


def short():
    try:
        if 0 == 1:
            unreached
        raise RuntimeError
    except RuntimeError:
        pass


def long():
    try:
        if 0 == 1:
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
            unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached; unreached
        raise RuntimeError
    except RuntimeError:
        pass


funcs = short, long

for _ in range(3):
    times = {f: [] for f in funcs}
    def stats(f):
        ts = [t * 1e9 for t in sorted(times[f])[:5]]
        return f'{mean(ts):6.1f} ± {stdev(ts):4.1f} ns '
    for _ in range(100):
        for f in funcs:
            t = timeit(f, number=10**4) / 1e4
            times[f].append(t)
    for f in sorted(funcs, key=stats):
        print(stats(f), f.__name__)
    print()

print('Python:', sys.version)

In fact it takes time linear in how many unreached statements there are. Times for 100 to 100000 unreached statements (on one line, before the try):

   100     2.6 μs
  1000    24.3 μs
 10000   253.3 μs
100000  2786.2 μs

Benchmark script

from time import perf_counter as time
from timeit import repeat

for e in range(2, 6):
    n = 10 ** e
    exec(f'''def f():
        if 0 == 1:
            {'unreached;' * n}
        try:
            raise RuntimeError
        except RuntimeError:
            pass''')
    number = 10**6 // n
    t = min(repeat(f, number=number)) / number
    print(f'{n:6} {t * 1e6 :7.1f} μs')

Attempt This Online!

The slowness happens when the unreached statements are anywhere before the raise, and not when they're anywhere after the raise (demo). So it seems what matters is location of the raise in the function. Long code before it somehow makes it slow.

This has a noticeable impact on real code I wrote (assuming I pinpointed the issue correctly): two solutions for a task, and one was oddly slower (~760 vs ~660 ns) despite executing the exact same sequence of bytecode operations. Just one jump length differed, leading to a raise at a larger address.

Benchmark script with those two solutions and the relevant test case:

The functions shall return the one item from the iterable, or raise an exception if there are fewer or more than one. Testing with an empty iterable, both get the iterator, iterate it (nothing, since it's empty), then raise. The relevant difference appears to be that the slower one has the raise written at the bottom, whereas the faster one has it near the top.

Sample times:

 664.4 ±  8.6 ns  one_sequential
 762.1 ± 28.8 ns  one_nested

Python: 3.11.4 (main, Jun 24 2023, 10:18:04) [GCC 13.1.1 20230429]

Code:

from timeit import timeit
from statistics import mean, stdev
from itertools import repeat, starmap, islice
import sys


def one_nested(iterable, too_short=None, too_long=None):
    it = iter(iterable)
    for first in it:
        for second in it:
            raise too_long or ValueError(
                'Expected exactly one item in iterable, but',
                f'got {first!r}, {second!r}, and perhaps more.'
            )
        return first
    raise too_short or ValueError('too few items in iterable (expected 1)')


def one_sequential(iterable, too_short=None, too_long=None):
    it = iter(iterable)

    for first in it:
        break
    else:
        raise too_short or ValueError('too few items in iterable (expected 1)')

    for second in it:
        raise too_long or ValueError(
            'Expected exactly one item in iterable, but '
            f'got {first!r}, {second!r}, and perhaps more.'
        )

    return first


funcs = one_nested, one_sequential

def empty(f):
    iterable = iter(())
    too_short = RuntimeError()
    for _ in repeat(None, 10**4):
        try:
            f(iterable, too_short)
        except RuntimeError:
            pass

for case in empty,:

    times = {f: [] for f in funcs}
    def stats(f):
        ts = [t * 1e9 for t in sorted(times[f])[:5]]
        return f'{mean(ts):6.1f} ± {stdev(ts):4.1f} ns '
    for _ in range(100):
        for f in funcs:
            t = timeit(lambda: case(f), number=1) / 1e4
            times[f].append(t)
    for f in sorted(funcs, key=stats):
        print(stats(f), f.__name__)
    print()

print('Python:', sys.version)

Attempt This Online!

CPython versions tested on:

3.10, 3.11

Operating systems tested on:

Linux, macOS

Linked PRs

The text was updated successfully, but these errors were encountered:

sunmy2019 · 2023-09-09T18:14:38Z

bisect shows the regression starts at #26955 for the implementation of pep-0657
PEP 657 – Include Fine Grained Error Locations in Tracebacks

The extra time must be caused by finding the error locations.

sunmy2019 · 2023-09-09T18:17:26Z

They use an O(n) implementation internally.

nielstron · 2023-09-11T12:03:50Z

Just for the sake of completeness, the length of unreachable code without a raise statement does not influence the computation time.

Benchmark script

from time import perf_counter as time
from timeit import repeat

for e in range(2, 6):
    n = 10 ** e
    exec(f'''def f():
        if 0 == 1:
            {'unreached;' * n}
        return''')
    number = 10**6 // n
    t = min(repeat(f, number=number)) / number
    print(f'{n:6} {t * 1e6 :7.1f} μs')

By comparing the performance record between the two we can see that Addr2Line is the culprit

Perf screenshots

And indeed, when commenting out the corresponding loops I see a reduction in the runtime

Same test script as initally provided, now with commented out loop

   100     0.2 μs
  1000     0.2 μs
 10000     0.2 μs
100000     0.2 μs
1000000     0.3 μs

vs

   100     0.6 μs
  1000     4.7 μs
 10000    46.0 μs
100000   495.2 μs
1000000  5049.9 μs

More specifically, it seems most time is spent looking for 0 bytes here. This is about where my current understanding of what this code even does ends :)

nielstron · 2023-09-11T12:19:38Z

@AlexWaygood I am also experiencing this slowdown in python 3.10.12 (since you added 3.11, 3.12 and 3.13)

AlexWaygood · 2023-09-11T12:24:27Z

@AlexWaygood I am also experiencing this slowdown in python 3.10.12 (since you added 3.11, 3.12 and 3.13)

Python 3.10 is old enough that it is now only accepting bugfixes if they relate to security issues, and this isn't a security issue. For us, therefore, this is only an issue for Python 3.11+, even if it can be reproduced on Python 3.10.

iritkatriel · 2023-09-15T20:36:05Z

@pablogsal and thoughts on this?

pablogsal · 2023-09-16T01:01:10Z

@pablogsal and thoughts on this?

I am currently on sick leave so I cannot look at this in detail but I feel this is the balance between the compression to save disk size and the speed to "uncompress" the line table.

On the other hand maybe we could defer the computation of the line table offsets only when exceptions buble up to top level, but it may be too late at that point so we may need to save extra information and maybe even the code object if is not reachable already in all cases.

In any case, I don't think this is a bug. It may qualify as a regression if we all agree but I am not even sure if I would call it that, since is just a consequence of the feature.

sunmy2019 · 2023-09-16T14:04:52Z

I agree it's not a bug. Extra cost for new features improving error messages is fine.

Someone may want a switch to disable this feature, but I would argue that you should rather keep the try ... except ... block as short as possible.

pablogsal · 2023-09-16T14:25:14Z

I agree it's not a bug. Extra cost for new features improving error messages is fine.

Someone may want a switch to disable this feature, but I would argue that you should rather keep the try ... except ... block as short as possible.

I think we can make that work with -X no_debug_ranges or PYTHONNODEBUGRANGES

… the line number

… the line number Signed-off-by: Pablo Galindo <[email protected]>

…ine number (#111548) Signed-off-by: Pablo Galindo <[email protected]>

…compute the line number (pythonGH-111548) Signed-off-by: Pablo Galindo <[email protected]>. (cherry picked from commit abb1542) Co-authored-by: Pablo Galindo Salgado <[email protected]>

pablogsal · 2023-10-31T15:19:03Z

Backporting to 3.12 and 3.11 as technically this is a regression

…compute the line number (pythonGH-111548) Signed-off-by: Pablo Galindo <[email protected]>. (cherry picked from commit abb1542) Co-authored-by: Pablo Galindo Salgado <[email protected]>

…compute the line number (pythonGH-111548) Signed-off-by: Pablo Galindo <[email protected]>. (cherry picked from commit abb1542) Co-authored-by: Pablo Galindo Salgado <[email protected]> Signed-off-by: Pablo Galindo <[email protected]>

…e the line number (GH-111548) (#111550) . (cherry picked from commit abb1542)

…lazily compute the line number (pythonGH-111548)

…e the line number (GH-111548) (#111551) . (cherry picked from commit abb1542)

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

vstinner · 2023-11-10T12:11:19Z

Backporting to 3.12 and 3.11 as technically this is a regression

These changes introduced references leaks in 3.11 and 3.12 branches: see issue gh-111947.

PyFrame_GetCode() returns a strong reference.

PyFrame_GetCode() returns a strong reference. (cherry picked from commit 4b0c875) Co-authored-by: Victor Stinner <[email protected]>

…111951) [3.12] gh-109181: Fix refleak in tb_get_lineno() (GH-111948) PyFrame_GetCode() returns a strong reference. (cherry picked from commit 4b0c875) Co-authored-by: Victor Stinner <[email protected]>

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

Fix for python/cpython#109181 introduced lazily computed lineno for traceback object in 3.11.7 and 3.12.1. Tested in a number of Python versions, the change seems to be safe.

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

pochmann added the type-bug An unexpected behavior, bug, or error label Sep 9, 2023

terryjreedy added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Sep 9, 2023

AlexWaygood added 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes labels Sep 11, 2023

bedevere-app bot mentioned this issue Oct 31, 2023

gh-109181: Speed up Traceback object creation by lazily compute the line number #111548

Merged

pablogsal added a commit to pablogsal/cpython that referenced this issue Oct 31, 2023

pythongh-109181: Speed up Traceback object creation by lazily compute…

ca00234

… the line number

pablogsal added a commit to pablogsal/cpython that referenced this issue Oct 31, 2023

pythongh-109181: Speed up Traceback object creation by lazily compute…

c86573e

… the line number Signed-off-by: Pablo Galindo <[email protected]>

pablogsal added a commit that referenced this issue Oct 31, 2023

gh-109181: Speed up Traceback object creation by lazily compute the l…

abb1542

…ine number (#111548) Signed-off-by: Pablo Galindo <[email protected]>

bedevere-app bot mentioned this issue Oct 31, 2023

[3.11] gh-109181: Speed up Traceback object creation by lazily compute the line number (GH-111548) #111550

Merged

bedevere-app bot mentioned this issue Oct 31, 2023

[3.12] gh-109181: Speed up Traceback object creation by lazily compute the line number (GH-111548) #111551

Merged

pablogsal closed this as completed Oct 31, 2023

pablogsal added a commit that referenced this issue Oct 31, 2023

[3.11] gh-109181: Speed up Traceback object creation by lazily comput…

a4aa213

…e the line number (GH-111548) (#111550) . (cherry picked from commit abb1542)

pablogsal added a commit to pablogsal/cpython that referenced this issue Oct 31, 2023

fixup! [3.12] pythongh-109181: Speed up Traceback object creation by …

c482e32

…lazily compute the line number (pythonGH-111548)

pablogsal added a commit that referenced this issue Oct 31, 2023

[3.12] gh-109181: Speed up Traceback object creation by lazily comput…

4f619e8

…e the line number (GH-111548) (#111551) . (cherry picked from commit abb1542)

FullteaR pushed a commit to FullteaR/cpython that referenced this issue Nov 3, 2023

pythongh-109181: Speed up Traceback object creation by lazily compute…

6313a6a

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

vstinner mentioned this issue Nov 10, 2023

Many tests leak references in 3.11 and 3.12, such has test_cgitb: "Speed up Traceback object creation" regression #111947

Closed

vstinner added a commit to vstinner/cpython that referenced this issue Nov 10, 2023

pythongh-109181: Fix refleak in tb_get_lineno()

f2f63b7

PyFrame_GetCode() returns a strong reference.

bedevere-app bot mentioned this issue Nov 10, 2023

[3.12] gh-109181: Fix refleak in tb_get_lineno() #111948

Merged

vstinner added a commit that referenced this issue Nov 10, 2023

[3.12] gh-109181: Fix refleak in tb_get_lineno() (#111948)

4b0c875

PyFrame_GetCode() returns a strong reference.

bedevere-app bot mentioned this issue Nov 10, 2023

[3.11] [3.12] gh-109181: Fix refleak in tb_get_lineno() (GH-111948) #111951

Merged

aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024

pythongh-109181: Speed up Traceback object creation by lazily compute…

4d7ab54

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

jmao-denver mentioned this issue Aug 8, 2024

Get the correct computed tb lineno jpy-consortium/jpy#158

Merged

Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024

pythongh-109181: Speed up Traceback object creation by lazily compute…

a89cfc1

… the line number (python#111548) Signed-off-by: Pablo Galindo <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exceptions slow in 3.11, depending on location #109181

Exceptions slow in 3.11, depending on location #109181

pochmann commented Sep 9, 2023 •

edited by bedevere-app bot

Loading

sunmy2019 commented Sep 9, 2023

sunmy2019 commented Sep 9, 2023

nielstron commented Sep 11, 2023 •

edited

Loading

nielstron commented Sep 11, 2023

AlexWaygood commented Sep 11, 2023 •

edited

Loading

iritkatriel commented Sep 15, 2023

pablogsal commented Sep 16, 2023 •

edited

Loading

sunmy2019 commented Sep 16, 2023

pablogsal commented Sep 16, 2023

pablogsal commented Oct 31, 2023

vstinner commented Nov 10, 2023

Exceptions slow in 3.11, depending on location #109181

Exceptions slow in 3.11, depending on location #109181

Comments

pochmann commented Sep 9, 2023 • edited by bedevere-app bot Loading

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

Linked PRs

sunmy2019 commented Sep 9, 2023

sunmy2019 commented Sep 9, 2023

nielstron commented Sep 11, 2023 • edited Loading

nielstron commented Sep 11, 2023

AlexWaygood commented Sep 11, 2023 • edited Loading

iritkatriel commented Sep 15, 2023

pablogsal commented Sep 16, 2023 • edited Loading

sunmy2019 commented Sep 16, 2023

pablogsal commented Sep 16, 2023

pablogsal commented Oct 31, 2023

vstinner commented Nov 10, 2023

pochmann commented Sep 9, 2023 •

edited by bedevere-app bot

Loading

nielstron commented Sep 11, 2023 •

edited

Loading

AlexWaygood commented Sep 11, 2023 •

edited

Loading

pablogsal commented Sep 16, 2023 •

edited

Loading