[mypyc] Use optimized implementation for builtins.sum #10268

sinback · 2021-03-30T21:14:20Z

Description

In mypyc, add an optimized implementation for the builtins iterator idiom sum().

For some reason, when sum() is called over a generator, the builtins implementation of sum() is significantly slower than a naive implementation that increments the return value every time something in the generator evaluates to True. This PR forces sum() to be computed with such a naive implementation, which benchmarking has shown speeds up the execution of sum() over integers by about 2-5x.
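As a rough illustration of the "naive implementation" described here, the following pure-Python sketch shows the loop shape. The name `naive_sum` is made up for this example and is not part of mypyc; the real specialization emits IR rather than Python source, and any speedup would only show up once the code is compiled with mypyc.

```python
from typing import Iterable


def naive_sum(values: Iterable[int], start: int = 0) -> int:
    """Accumulate values one at a time, the way an inlined loop would."""
    total = start
    for value in values:
        total += value
    return total


data = [0, 1, 0, 2, 0]
# Both forms count how many elements of `data` are equal to zero.
builtin_result = sum(x == 0 for x in data)
naive_result = naive_sum(x == 0 for x in data)
```

Both calls should agree on the count; only the compiled form is expected to differ in speed.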

There is some support for the start argument, but only if 'start' is a literal expression (has a .value attribute). The current implementation doesn't work with arbitrary values for start, because I couldn't figure out how to fully evaluate an arbitrary Expression into something that the retval Register can be initialized to. So for example, these cases will not get optimized:

a = 1
sum((x == 0 for x in [0]), a)        # won't get evaluated because a is a NameExpr
sum((x == 0 for x in [0]), 0 + j)    # won't get evaluated because 0 + j is an OpExpr
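To make the literal versus non-literal distinction concrete, here is a small sketch that uses the standard library `ast` module as a stand-in for mypy's own node classes (IntExpr, NameExpr, OpExpr). The helper `start_is_literal` is hypothetical and only mirrors the idea of the check; it is not the code in specialize.py.

```python
import ast


def start_is_literal(call_source: str) -> bool:
    """Return True if the second positional argument of the call is a literal constant."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call) or len(call.args) < 2:
        return False
    return isinstance(call.args[1], ast.Constant)


# A literal start value is the only case this PR's first version handles.
literal_start = start_is_literal("sum((x == 0 for x in [0]), 1)")
# A bare name and an arithmetic expression are not literals.
name_start = start_is_literal("sum((x == 0 for x in [0]), a)")
op_start = start_is_literal("sum((x == 0 for x in [0]), 0 + j)")
```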

I played around with the above two cases and got something that works OK: you evaluate the expression given for 'start' by calling builder.accept() on it until you finally get down to a literal value (something that has '.value'). But it doesn't work for arbitrary expressions, and I realized that if mypyc does stuff like that it should probably do it in an expression-substituter somewhere else. idk

This does a good chunk of mypyc/mypyc#796 but doesn't finish it because it only implements sum().

Test Plan

I added a bunch of test cases for different ways that the optimized implementation can be used. I think there are probably enough. Happy to add more.

Works on mypyc/mypyc#796.

Cover cases like sum(x == 0 for x in <iterable>) instead of just cases like sum(function(x) for x in <iterable>).

The checks that the GeneratorExpr had an evaluatable left-hand-side were unnecessary (I think).

msullivan

Thanks for submitting this PR! This is a great start.

msullivan · 2021-03-30T21:49:00Z

mypyc/irbuild/specialize.py

+        call_expr = builder.accept(gen.left_expr)
+        builder.add_bool_branch(call_expr, true_block, false_block)
+        builder.activate_block(true_block)
+        builder.assign(retval, builder.binary_op(retval, Integer(1), '+', expr.line), -1)


Adding Integer(1) seems like it is not what you want? Probably you want call_expr

Oh, oops, I looked closer at this:
Counting up the number of elements that match something in a list was one of our motivating use cases, but we need to work when the value is anything, not just a boolean.

I think you can ditch all the logic that branches on the call_expr and just accumulate the result of + into retval.
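A plain-Python sketch of the difference between the two strategies, with made-up function names (`count_truthy`, `accumulate`) that do not appear in the PR: the branching loop only counts truthy elements, while the accumulating loop adds each value.

```python
from typing import Iterable


def count_truthy(values: Iterable[int]) -> int:
    """Branching version: adds 1 for every truthy element."""
    total = 0
    for value in values:
        if value:
            total += 1
    return total


def accumulate(values: Iterable[int]) -> int:
    """Accumulating version: adds each element itself, as sum() does."""
    total = 0
    for value in values:
        total = total + value
    return total


squares = [x * x for x in [1, 2, 3]]
branch_result = count_truthy(squares)    # counts elements, not their values
accumulate_result = accumulate(squares)  # adds 1 + 4 + 9
```

For generators of booleans the two agree, which is why the branching version passed the original tests.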

Oh right, because booleans. That makes lots of sense lol

Oh, hold up, if I do this (and you are right, that is most definitely what sum is actually for, lol - not just for boolean comparisons), then stuff breaks a la what I saw in the comment below about using builder.accept(start_expr) regardless of whether stuff is literals or not:

def fn(): return 1
print(sum((x == 0 for x in [fn(), fn()])))

leads to

building 'example' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/sinback/repos/mypy/mypyc/lib-rt -I/home/sinback/repos/mypy/env/include -I/home/sinback/.pyenv/versions/3.9.2/include/python3.9 -c build/__native.c -o build/temp.linux-x86_64-3.9/build/__native.o -O3 -Werror -Wno-unused-function -Wno-unused-label -Wno-unreachable-code -Wno-unused-variable -Wno-unused-command-line-argument -Wno-unknown-warning-option -Wno-unused-but-set-variable
build/__native.c: In function ‘CPyDef___top_level__’:
build/__native.c:172:14: error: assignment makes integer from pointer without a cast [-Werror=int-conversion]
     cpy_r_r5 = cpy_r_r21;
              ^
build/__native.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-unknown-warning-option’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-unused-command-line-argument’ [-Werror]
cc1: all warnings being treated as errors
error: command '/usr/bin/gcc' failed with exit code 1

at compile time. that is too bad

msullivan · 2021-03-30T21:51:43Z

mypyc/irbuild/specialize.py

+            target_type = float_rprimitive
+        else:
+            target_type = object_rprimitive
+        # give up if start_expr is not a literal


You ought to be able to call builder.accept on start_expr and get the right thing back regardless of whether it has a 'value' attribute. What was going wrong when you tried that?

Yeah, I was hoping for that too, but couldn't get it to work. It seemed like stuff always went wrong for me when dealing with non-literals for startval that way. Here are some examples. (For all of them, I just commented out the part that makes us give up, so we try to initialize the retval to just builder.accept(start_expr) without requiring anything of start_expr.)

a = 1
print(sum((x == 0 for x in [0]), a))

makes builder.accept(), when called on a, resolve it from a NameExpr to an Unbox, which seems probably right, but then gcc fails when it actually runs:

building 'example' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/sinback/repos/mypy/mypyc/lib-rt -I/home/sinback/repos/mypy/env/include -I/home/sinback/.pyenv/versions/3.9.2/include/python3.9 -c build/__native.c -o build/temp.linux-x86_64-3.9/build/__native.o -O3 -Werror -Wno-unused-function -Wno-unused-label -Wno-unreachable-code -Wno-unused-variable -Wno-unused-command-line-argument -Wno-unknown-warning-option -Wno-unused-but-set-variable
build/__native.c: In function ‘CPyDef___top_level__’:
build/__native.c:165:15: error: assignment makes pointer from integer without a cast [-Werror=int-conversion]
     cpy_r_r18 = cpy_r_r17;
               ^
build/__native.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-unknown-warning-option’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-unused-command-line-argument’ [-Werror]
cc1: all warnings being treated as errors
error: command '/usr/bin/gcc' failed with exit code 1

and it also doesn't really work for calls:

def fn() -> int: return 1
print(sum((x == 0 for x in [0]), fn()))

evaluates a CallExpr into a Call, but then the same unsafe pointer operation problem happens.

and doesn't work for arithmetic operations either:

print(sum((x == 0 for x in [0]), 0 + 1))

tries to turn an OpExpr into a CallC and then has the same unsafe pointer operation problem.

I dunno what was going on, and figuring it out seemed like a can of worms.

I did something yesterday which involved a while loop until there was a .value attribute present, but I can't figure out exactly what right now, and it would never finish for CallCs I think

msullivan · 2021-03-30T21:56:45Z

mypyc/irbuild/specialize.py

+        if not expr.arg_kinds[1] in (ARG_POS, ARG_NAMED):
+            return None
+        start_expr = expr.args[1]
+        if isinstance(start_expr, IntExpr):


Instead of figuring out the target_type by casing on the expression, we can find it in a more general way with builder.node_type(expr). (This will also work in the case where there is no start_expr)

this is a good function to be pointed at, thanks!

I implemented & pushed this change - note that for the compile errors I am griping about in the other comments, this change seems to replace those errors with mere segfaults at runtime. idk what's up there yet either

msullivan · 2021-03-30T22:00:59Z

mypyc/test-data/run-misc.test

+# a is a NameExpr and not supported
+a = 1
+print(sum((x == 0 for x in [0, 1]), a))
+


Could you add some test cases where the sum is of some mathematical function of a list, and some where the values are floats? maybe something like (x**2 for x in [some float list])
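A sketch of what such test cases might look like, written as plain test functions. The function names and input lists are illustrative assumptions, not the tests that ended up in run-misc.test.

```python
def test_sum_of_squares_int() -> None:
    # Sum of a mathematical function of an integer list.
    assert sum(x ** 2 for x in [1, 2, 3]) == 14


def test_sum_of_squares_float() -> None:
    # Same shape over floats; the result should stay a float.
    result = sum(x ** 2 for x in [1.0, 2.0, 3.0])
    assert isinstance(result, float)
    assert result == 14.0


test_sum_of_squares_int()
test_sum_of_squares_float()
```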

To avoid a strange issue where one-element sums that should be 1 were instead True, initialize the sum return value to 0.0 when its intended type is not specified as integer. This still leads to sums of boolean expressions returning integers (as in CPython).

sinback · 2021-04-01T00:04:03Z

haha, this stuff is hard, my first implementation was very silly. All your suggestions ultimately helped me make the solution way less derpy though. I think this way looks basically as right as I can figure out how to make it right now.

I updated the code so that when it's time to initialize the sum, if no start argument was given, you just evaluate a dummy start expression anyway, which returns 0. In the 'object' case I found I had to initialize the sum to 0.0 instead of 0, to prevent sums which should be 1 from being True? I beat my head against it for an hour but I couldn't figure out how to avoid that another way, I tried using a bunch of coercions and stuff but it didn't work.

msullivan · 2021-04-01T01:21:40Z

mypyc/irbuild/specialize.py

+        else:
+            # IntExpr feels better here, but then if the return value of sum was untypehinted and
+            # the result should be 1, it seems to be True instead, unless we initialize it this
+            # way?


I agree that this seems like it ought to be an IntExpr, and I think that it is actually important for cases like:

l: List[Any] = [1, 2, 3]
result = sum(x*x for x in l)

That will use object_rprimitive as the type, and we want to produce 14 as the result and not 14.0.
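The concern can be reproduced in ordinary CPython, which is the behavior the compiled code is meant to match. This sketch only shows how the type of the initial value propagates into the result; the variable names are illustrative.

```python
from typing import Any, List

values: List[Any] = [1, 2, 3]

# With an integer initial value the result stays an int.
int_start = sum((x * x for x in values), 0)
# With a float initial value every partial sum becomes a float.
float_start = sum((x * x for x in values), 0.0)

# A sum over booleans starts from the int 0, so it yields an int, not a bool.
bool_sum = sum(x == 0 for x in [0])
```

Initializing the accumulator to 0.0 therefore avoids a True result but changes integer sums into floats, which is the trade-off being questioned here.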

What was an example of a case where a True was getting produced?

I think that probably we want to wrap the compilation of start_expr in a builder.coerce before assigning it to retval.

good call. all fixed now

msullivan · 2021-04-01T19:06:14Z

I'll do a (final, probably!) round of comments this afternoon, but I took a quick look at the test failures and the main issue is that apparently the start argument to sum couldn't be specified by name until Python 3.8, so tests will need to pass it without the name in order to work on earlier versions. Also looks like there are some lint errors from flake8.

msullivan

Sorry I didn't get back to you last week like I said I was going to.

Everything looks pretty good now and just needs a couple minor bits:

I've asked for one more test case
Fixing the CI failures. It looks like it is the start parameter issue and some flake8 lint warnings

msullivan · 2021-04-06T05:51:30Z

mypyc/test-data/run-misc.test

+print(sum((x == 0 for x in [0, 1]), 1j))
+print(sum(c == 'd' for c in 'abcdd'))
+print(sum((c == 'd' for c in 'abcdd'), 1))
+print(sum((c == 'd' for c in 'abcdd'), start=1))


We need to drop the start, since it isn't supported on all the pythons we need to support. (We could add a test to run-python38.test, but it hardly seems worth it for this.)
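For reference, a small sketch of the version difference: passing start positionally works everywhere, while the keyword form needs a guard on Python 3.8 or later. The helper name `count_d` is made up for this example.

```python
import sys


def count_d(text: str) -> int:
    # Positional start is accepted on every Python 3 version.
    return sum((c == 'd' for c in text), 1)


positional = count_d('abcdd')

# The keyword form is only accepted on Python 3.8 and later.
if sys.version_info >= (3, 8):
    keyword = sum((c == 'd' for c in 'abcdd'), start=1)
else:
    keyword = positional
```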

msullivan · 2021-04-06T05:52:04Z

mypyc/test-data/run-misc.test

+print(sum(i + j == 0 for i, j in zip([0, 0, 0], [0, 1, 0])))
+
+print('test misc cases')
+print(sum((x == 0 for x in [0, 1]), 0 + 1j))


Could you also add a test that sums up complex numbers but doesn't have a start argument? Just to test the object flow without a start.
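One possible shape for that test, as a sketch: the function name and the input values are assumptions, chosen so that the accumulator starts from the default integer 0 and ends up complex.

```python
def test_sum_complex_no_start() -> None:
    # No start argument: the accumulator begins at 0 and is promoted to complex.
    result = sum(z for z in [1j, 2 + 1j, 3])
    assert isinstance(result, complex)
    assert result == 5 + 2j


test_sum_complex_no_start()
```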

sinback · 2021-04-06T12:49:44Z

Thanks Michael for catching all those little things! I'll get to them.

Last week I also realized the sum implementation wasn't actually being used by the run-misc.test file. I'm not sure why not but I'm going to figure out what's up there as well.

I have a bunch of interviews this week but will probably get to everything by the end of the weekend

The start kwarg was only added in Python 3.8, so it should not be tested as part of the normal testing suite.

JukkaL · 2021-04-06T13:32:12Z

mypyc/test-data/run-misc.test

+
+[case testSum]
+[file driver.py]
+print('test sums of numbers')


These don't actually test anything, since the driver.py file is not compiled. (This is a common mistake so we should try to improve the developer experience here.)

The preferred way is to use test_ functions in the main test case and not include driver.py at all. For example:

[case testFoo]
def test_whatever() -> None:
    assert 1 + 2 == 3

def test_more() -> None:
    assert 'x' * 2 == 'xx'

Thanks for the tip!

driver.py is not actually compiled, so move all the tests into the main test file.

sinback · 2021-04-06T14:05:38Z

Actually I think this PR is good to go already? despite me saying I'd get it done by the end of the weekend just above.

Is it poor form to rebase and force-push the branch after opening a PR? there are a lot of little fixup commits in this PR by now and it seems like not all this history would need to make its way into the main branch. Or do the maintainers like to handle the git history themselves at merge-time?

msullivan · 2021-04-06T16:53:37Z

Oh, oops; good catch on the test, Jukka >_>.

The preferred workflow for this project is to push new commits instead of rebasing them, since it makes it easier for reviewers to look at just what is changed. We'll squash it down to one commit when we merge it.

msullivan · 2021-04-06T17:30:25Z

mypyc/test-data/fixtures/ir.py

@@ -238,6 +238,7 @@ class StopIteration(Exception):

 def any(i: Iterable[T]) -> bool: pass
 def all(i: Iterable[T]) -> bool: pass
+def sum(i: Iterable[T]) -> int: pass


Ah, this will need to take a start argument. It looks like that is the cause of a bunch of the test failures.

97littleleaf11

LGTM! only a few suggestions on tests.

97littleleaf11 · 2021-07-20T19:39:38Z

mypyc/test-data/run-misc.test

+
+def test_sum_multi() -> None:
+    assert sum(i + j == 0 for i, j in zip([0, 0, 0], [0, 1, 0]))
+


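The assertion should compare against the expected value: assert sum(i + j == 0 for i, j in zip([0, 0, 0], [0, 1, 0])) == 2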

97littleleaf11 · 2021-07-20T20:06:07Z

mypyc/test-data/run-misc.test

@@ -875,6 +875,56 @@ assert call_all(mixed_110) == 1
 assert call_any_nested([[1, 1, 1], [1, 1], []]) == 1
 assert call_any_nested([[1, 1, 1], [0, 1], []]) == 0

+
+[case testSum]
+from typing import Any, List


Adding [typing fixtures/typing-full.pyi] here is the simplest way to solve test errors.

97littleleaf11 · 2021-08-11T10:39:54Z

@sinback Hi! Any progress on this PR? IMHO it's almost done and it really helps the performance.

sinback · 2021-11-10T16:49:05Z

excited to see this got merged! :) sorry for the radio silence after the change requests, I worked on this in between jobs and was in the middle of trying to get up to speed on the new one around when I dropped off the map. appreciate you carrying it over the finish line @97littleleaf11

97littleleaf11 · 2021-11-10T17:15:34Z

:) You don't need to say sorry, it was already a good pr before I picked it up, which not only covers many test cases but also improves the perf a lot.

Partially fixes mypyc/mypyc#796

This PR forces the computation of sum() to use an optimized implementation, which has been shown through benchmarking to speed up the execution of sum() over integers by about 2-5x.

There is some support for the start argument, but only if 'start' is a literal expression (has a .value attribute). The current implementation doesn't work with arbitrary values for start, because I couldn't figure out how to fully evaluate an arbitrary Expression into something that the retval Register can be initialized to. So for example, these cases will not get optimized:

a = 1
sum((x == 0 for x in [0]), a)        # won't get evaluated because a is a NameExpr
sum((x == 0 for x in [0]), 0 + j)    # won't get evaluated because 0 + j is an OpExpr

Co-authored-by: 97littleleaf11 <[email protected]>

sinback added 9 commits March 29, 2021 18:36

[mypyc] add test for builtins.sum

da0166f

[mypyc] start optimized implementation for builtins.sum

624b26e

[mypyc] add test case for IR for builtins.sum over ints

2d7424f

[mypyc] extend specialized sum implementation

2672bfe

[mypyc] support integer 'start' arg to sum()

9d6dfac

[mypyc] extend coverage of test for sum()

b340a55

[mypyc] support all literal expressions for 'start' arg to sum()

8e09f5f

[mypyc] simplify logic in sum implementation

8983938

[mypyc] update test for sum

0f36a74

sinback changed the title ~~Sinback/mypyc 796 sum impl~~ mypyc: use optimized implementation for builtins.sum Mar 30, 2021

msullivan self-requested a review March 30, 2021 21:50

msullivan reviewed Mar 30, 2021

sinback added 3 commits March 31, 2021 09:47

[mypyc] more general / correct detection of sum retval storage type

9e774ae

[mypyc] add more tests for sum

1244b35

msullivan reviewed Apr 1, 2021

[mypyc] cleaner initialization for sum total

9b9e804

msullivan reviewed Apr 6, 2021

[mypyc] remove start= argument from sum tests

b509447

JukkaL reviewed Apr 6, 2021

sinback added 3 commits April 6, 2021 09:52

[mypyc] make sum tests actually run

1824715

[mypyc] lint

6119d17

[mypyc] correct typo

e0e2a68

msullivan reviewed Apr 6, 2021

97littleleaf11 suggested changes Jul 20, 2021

97littleleaf11 added 3 commits November 10, 2021 10:45

Merge from master

0945e54

Fix tests

b09bca0

Minor refactor

a1c65cf

97littleleaf11 self-requested a review November 10, 2021 03:23

97littleleaf11 changed the title ~~mypyc: use optimized implementation for builtins.sum~~ [mypyc] Use optimized implementation for builtins.sum Nov 10, 2021

97littleleaf11 merged commit 47b22c5 into python:master Nov 10, 2021
