-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
[mypyc] Use optimized implementation for builtins.sum #10268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mypyc] Use optimized implementation for builtins.sum #10268
Conversation
Cover cases like sum(x == 0 for x in <iterable>) instead of just cases like sum(function(x) for x in <iterable>).
The checks that the GeneratorExpr had an evaluatable left-hand-side were unnecessary (I think).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this PR! This is a great start.
mypyc/irbuild/specialize.py
Outdated
call_expr = builder.accept(gen.left_expr) | ||
builder.add_bool_branch(call_expr, true_block, false_block) | ||
builder.activate_block(true_block) | ||
builder.assign(retval, builder.binary_op(retval, Integer(1), '+', expr.line), -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding Integer(1)
seems like it is not what you want? Probably you want call_expr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, oops, I looked closer at this:
Counting up the number of elements that match something in a list was one of our motivating use cases, but we need to work when the value is anything, not just a boolean.
I think you can ditch all the logic that branches on the call_expr
and just accumulate the result of +
into retval
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, because booleans. That makes lots of sense lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hold up, if I do this (and you are right, that is most definitely what sum
is actually for, lol - not just for boolean comparisons), then stuff breaks a la what I saw in the comment below about using builder.accept(start_expr) regardless of whether stuff is literals or not:
def fn():
return 1
print(sum((x == 0 for x in [fn(), fn()])))
leads to
building 'example' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/sinback/repos/mypy/mypyc/lib-rt -I/home/sinback/repos/mypy/env/include -I/home/sinback/.pyenv/versions/3.9.2/include/python3.9 -c build/__native.c -o build/temp.linux-x86_64-3.9/build/__native.o -O3 -Werror -Wno-unused-function -Wno-unused-label -Wno-unreachable-code -Wno-unused-variable -Wno-unused-command-line-argument -Wno-unknown-warning-option -Wno-unused-but-set-variable
build/__native.c: In function ‘CPyDef___top_level__’:
build/__native.c:172:14: error: assignment makes integer from pointer without a cast [-Werror=int-conversion]
cpy_r_r5 = cpy_r_r21;
^
build/__native.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-unknown-warning-option’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-unused-command-line-argument’ [-Werror]
cc1: all warnings being treated as errors
error: command '/usr/bin/gcc' failed with exit code 1
at compile time. that is too bad
mypyc/irbuild/specialize.py
Outdated
target_type = float_rprimitive | ||
else: | ||
target_type = object_rprimitive | ||
# give up if start_expr is not a literal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You ought to be able to call builder.accept
on start_expr
and get the right thing back regardless of whether it has a 'value' attribute. What was going wrong when you tried that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was hoping for that too, but couldn't get it to work. It seemed like stuff always went wrong for me when dealing with non-literals for startval that way. Here's some examples. (For all I just commented out the part that makes us give up, we try to initialize the retval to just builder.accept(start_expr) without requiring anything of start_expr)
a = 1
print(sum((x == 0 for x in [0]), a))
makes builder.accept(), when called on a, resolve it from a NameExpr to an Unbox, which seems probably right, but then gcc fails when it actually runs:
building 'example' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/sinback/repos/mypy/mypyc/lib-rt -I/home/sinback/repos/mypy/env/include -I/home/sinback/.pyenv/versions/3.9.2/include/python3.9 -c build/__native.c -o build/temp.linux-x86_64-3.9/build/__native.o -O3 -Werror -Wno-unused-function -Wno-unused-label -Wno-unreachable-code -Wno-unused-variable -Wno-unused-command-line-argument -Wno-unknown-warning-option -Wno-unused-but-set-variable
build/__native.c: In function ‘CPyDef___top_level__’:
build/__native.c:165:15: error: assignment makes pointer from integer without a cast [-Werror=int-conversion]
cpy_r_r18 = cpy_r_r17;
^
build/__native.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-unknown-warning-option’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-unused-command-line-argument’ [-Werror]
cc1: all warnings being treated as errors
error: command '/usr/bin/gcc' failed with exit code 1
and it also doesn't really work for calls:
def fn() -> int:
return 1
print(sum((x == 0 for x in [0]), fn()))
evaluates a CallExpr into a Call, but then the same unsafe pointer operation problem happens.
and doesn't work for arithmetic operations either:
print(sum((x == 0 for x in [0]), 0 + 1))
tries to turn an OpExpr into a CallC and then has the same unsafe pointer operation problem.
I dunno what was going on and figuring out seemed like a can of worms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did something yesterday which involved a while loop until there was a .value attribute present, but I can't figure out exactly what right now, and it would never finish for CallCs I think
mypyc/irbuild/specialize.py
Outdated
if not expr.arg_kinds[1] in (ARG_POS, ARG_NAMED): | ||
return None | ||
start_expr = expr.args[1] | ||
if isinstance(start_expr, IntExpr): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of figuring out the target_type
by casing on the expression, we can find it in a more general way with builder.node_type(expr)
. (This will also work in the case where there is no start_expr
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good function to be pointed at, thanks!
I implemented & pushed this change - note that for the compile errors I am griping about in the other comments, this change seems to replace those errors with mere segfaults at runtime. idk what's up there yet either
mypyc/test-data/run-misc.test
Outdated
# a is a NameExpr and not supported | ||
a = 1 | ||
print(sum((x == 0 for x in [0, 1]), a)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some test cases where the sum is of some mathematical function of a list, and some where the values are floats? maybe something like (x**2 for x in [some float list])
To avoid a strange issue where one-element sums that should be 1 were instead True, initialize the sum return value to 0.0 when its intended type is not specified as integer. This still leads to sums of boolean expressions returning integers (as in CPython).
haha, this stuff is hard, my first implementation was very silly. All your suggestions ultimately helped me make the solution way less derpy though. I think this way looks basically as right as I can figure out how to make it right now. I updated the code so that when it's time to initialize the sum, if no start argument was given, you just evaluate a dummy start expression anyway, which returns 0. In the 'object' case I found I had to initialize the sum to 0.0 instead of 0, to prevent sums which should be 1 from being True? I beat my head against it for an hour but I couldn't figure out how to avoid that another way, I tried using a bunch of coercions and stuff but it didn't work. |
mypyc/irbuild/specialize.py
Outdated
else: | ||
# IntExpr feels better here, but then if the return value of sum was untypehinted and | ||
# the result should be 1, it seems to be True instead, unless we initialize it this | ||
# way? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this seems like it ought to be an IntExpr
, and I think that it is actually important for cases like:
l: List[Any] = [1, 2, 3]
result = sum(x*x for x in l)
That will use object_rprimitive as the type, and we want to produce 14
as the result and not 14.0
.
What was an example of a case where a True was getting produced?
I think that probably we want to wrap the compilation of start_expr in a builder.coerce
before assigning it to retval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call. all fixed now
I'll do a (final, probably!) round of comments this afternoon, but I took a quick look at the test failures and the main issue is that apparently the start argument to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't get back to you last week like I said I was going to.
Everything looks pretty good now and just needs a couple minor bits:
- I've asked for one more test case
- Fixing the CI failures. It looks like it is the
start
parameter issue and some flake8 lint warnings
mypyc/test-data/run-misc.test
Outdated
print(sum((x == 0 for x in [0, 1]), 1j)) | ||
print(sum(c == 'd' for c in 'abcdd')) | ||
print(sum((c == 'd' for c in 'abcdd'), 1)) | ||
print(sum((c == 'd' for c in 'abcdd'), start=1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to drop the start
, since it isn't supported on all the pythons we need to support. (We could add a test to run-python38.test
, but it hardly seems worth it for this.)
mypyc/test-data/run-misc.test
Outdated
print(sum(i + j == 0 for i, j in zip([0, 0, 0], [0, 1, 0]))) | ||
|
||
print('test misc cases') | ||
print(sum((x == 0 for x in [0, 1]), 0 + 1j)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add a test that sums up complex numbers but doesn't have a start argument? Just to test the object flow without a start.
Thanks Michael for catching all those little things! I'll get to them. Last week I also realized the sum implementation wasn't actually being used by the run-misc.test file. I'm not sure why not but I'm going to figure out what's up there as well. I have a bunch of interviews this week but will probably get to everything by the end of the weekend |
The start kwarg was only added in Python 3.8, so it should not be tested as part of the normal testing suite.
mypyc/test-data/run-misc.test
Outdated
|
||
[case testSum] | ||
[file driver.py] | ||
print('test sums of numbers') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't actually test anything, since the driver.py
file is not compiled. (This is a common mistake so we should try to improve the developer experience here.)
The preferred way is to use test_
functions in the main test case and not include driver.py
at all. For example:
[case testFoo]
def test_whatever() -> None:
assert 1 + 2 == 3
def test_more() -> None:
assert 'x' * 2 == 'xx'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip!
driver.py is not actually compiled, so move all the tests into the main test file.
Actually I think this PR is good to go already? despite me saying I'd get it done by the end of the weekend just above. Is it poor form to rebase and force-push the branch after opening a PR? there are a lot of little fixup commits in this PR by now and it seems like not all this history would need to make its way into the main branch. Or do the maintainers like to handle the git history themselves at merge-time? |
Oh, oops; good catch on the test, Jukka >_>. The preferred workflow for this project is to push new commits instead of rebasing them, since it makes it easier for reviewers to look at just what is changed. We'll squash it down to one commit when we merge it. |
@@ -238,6 +238,7 @@ class StopIteration(Exception): | |||
|
|||
def any(i: Iterable[T]) -> bool: pass | |||
def all(i: Iterable[T]) -> bool: pass | |||
def sum(i: Iterable[T]) -> int: pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this will need to take a start argument. It looks like that is the cause of a bunch of the test failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! only a few suggestions on tests.
mypyc/test-data/run-misc.test
Outdated
|
||
def test_sum_multi() -> None: | ||
assert sum(i + j == 0 for i, j in zip([0, 0, 0], [0, 1, 0])) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert sum == 2
@@ -875,6 +875,56 @@ assert call_all(mixed_110) == 1 | |||
assert call_any_nested([[1, 1, 1], [1, 1], []]) == 1 | |||
assert call_any_nested([[1, 1, 1], [0, 1], []]) == 0 | |||
|
|||
|
|||
[case testSum] | |||
from typing import Any, List |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding [typing fixtures/typing-full.pyi]
here is the simplest way to solve test errors.
@sinback Hi! Any progress on this PR? IMHO it's almost done and it really helps the performance. |
excited to see this got merged! :) sorry for the radio silence after the change requests, I worked on this in between jobs and was in the middle of trying to get up to speed on the new one around when I dropped off the map. appreciate you carrying it over the finish line @97littleleaf11 |
:) You don't need to say sorry, it was already a good pr before I picked it up, which not only covers many test cases but also improves the perf a lot. |
Partially fixes mypyc/mypyc#796 This PR forces the computation of sum() to be been shown through benchmarking to speed up the execution of sum() over integers by about 2-5x. There is some support for the start argument, but only for if 'start' is a literal expression (has a .value attribute). The current implementation doesn't work with arbitrary values for start, because I couldn't figure out how to get any Expression that could be given to be evaluated fully into something that you can initialize the retval Register to. So for example, these cases will not get optimized: a = 1 sum((x == 0 for x in [0]), a) # won't get evaluated because a is a NameExpr sum((x == 0 for x in [0], 0 + j) # won't get evaluated because 0 + j is an OpExpr Co-authored-by: 97littleleaf11 <[email protected]>
Description
In mypyc, add an optimized implementation for the builtins iterator idiom sum().
For some reason, in the case of calling
sum()
over a Generator, the builtins implementation ofsum()
is significantly slower than a naive implementation which increments return value every time something in the generator evaluates to True. This PR forces the computation ofsum()
to be been shown through benchmarking to speed up the execution of sum() over integers by about 2-5x.There is some support for the
start
argument, but only for if 'start' is a literal expression (has a .value attribute). The current implementation doesn't work with arbitrary values forstart
, because I couldn't figure out how to get any Expression that could be given to be evaluated fully into something that you can initialize the retval Register to. So for example, these cases will not get optimized:I did some playing around for the above two cases which works ok, where you evaluate the expressions by calling builder.accept() on the expression given for 'start' until you finally get down to a literal value (something that has '.value') but it doesn't work for arbitrary expressions and I realized that if mypyc does stuff like that it should probably do it in an expression-substituter somewhere else. idk
This does a good chunk of mypyc/mypyc#796 but doesn't finish it because it only implements
sum()
.Test Plan
I added a bunch of test cases for different ways that the optimized. I think there are probably enough. Happy to add more.