Performance: prevent blowup in normalization #105
assert str(cnf) == "a&c&f&g"
# Locally, this test takes 0.4s, previously it was 500s.
# We allow 30s because of the wide range of possible CPUs.
assert t1 - t0 < 30, "Normalizing took too long"
We shouldn't use runtime in an assertion as a general rule... it's brittle, as you already indicated in the comment. You may want to explore using mocks and counting function calls; expr.simplify looks like a good candidate.
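The call-counting approach suggested above can be sketched as follows. This is a minimal, illustrative example: the Expr class and normalize function below are hypothetical stand-ins, not boolean.py's actual classes, and only show how unittest.mock can make the test deterministic instead of timing-dependent.

```python
from unittest.mock import patch

# Hypothetical stand-in for an expression class with a simplify()
# method (boolean.py's real API may differ); this only illustrates
# the call-counting pattern suggested in the review.
class Expr:
    def __init__(self, n):
        self.n = n

    def simplify(self):
        # Pretend normalization peels off one subterm per call.
        return self if self.n <= 1 else Expr(self.n - 1).simplify()

def normalize(expr):
    return expr.simplify()

# Spy on simplify() and assert on call counts, not wall-clock time.
# autospec=True makes instance calls pass self through to the spy;
# side_effect delegates to the original method so behavior is kept.
with patch.object(Expr, "simplify", autospec=True,
                  side_effect=Expr.simplify) as spy:
    normalize(Expr(5))

assert spy.call_count == 5  # deterministic on any CPU
```

A regression in normalization complexity would then show up as an increased call count rather than a flaky timing failure.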
@prodhype @olliemath should I go ahead and merge though?
@pombredanne It should be rewritten so that it doesn't depend on runtime before being merged.
@prodhype there you go. Thanks!
See 5f93c8b
@pombredanne sorry it took me a while to get back to you, I've been a bit too busy for much open source recently. Thanks for improving the test though!
Reference: #105 (comment) Reported-by: @prodhype Signed-off-by: Philippe Ombredanne <[email protected]>
I am merging this in #107 ... so I am closing this here
Thanks!
This addresses the case where we have an expression which is small when normalized, but in its current form contains large dual expressions. The blowup happens at the _rdistributive step, so the idea is to normalize subexpressions as much as possible before that. Closes #106

The current test case takes 0.4s with the new code, but took 500s with the old code. We have many of these cases in our codebase and the blow-up is exponential in the size of the original expression. We have tested, and this now has reasonable performance with all formulas from our codebase.
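The exponential cost of the distribution step can be made concrete with a small stdlib-only sketch (the function name and representation here are illustrative, not boolean.py's internals): naively distributing an OR of n two-literal AND terms into CNF builds one clause per way of picking a literal from every term, i.e. 2**n clauses.

```python
from itertools import product

def distribute_to_cnf(dnf_terms):
    """Naively distribute an OR-of-ANDs into an AND-of-ORs.

    dnf_terms: list of AND terms, each a tuple of literal names.
    Returns one CNF clause per way of picking a literal from
    every term, so n two-literal terms expand to 2**n clauses.
    """
    return [frozenset(choice) for choice in product(*dnf_terms)]

terms = [("a%d" % i, "b%d" % i) for i in range(10)]
clauses = distribute_to_cnf(terms)
print(len(clauses))  # 1024 clauses from only 10 input terms
```

Normalizing subexpressions first shrinks the terms before distribution runs, which is why the fix keeps both the intermediate and final results small.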
The original expression which drew our attention to this issue (below) now takes 30s (~3s under pypy) with the new code and never returns (in fact I ran out of memory) with the old code: