
Performance: prevent blowup in normalization #105


Merged
1 commit merged into bastikr:master on May 2, 2022

Conversation

olliemath (Contributor) commented on Jul 7, 2021

This addresses the case where we have an expression which is small when normalized, but in its current form contains large dual expressions. The blowup happens at the _rdistributive step, so the idea is to normalize subexpressions as much as possible before that. Closes #106
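
To make the strategy concrete, here is a minimal, self-contained sketch of the idea (not the actual patch): reduce every subexpression first, so the distributive step only ever combines operands that are already small, and absorb redundant clauses as you go. The tuple-based AST and the to_cnf helper are illustrative inventions, not boolean.py code, and negation is left out for brevity.

    from itertools import product

    def to_cnf(expr):
        # expr is ('sym', name), ('and', [subexprs]) or ('or', [subexprs]).
        # CNF is represented as a list of clauses, each a frozenset of literals.
        if expr[0] == 'sym':
            return [frozenset([expr[1]])]
        # Normalize every subexpression first, so the distribution below only
        # ever sees operands that are already reduced.
        children = [to_cnf(sub) for sub in expr[1]]
        if expr[0] == 'and':
            clauses = [clause for child in children for clause in child]
        else:  # 'or': distribute over the already-normalized children
            clauses = [frozenset().union(*combo) for combo in product(*children)]
        # Drop duplicate and absorbed clauses so intermediate results stay small.
        kept = []
        for clause in sorted(set(clauses), key=len):
            if not any(k <= clause for k in kept):
                kept.append(clause)
        return kept

    expr = ('and', [('sym', 'a'),
                    ('or', [('and', [('sym', 'b'), ('sym', 'c')]), ('sym', 'b')])])
    print(to_cnf(expr))  # two unit clauses {'a'} and {'b'}, i.e. a & b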

The current test case takes 0.4s with the new code, but took 500s with the old code. We have many such cases in our codebase, and the blow-up is exponential in the size of the original expression. We have tested this change against all formulas in our codebase and performance is now reasonable for all of them.

The original expression that drew our attention to this issue (below) now takes 30s (~3s under PyPy) with the new code; with the old code it never returns (in fact I run out of memory):

        a & (
            (b & c & d & e & f & g)
            | (c & d & f & g & i & j)
            | (c & f & g & h & i & j)
            | (c & f & g & i & j & k)
            | (c & d & f & g & i & n & p)
            | (c & e & f & g & i & m & x)
            | (c & e & f & g & l & o & w)
            | (c & e & f & g & q & s & t)
            | (c & f & g & i & m & n & r)
            | (c & f & g & l & n & o & r)
            | (c & d & f & g & i & l & o & u)
            | (c & e & f & g & i & p & y & ~v)
            | (c & f & g & i & j & z & ~(c & f & g & i & j & k))
            | (
                c & f & g & t
                & ~(b & c & d & e & f & g)
                & ~(c & d & f & g & i & j)
                & ~(c & f & g & h & i & j)
                & ~(c & f & g & i & j & k)
                & ~(c & d & f & g & i & n & p)
                & ~(c & e & f & g & i & m & x)
                & ~(c & e & f & g & l & o & w)
                & ~(c & e & f & g & q & s & t)
                & ~(c & f & g & i & m & n & r)
                & ~(c & f & g & l & n & o & r)
                & ~(c & d & f & g & i & l & o & u)
                & ~(c & e & f & g & i & p & y & ~v)
                & ~(c & f & g & i & j & z & ~(c & f & g & i & j & k))
            )
            | (
                c & f & g & ~t
                & ~(b & c & d & e & f & g)
                & ~(c & d & f & g & i & j)
                & ~(c & f & g & h & i & j)
                & ~(c & f & g & i & j & k)
                & ~(c & d & f & g & i & n & p)
                & ~(c & e & f & g & i & m & x)
                & ~(c & e & f & g & l & o & w)
                & ~(c & e & f & g & q & s & t)
                & ~(c & f & g & i & m & n & r)
                & ~(c & f & g & l & n & o & r)
                & ~(c & d & f & g & i & l & o & u)
                & ~(c & e & f & g & i & p & y & ~v)
                & ~(c & f & g & i & j & z & ~(c & f & g & i & j & k))
            )
        )
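
For anyone who wants to reproduce the timing, a rough sketch along these lines should do it. It assumes boolean.py's documented BooleanAlgebra.parse() entry point and treats algebra.normalize(expr, algebra.AND) as the CNF call; the exact normalization entry point may differ between boolean.py versions, and the formula is shortened here (substitute the full expression above).

    import time
    import boolean

    algebra = boolean.BooleanAlgebra()
    # Shortened stand-in; paste the full formula from above to reproduce the issue.
    expr = algebra.parse("a & ((b & c & d & e & f & g) | (c & d & f & g & i & j))")

    t0 = time.time()
    cnf = algebra.normalize(expr, algebra.AND)  # assumed CNF entry point
    t1 = time.time()
    print(cnf, t1 - t0)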

This addresses the case where we have an expression which is small
when normalized, but in its current form contains a large dual
expression.
assert str(cnf) == "a&c&f&g"
# Locally, this test takes 0.4s, previously it was 500s.
# We allow 30s because of the wide range of possible CPUs.
assert t1 - t0 < 30, "Normalizing took too long"


We shouldn't use runtime in an assertion as a general rule... it's brittle as you already indicated in the comment. You may want to explore using mocks and counting function calls. expr.simplify looks like a good candidate.
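
A deterministic version of that suggestion could count calls instead of seconds, roughly like the sketch below: wrap the simplify method of the parsed expression's class with a small counter and assert on the call count. Which method to wrap and what bound to use are assumptions; unittest.mock's patching would work just as well as the manual swap shown here.

    import functools
    import boolean

    def count_calls(func):
        # Wrap func and record how many times it runs.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wrapper.calls += 1
            return func(*args, **kwargs)
        wrapper.calls = 0
        return wrapper

    algebra = boolean.BooleanAlgebra()
    expr = algebra.parse("a & ((b & c) | b)")

    cls = type(expr)                      # the class whose simplify() we count
    original = cls.simplify
    cls.simplify = count_calls(original)  # temporary, test-local patch
    try:
        expr.simplify()
        calls = cls.simplify.calls
    finally:
        cls.simplify = original           # always restore the real method

    print(calls)
    # A regression test can then assert that `calls` stays below a fixed bound
    # instead of comparing wall-clock time across machines.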

Collaborator


@prodhype @olliemath should I go ahead and merge though?


@pombredanne It should be rewritten so that it doesn't depend on runtime before being merged.

Collaborator


@prodhype there you go. Thanks!

Collaborator


See 5f93c8b

Contributor Author


@pombredanne sorry it took me a while to get back to you, I've been a bit too busy for much open source recently. Thanks for improving the test though!

pombredanne added a commit that referenced this pull request May 2, 2022
Reference: #105 (comment)
Reported-by: @prodhype
Signed-off-by: Philippe Ombredanne <[email protected]>
pombredanne merged commit 1305cbf into bastikr:master on May 2, 2022
@pombredanne (Collaborator)

I am merging this in #107 ... so I am closing this here

@pombredanne (Collaborator)

Thanks!

Linked issue: Exponential blowup when normalizing