BUG: MvNormal logp recomputes Cholesky factorization #6717
Comments
Yeah, I think you're right. We always convert the Cholesky into a covariance internally, because that's what the logp is written in terms of. We had planned to introduce an optimization pass on the logp graph before we create the dlogp, and this is yet another good reason for it. A more immediate solution is to create new […]
I looked into adding a graph rewrite, but I couldn't see a way of proving that the L in `cholesky(L.dot(L.T))` is lower triangular. That said, being able to handle this in PyTensor would be nice, as I think it would open the door to some other optimisations around Cholesky factorisation, such as rank-n updates/downdates.
We can add a flag to the tag of the variable. More generally, if you are interested, I think there's a lot of low-hanging fruit at the rewrite level in PyTensor for this kind of operation. I think this is all we have so far: https://github.com/pymc-devs/pytensor/blob/main/pytensor/sandbox/linalg/ops.py
This adds safe rewrites to logp before the grad operator is applied. This is motivated by pymc-devs#6717, which allows expensive `cholesky(L.dot(L.T))` operations to be removed. If these remain in the logp graph when the grad is taken, the resulting dlogp graph contains unnecessary operations. However, this may also improve the stability and performance of grad logp in other situations.
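A minimal, self-contained sketch of that idea, assuming PyTensor's `rewrite_graph` helper; the toy logp below stands in for a real model's joint logp:

```python
# Sketch only: rewrite the logp graph before taking the gradient, so that
# redundant Ops (e.g. cholesky(L @ L.T)) are removed before grad() clones
# them into the dlogp graph.  Not PyMC's actual implementation.
import pytensor
import pytensor.tensor as pt
from pytensor.graph.rewriting.utils import rewrite_graph

x = pt.vector("x")
logp = -0.5 * pt.sum(x**2)  # stand-in for a model's joint logp

# Apply canonicalizing/stabilizing rewrites to the logp graph first...
logp_rewritten = rewrite_graph(logp, include=("canonicalize", "stabilize"))

# ...then take the gradient of the simplified graph.
dlogp = pytensor.grad(logp_rewritten, wrt=x)
```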
Since pymc-devs/pytensor#303, `cholesky(L.dot(L.T))` will be rewritten to `L` if `L.tag.lower_triangular=True`. This change adds these tags where appropriate. Fixes pymc-devs#6717.
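A minimal sketch of how that flag might be used (hypothetical usage, assuming the rewrite from pymc-devs/pytensor#303 is applied during normal function compilation):

```python
# Sketch only: tag a symbolic matrix as lower triangular so that
# cholesky(L @ L.T) can be rewritten back to L.
import pytensor
import pytensor.tensor as pt
from pytensor.tensor.slinalg import cholesky

L = pt.matrix("L")
L.tag.lower_triangular = True  # the flag described in pymc-devs/pytensor#303

expr = cholesky(L @ L.T)

# After compilation, the graph should no longer contain a Cholesky Op,
# assuming the rewrite is registered in the default rewrite database.
f = pytensor.function([L], expr)
pytensor.dprint(f)
```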
Describe the issue:
When given a Cholesky factor, L, the logp function of an observed MvNormal unnecessarily computes the matrix product `L.dot(L.T)` and then recomputes its Cholesky factor. This is expensive for large matrices.
This originates in the changes for PyMC 4, where both the Cholesky and precision-matrix parameterizations were modified to transform into a covariance matrix parameterization. I'm guessing there are some performance improvements to be had with the precision matrix as well.
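The conversion described above can be illustrated with a small, hypothetical helper (this is not PyMC's actual code):

```python
# Illustration only: every parameterization is normalized to a covariance
# matrix before the logp is computed.
import pytensor.tensor as pt
from pytensor.tensor.nlinalg import matrix_inverse

def to_cov(cov=None, chol=None, tau=None):
    """Hypothetical helper mirroring the behaviour described above."""
    if cov is not None:
        return pt.as_tensor_variable(cov)
    if chol is not None:
        chol = pt.as_tensor_variable(chol)
        return chol @ chol.T        # covariance from a Cholesky factor
    tau = pt.as_tensor_variable(tau)
    return matrix_inverse(tau)      # covariance from a precision matrix
```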
Reproducible code example:
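The original snippet was not preserved in this thread; a minimal sketch of a reproducer (illustrative names and sizes) might look like this:

```python
import numpy as np
import pymc as pm
import pytensor

n = 3
data = np.random.randn(100, n)

with pm.Model() as model:
    # A random lower-triangular Cholesky factor for the covariance.
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol_cov", n=n, eta=2.0, sd_dist=pm.Exponential.dist(1.0)
    )
    pm.MvNormal("obs", mu=np.zeros(n), chol=chol, observed=data)

# The logp graph contains dot(L, L.T) followed by a fresh Cholesky Op,
# i.e. the factorization is recomputed instead of reusing `chol`.
pytensor.dprint(model.logp())
```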
Error message:
The region of interest in the logp dprint is highlighted here; the full logp is below. The function computes `L.dot(L.T)` and then recomputes `cholesky(L.dot(L.T))`.
PyMC version information:
Context for the issue:
Performance regression for any model involving an observed MvNormal.