-
-
Notifications
You must be signed in to change notification settings - Fork 20
Update loglik
and re-enable its tests
#3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update loglik
and re-enable its tests
#3
Conversation
loglik
implementation and re-enable its testsloglik
and re-enable its tests
Codecov Report
@@ Coverage Diff @@
## main #3 +/- ##
===========================================
+ Coverage 64.85% 76.74% +11.89%
===========================================
Files 3 5 +2
Lines 552 886 +334
===========================================
+ Hits 358 680 +322
- Misses 194 206 +12
Continue to review full report at Codecov.
|
d5d9eb3
to
1e1daff
Compare
1e1daff
to
ddabc9f
Compare
Now that The problem with the current approach is that it reintroduces log-likelihood terms into the graph: the untransformed ones and the transformed ones. We could use the current approach, but we would need to walk the graph backwards (i.e. outputs to inputs) and keep track of which nodes were transformed and ignore their old counterpart terms. |
ddabc9f
to
86a9a46
Compare
aeppl/loglik.py
Outdated
|
||
q = deque(io_toposort(graph_inputs((rv_var,)), (rv_var,))) | ||
|
||
logpdf_var = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logpdf_var = 0
? No need for the if statement below, then.I assume a 0 +
is optimized away automatically...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should be removed during canonicalization, but then it's just a matter of when and where you want to do that work. Since it doesn't really cost us anything to do that here, I would rather avoid introducing terms that would necessarily be removed.
aeppl/loglik.py
Outdated
# Let's try deriving a log-likelihood from this | ||
# non-`RandonVariable` node | ||
q_logpdf_parts = [] | ||
for q_rv_var in node.outputs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with any multi-output nodes, is it safe to say the loglike can always be dealt independently for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop should handle all cases just fine. The important thing is that we make sure to only add the log-likelihoods for the outputs that are supposed to be in the log-likelihood we're computing. The replacements
check is supposed to do that, but, since other things could be in there, it might not be the best approach.
I'm about to push a docstring update to loglik
that should explain exactly what this function is supposed to do (and why we should probably rename it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine, knowing the PyMC3 version is working.
More generally, I have the following vague (and confused) thought:
Do we need to treat the RandomVariables differently as we walk down the graph? I wonder if we could just call _loglik
on all nodes equally, add them up and, once the logp graph is finished, replace all the random variables by the respective value_vars
(only once at the root level of the loglik
call).
Which transforms are you referring to? Variable transforms, or graph transforms / rewrites? I am guessing the latter. |
aeppl/loglik.py
Outdated
if q_rv_value_var is None: | ||
continue | ||
|
||
# TODO: We need to prevent this from re-adding terms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the TODO concern here? Do you have a MWE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That comment is referring to the cause of the failure in test_loglik_subtensor
.
When the I = bernoulli(p)
is lifted through the Y = normal(mu, sigma)
in the graph of Y[I]
, producing Y_new = normal(mu[I], sigma[I])
, the resulting log-likelihood ends up being something like log(p(I) * (p(Y_new) * p(I)) * p(Y))
. The term in parenthesis is introduced by the call to _loglik
/subtensor_loglik
, and the erroneous p(I)
and p(Y)
surrounding it are from the io_toposort
walk through the original graph in loglik
.
Since the io_toposort
walk in loglik
knows nothing about what _loglik
has already done, it just keeps walking the nodes in q
and "re-adds" the terms/nodes that were transformed and computed in _loglik
/subtensor_loglik
.
ae852d1
to
ed53f52
Compare
Remember, here we're developing the general approach that PyMC will also use, so we can't really fall back on that.
That's essentially what we're doing. The only distinction is that we must produce log-likelihoods for the nodes with value variables specified (i.e. the keys in Since the only two I've added an example in the docstring that should clarify what |
@aesara-devs, should we rename The |
logprob sounds fine (and _logprob for the individual "dispatched" terms). Measure might be more close to the "definition" but probably more foreign to most people. |
90bd831
to
a802156
Compare
All right, The old All that remains is to determine how we want to handle the old For instance, I'm pretty sure the same missing data situation can be handled outside of At most, we might need to create a transform that strips the |
aeppl/loglik.py
Outdated
if sum: | ||
logpdf_var = at.sum(logpdf_var) | ||
else: | ||
raise NotImplementedError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan to dispatch to other Ops in here (such as for the affine transform)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that's all handled via rewrites in RVSinker
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I see what you have in mind. Is the goal to convert all graphs into basic random-variable nodes first and then simply rely on the respective logpdfs
? Sounds a bit more limited than what I had in mind, but perhaps it's the best first approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds a bit more limited than what I had in mind, but perhaps it's the best first approach.
It's not limited in any way—at least not relative to the dispatch approach. It is significantly simpler, though.
For example, I just finished implementing everything, and, now, joint_logprob
is very direct and shouldn't need to be changed in order to provide any of the features we've considered (e.g. support for convolutions, affine transforms, Scan
/for
-loops, etc.)
In the present form, joint_logprob
takes a graph, rewrites it, and forms the sum of each RandomVariable
's log-probability. That's it.
This neatly compartmentalizes the rewriting work by placing it squarely within the optimization/rewriting framework where it belongs.
It looks quite neat. Any plans for the specification of conditionals? |
That's already covered by the choice of entries in the Do you mean something else? |
I might be missing it. I mean the disambiguation when you have |
OK, I think I know what you're asking. Yeah, we can look into that soon. |
Can we use |
a802156
to
a2d5046
Compare
1c9d4bd
to
6514ae3
Compare
While trying to create a |
f5f9471
to
56f1021
Compare
7f4c87f
to
fad89a2
Compare
fad89a2
to
9979344
Compare
It looks great but I think I still have the same conceptual issue. How do you add arbitrary expressions to the logp graph when these cannot just be written in the form of of basic existing random variables. For instance how would you add the Jacobian if a user wanted to define a log-beta? y_rv = exp(beta(2, 2))
joint_logprob(y_rv, {y_rv : y_value} The graph would be (I think)
It don't see in the current approach how you add the second tem to the graph. I see how you rewrite the Am I missing it or is something that was not supposed to be implemented or just for now. Otherwise the joint prob looks really neat and the graph optimization at the beginning looks promising for some crazy experiments :D |
Which graph? The transforms are applied to the "sample-space" graph (i.e. the one with There are a couple ways to accomplish what you're asking via the sample-space graph alone, and another that involves using For the former, we can write optimizations that generate specialized |
I think that's the element I was missing. Thanks |
This PR puts
loglik
back into working order.RandonVariable
forms for observed non-RandonVariable
Op
s