Skip to content

Log Likelihood of MvGaussianRandomWalk Returns Array Rather Than Scalar #2859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
a3huang opened this issue Feb 15, 2018 · 4 comments
Closed
Labels

Comments

@a3huang
Copy link

a3huang commented Feb 15, 2018

I noticed another small quirk with MvGaussianRandomWalk. When I run:

pm.MvGaussianRandomWalk.dist(mu=tt.constant([1,1]), tau=tt.constant(np.eye(2)))\
    .logp(tt.constant([[1,2], [3,4]])).eval()

I get:

array([-2.83787707, -2.83787707])

This output seems strange to me as I was expecting to see a single number representing the log likelihood of the single sequence of vectors I input. The log likelihood itself seems correct, but is repeated twice in this case.

Looking into this, I found the problem to be the line:

return self.init.logp(x[0]) + tt.sum(innov_like)

By default, self.init = Flat.dist() and since this is actually a univariate distribution, calling self.init.logp(x[0]) actually returns an array of shape (2,) since x[0] is an array of shape (2,). Therefore, this results in tt.sum(innov_like) being broadcasted into an array rather than a scalar.

I was thinking that we could probably replace this with:

return tt.sum(self.init.logp(x[0])) + tt.sum(innov_like)

This way, we will correctly return a scalar value when the initial distribution provided is univariate (i.e. the components of the initial vector are iid). Please let me know if I'm mistaken and this was actually the intended behavior.

Versions and main components

PyMC3 Version: 3.3
Theano Version: 1.0.1
Python Version: 2.7.12
Operating system: MacOS 10.13.3
How did you install PyMC3: pip

@junpenglao
Copy link
Member

I think you are right, cc @gBokiau so we can also change it in #2847

@junpenglao junpenglao added the bug label Feb 15, 2018
@krishamehta
Copy link

Can I work on this?

@junpenglao
Copy link
Member

@krishamehta thanks for your interested! However, it might be better to address it in #2847.
You can have a look at https://github.com/pymc-devs/pymc3/issues?q=is%3Aissue+is%3Aopen+label%3Abeginner_friendly which contains some beginer friendly issues :-)

@gBokiau
Copy link
Contributor

gBokiau commented Jun 5, 2018

Once #3008 gets approval and merged, we should:

  • replace all instances of tt.sum(logp) with logp_sum
  • add tests that assert that timeseries return log-probabilities of the expected shapes
  • use logp_sum where the tests fail

@gBokiau gBokiau mentioned this issue Jun 6, 2018
crgagne pushed a commit to crgagne/pymc3 that referenced this issue Jun 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants