Issues getting variable in hierarchical model to update #2271

Closed
aspassiani opened this issue Jun 5, 2017 · 16 comments

@aspassiani

I have previously been working with OpenBUGS/WinBUGS for my Bayesian statistics but have decided to move over to the PyMC3 package in Python. I believe I have converted my code correctly, but I am having trouble getting my PyMC3 output to match my BUGS output; specifically, one of my variables never updates from its initialized values. My code is written as follows:

import numpy as np
import pymc3 as pm
import theano.tensor as tt
from pymc3.backends.base import merge_traces

model2 = pm.Model()
with model2:
    # Coefficients
    alpha = pm.Normal('alpha', mu=0.0, sd=1000)
    beta = pm.Deterministic('beta', tt.exp(alpha))
    lamda = pm.Gamma('lamda', mu=1.0e-2, sd=100)
    # Population Effect
    pop_eff = pm.Deterministic('pop_eff', tt.exp((-1*beta)/tt.exp(pop_den_all)))
    # Mean/Expected Event Occurrence Rate
    lamdal = pm.Deterministic('lamdal', lamda*5625)
    # Actual Occurrence of Events
    Tlatent_new = pm.Poisson('Tlatent_new', mu=lamdal, shape=56)
    # Observed Event Counts
    Tobs_new = pm.Binomial('Tobs_new', n=Tlatent_new, p=pop_eff, shape=56, observed=Tobs)
    # Initialization (note the key must match the variable name, 'lamda')
    start1 = {'alpha': 0.5, 'lamda': 1., 'Tlatent_new': Tlatent}
    start2 = {'alpha': 1., 'lamda': 2., 'Tlatent_new': Tlatent}
    start = [start1, start2]
    step1 = pm.Metropolis([alpha, lamda])
    step2 = pm.BinaryGibbsMetropolis([Tlatent_new])

with model2:
    trace2 = merge_traces([pm.sample(2000, step=[step1, step2], tune=1000, start=start[i], chain=i)
                           for i in range(2)])

with sample data:

#Sample Data
Tobs = np.array([0,0,1,0,0,0,8,3,0,0,2,0,1,0,0,0,0,3,4,1,1,0,1,0,1,0,0,0,0,0,0,0,0,3,0,0,0,1,5,2,
                    27,17,26,9,6,0,1,3,4,5,1,4,16,73,26,3])
pop_den_all = np.array([0.01,0.69,1.03,0.12,0.22,0.59,26.60,9.76,0.0033,0.064,0.81,0.96,0.76,1.01,
                        0.20,0.79,4.47,12.21,18.56,1.68,0.33,0.001,0.22,0.046,0.11,0.065,0.33,0.002,
                        0.065,0.021,0.33,0.086,0.60,1.43,0.072,0.13,0.73,2.89,7.22,5.42,24.35,104.88,
                        122.54,17.71,5.13,1.70,0.037,2.05,5.43,2.83,1.61,3.00,20.19,284.83,98.69,2.67])
#Initialization Data
Tlatent = np.array([80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,
                    80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,
                    80,80,80,80,80,80])

When I run the model, the variable Tlatent_new does not update from the initial values I give it in Tlatent. This happens when I include the effects of population density via the line Tobs_new = pm.Binomial('Tobs_new', n=Tlatent_new, p=pop_eff, shape=56, observed=Tobs). When I leave that line out, the output of Tlatent_new does move away from the initial guess Tlatent.

I cannot figure out what is wrong with that additional line. I suspect it has something to do with the Binomial distribution, because if I swap in a different distribution I do see Tlatent_new change from the initial guess Tlatent; I don't see why the Binomial would not work. I also thought my choice of sampling method might be an issue, but changing it hasn't helped either.

Any help as to why this does not work is greatly appreciated.

Thanks in advance,

@junpenglao
Member

junpenglao commented Jun 5, 2017

The Binomial distribution is not working in your model because:
1. some elements of pop_eff are exactly 0 or 1, and
2. Tlatent_new can be smaller than Tobs,
both of which make the logp of the model -inf. You can check this by doing:

Tobs_new.logp(model2.test_point)
model2.logp(model2.test_point)
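
To see the first problem concretely, here is a quick sketch that recomputes pop_eff in NumPy at the default test point (alpha = 0, hence beta = 1):

pop_eff0 = np.exp(-1.0 / np.exp(pop_den_all))
print(pop_eff0.min(), pop_eff0.max())  # the max is exactly 1.0 in float64
print(Tlatent.min(), Tobs.max())       # n must stay >= the observed count in every cell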

Maybe you can try to reparameterize your model into a centered form, for example Tlatent_new = Tobs + Occur with Occur >= 0?
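
A rough sketch of that reparameterization (the Poisson prior on Occur is an assumption here and does not reproduce the original prior on Tlatent_new exactly; the point is that Tlatent_new = Tobs + Occur is >= Tobs in every cell by construction):

with pm.Model() as model3:
    alpha = pm.Normal('alpha', mu=0.0, sd=1000)
    beta = pm.Deterministic('beta', tt.exp(alpha))
    lamda = pm.Gamma('lamda', mu=1.0e-2, sd=100)
    pop_eff = pm.Deterministic('pop_eff', tt.exp((-1*beta)/tt.exp(pop_den_all)))
    lamdal = pm.Deterministic('lamdal', lamda*5625)
    # Unobserved events on top of the observed counts; always >= 0
    Occur = pm.Poisson('Occur', mu=lamdal, shape=56)
    # Tlatent_new >= Tobs in every cell by construction
    Tlatent_new = pm.Deterministic('Tlatent_new', Tobs + Occur)
    Tobs_new = pm.Binomial('Tobs_new', n=Tlatent_new, p=pop_eff, observed=Tobs)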

@junpenglao
Member

Seems to work with a small constraint on pop_eff, but you should still consider reparameterizing your model.

with model2:
    # Coefficients
    alpha = pm.Normal('alpha', mu=0.0, sd=1000)
    beta = pm.Deterministic('beta', tt.exp(alpha))
    lamda = pm.Gamma('lamda', mu=1.0e-2, sd=100)
    # Population Effect
    pop_eff = pm.Deterministic('pop_eff', tt.exp(-1e-10+(-1*beta)/tt.exp(pop_den_all)))
    # Mean/Expected Event Occurrence Rate
    lamdal = pm.Deterministic('lamdal', lamda*5625)
    # Actual Occurrence of Events
    Tlatent_new = pm.Poisson('Tlatent_new', mu=lamdal, shape=56, testval=Tlatent)
    # Observed Event Counts
    Tobs_new = pm.Binomial('Tobs_new', n=Tlatent_new, p=pop_eff, observed=Tobs)

    trace1 = pm.sample(3000)

@aspassiani
Author

I see. I assumed that it would be bounded to be equal to or larger than Tobs, as that is how it was handled in BUGS. Thanks for pointing that out. I take it PyMC3 doesn't offer a way to set Tobs as a lower bound for Tlatent_new in the model?

@twiecki
Member

twiecki commented Jun 5, 2017

There is pm.Bound(), but I'm not sure it works with another variable as the bound.

@junpenglao
Member

I am going to close this because it is not a bug, but feel free to continue the conversation.

@aspassiani
Author

I've tried constraining pop_eff as you have done. The logp of the model is no longer -inf, but my Tlatent_new values still stay at the starting value I assign (80), which is greater than any of the Tobs values (the largest being 73). I'm not sure whether the Tlatent_new values changed for you, but looking at my traces, Tlatent_new never deviates from 80. So I don't quite understand how Tlatent_new being smaller than Tobs could be the problem.

I tried using pm.Bound() on the Poisson distribution for Tlatent_new to see if that made a difference, but it didn't change anything. I couldn't see a way to set a different lower bound for each cell tied to that cell's Tobs value, so I just used a single value larger than the maximum of Tobs.

@junpenglao
Member

Are you on master and theano 0.9.0? Tlatent_new does change in my case.

with model2:
    trace1 = pm.sample(3000, njobs=2)
pm.traceplot(trace1, varnames=['alpha', 'lamda', 'Tlatent_new']);

This is what I got:
[traceplot of alpha, lamda, and Tlatent_new]
Having said that, you might need to run the chain a bit longer for convergence.

Tlatent_new being smaller than Tobs is definitely a problem; you can see it by setting the testval to something smaller than 73 (e.g., Tlatent = np.ones(56)*72) and checking that the logp becomes -inf.

You can also try integrating out the Poisson, similar to how it is done in Stan.
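
For this particular Poisson-Binomial pair the marginalization even has a closed form (Poisson thinning): if Tlatent_new ~ Poisson(lamdal) and Tobs ~ Binomial(Tlatent_new, pop_eff), then marginally Tobs ~ Poisson(lamdal * pop_eff). A sketch of the marginalized model, which leaves no discrete latent variable for the sampler to deal with:

with pm.Model() as marg_model:
    alpha = pm.Normal('alpha', mu=0.0, sd=1000)
    beta = pm.Deterministic('beta', tt.exp(alpha))
    lamda = pm.Gamma('lamda', mu=1.0e-2, sd=100)
    pop_eff = pm.Deterministic('pop_eff', tt.exp((-1*beta)/tt.exp(pop_den_all)))
    lamdal = pm.Deterministic('lamdal', lamda*5625)
    # Thinned Poisson: Binomial(Poisson(mu), p) is marginally Poisson(mu * p)
    Tobs_new = pm.Poisson('Tobs_new', mu=lamdal*pop_eff, observed=Tobs)
    trace = pm.sample(3000)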

@aspassiani
Author

Okay, I see why it wasn't updating; it had to do with the sampling method I was using. I now get the same output as you have shown.

I understand that Tlatent_new needs to be at least as large as Tobs_new; I just assumed that, given an initial value above Tobs_new and the observed values for Tobs_new, successive iterations of Tlatent_new would be bounded by the observed values. I will try your suggestions for fixing my model.

Thanks!

@junpenglao
Member

No problem! You can have a look at more general tips on porting BUGS/JAGS models to PyMC3 here:
http://pymc-devs.github.io/pymc3/notebooks/PyMC3_tips_and_heuristic.html

@aseyboldt
Member

@aspassiani you could also try to use a continuous variable for Tlatent_new. That allows you to use NUTS, and you can set the bounds using:

BoundStudentT = pm.Bound(pm.StudentT, lower=Tobs)  # or whatever dist seems appropriate
Tlatent_new = BoundStudentT('Tlatent_new', nu=3, sd=sd)  # sd is a scale of your choosing

If you need a Poisson you might be able to define a continuous version of it using pm.DensityDist.

@junpenglao
Member

junpenglao commented Jun 6, 2017

@aseyboldt I don't think pm.Bound works with the lower bound being a vector. Currently, pm.Bound has a few limitations that need to be improved.

@aseyboldt
Member

@junpenglao @aspassiani oh. There is even some np.isinf in distributions.Bounded. This shouldn't be there...
The transformation itself works, so you could use this:

with pm.Model() as model:
    trafo = pm.distributions.transforms.lowerbound([1., 2.])
    pm.Normal('a', mu=0, sd=1, transform=trafo, shape=2, testval=[3., 3.])
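
Applied to this thread's model, that might look something like the following sketch (assuming you switch the latent counts to a continuous distribution; the Normal and its sd here are hypothetical placeholders):

with model2:
    trafo = pm.distributions.transforms.lowerbound(Tobs)  # element-wise lower bound
    Tlatent_cont = pm.Normal('Tlatent_cont', mu=lamdal, sd=10.,
                             transform=trafo, shape=56, testval=Tobs + 1.)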

@aseyboldt
Member

xref #2277

@aspassiani
Author

@aseyboldt is it not possible to use the transformation you showed for a Poisson distribution? I get this error when using it for a Poisson:

The error when converting the test value to that variable type:
TensorType(int64, vector) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to int64, or 2) set "allow_input_downcast=True" when calling "function". Value: "array([ 100., 100., 100., ..., 100., 100., 100.])"

But if I use the same data with, for example, a Normal distribution, as in your example, I don't get that error.

I would like to try defining a continuous version of the Poisson, as you suggested in your earlier comment, but I'm not quite sure how to use pm.DensityDist. Do you know of a good example that uses it that I could follow or re-create?

Thanks!

@aseyboldt
Member

aseyboldt commented Jun 7, 2017

The transformation is only defined for continuous distributions.
Now that I think about it a bit, maybe using a Gamma prior is better than some hacked continuous version of the Poisson. Also, I'm a bit stuck on computing the normalization constant (or the part of it that depends on λ) if we use e^(-λ) λ^x / Γ(x+1) as the pdf: how does ∫ λ^x / Γ(x+1) dx depend on λ? And I want to have LaTeX in GitHub issues...
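
For what it's worth, a rough sketch of the unnormalized version via pm.DensityDist (this ignores the normalization question above, so the posterior on λ would be biased; illustration only):

with pm.Model() as cont_poisson_model:
    lam = pm.Gamma('lam', alpha=1., beta=1.)  # hypothetical prior, just for illustration

    def logp(x):
        # log of e^(-lam) * lam**x / Gamma(x+1), without the x-normalization term
        return -lam + x * tt.log(lam) - tt.gammaln(x + 1)

    x = pm.DensityDist('x', logp, testval=1.)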

@aspassiani
Author

@aseyboldt I'm not sure about that. But I'm still hesitant about changing the Poisson distribution.

The reasoning behind the Poisson is that I am trying to model the occurrence of weather events: the Poisson distribution expresses the probability of a given number of events and assumes the events occur independently. I then feed this into the Binomial distribution because not all events that occur are observed. Population density determines the probability of observing an event (pop_eff), which becomes my "p" parameter, with Tlatent_new as my "n" parameter. Tobs_new is therefore the number of events successfully observed and is given by my observational dataset, my observed values.

As @junpenglao pointed out, my pop_eff is apparently also an issue. I get values of 1 or 0 because there are areas where the population density is large enough to observe all events (pop_eff = 1) and areas with zero population (pop_eff = 0). OpenBUGS seemed able to handle both of these cases. I don't see how constraining pop_eff as @junpenglao did, just to allow the model to run, would affect the model's output much, but maybe I'm overlooking something.
