
What is the p random variable for in glm models? #1717


Closed
chumsley opened this issue Jan 26, 2017 · 4 comments

Comments

@chumsley

chumsley commented Jan 26, 2017

I was using glm to create a simple logistic regression model, and I noticed that it seems to be adding a random variable named p. This variable is sampled along with the intercept and coefficients, and changing its value affects the total logp of the model (i.e., if I pass samples to the model's logp method that differ only in p_logodds_, I get different return values).

I don't understand what this variable is for, and I am concerned that it is interfering with proper sampling; at any rate, when I sample from the model, the results strongly disagree with those from the equivalent model and data in R.

Is p meant to be there? If so, what is it for?

Minimal confusing example:

import numpy as np
import pymc3 as pm

ys = np.array([0, 0, 1, 0, 1, 1])
xs = np.array([10, 5, 1, -1, -5, -10])
data = {"x":xs, "y":ys}
with pm.Model() as M:
    pm.glm.glm("y ~ x", data, family=pm.glm.families.Binomial())
    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace = pm.sample(1000, step=step)
    print(pm.df_summary(trace))

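    # build a second point that differs from the last sample only in p / p_logodds_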
    sam = trace[-1]
    ix = np.argmax([abs(s['p_logodds_'] - sam['p_logodds_']) for s in trace])
    sam2 = dict(sam)
    sam2['p'] = trace[ix]['p']
    sam2['p_logodds_'] = trace[ix]['p_logodds_']

    print "\nlogp(p=%s)=%s,  logp(p=%s)=%s" % (sam['p'], M.logp(sam), sam2['p'], M.logp(sam2))

"""
Optimization terminated successfully.
         Current function value: 32.964035
         Iterations: 7
         Function evaluations: 10
         Gradient evaluations: 10
100%|##########| 1000/1000 [00:00<00:00, 1521.61it/s]
               mean        sd  mc_error   hpd_2.5  hpd_97.5
Intercept -3.548549  0.979635  0.088712 -3.750052 -3.750052
x         -1.061083  0.097358  0.006966 -1.058443 -1.058443
p          0.082809  0.106645  0.009598  0.061961  0.061961

logp(p=0.00116844038631)=-39.3032225537,  logp(p=0.995574022242)=-37.9746678716
"""

Thanks!

@twiecki
Member

twiecki commented Jan 28, 2017

Hm, yeah, that shouldn't be there. Not sure where that's coming from.

@bhargavvader
Contributor

I was trying to have a go at this but really had no idea what to do - any pointers?

@twiecki
Member

twiecki commented Feb 16, 2017

I think https://github.com/pymc-devs/pymc3/blob/master/pymc3/glm/families.py#L104 just needs to be an empty dict.
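For reference, here is a hand-built version of the same logistic regression (a sketch of the intended model, not the glm internals; the Normal priors and the name `manual` are illustrative choices). It has no free p: the success probability is a deterministic function of the linear predictor, which is what glm should produce once that priors dict is empty.

import numpy as np
import pymc3 as pm

ys = np.array([0, 0, 1, 0, 1, 1])
xs = np.array([10, 5, 1, -1, -5, -10])

with pm.Model() as manual:
    # weakly informative priors, chosen only for illustration
    intercept = pm.Normal('Intercept', mu=0, sd=100)
    slope = pm.Normal('x', mu=0, sd=100)
    # p is a deterministic function of the linear predictor, not a sampled variable
    p = pm.math.invlogit(intercept + slope * xs)
    pm.Bernoulli('y', p=p, observed=ys)
    trace = pm.sample(1000)

print(pm.df_summary(trace))  # only Intercept and x should appear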

uhho added a commit to uhho/pymc3 that referenced this issue Feb 22, 2017
An attempt to solve pymc-devs#1717 and issues with `docs/source/notebooks/GLM-logistic.ipynb` at the same time.
This commit reverts the change made in PR pymc-devs#685.
twiecki pushed a commit that referenced this issue Feb 24, 2017
An attempt to solve #1717 and issues with `docs/source/notebooks/GLM-logistic.ipynb` at the same time.
This commit reverts the change made in PR #685.
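Once that change is in, rerunning the original snippet should no longer produce a p row in the summary; a minimal check, assuming the same toy data as above:

import numpy as np
import pymc3 as pm

ys = np.array([0, 0, 1, 0, 1, 1])
xs = np.array([10, 5, 1, -1, -5, -10])

with pm.Model():
    pm.glm.glm("y ~ x", {"x": xs, "y": ys}, family=pm.glm.families.Binomial())
    trace = pm.sample(1000)

print(trace.varnames)  # expect only the regression terms, e.g. Intercept and x
assert 'p' not in trace.varnames and 'p_logodds_' not in trace.varnames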
@bhargavvader
Contributor

Is this issue fixed now?

twiecki closed this as completed Feb 26, 2017