Warn users about discrete parameters with nuts or advi #1902

Closed
aseyboldt opened this issue Mar 15, 2017 · 6 comments

Comments

@aseyboldt
Member

Right now we allow discrete parameters in models when sampling with nuts, and assign a metropolis step to those parameters. I can't see why this would work well even for moderately sized models (every change in one of the discrete parameters will probably send us out of the typical set for the continuous ones, right?). Or do you know of models where this works? If not, I would propose printing at least a warning if a user tries to do this. For an example where I think this led to a problem, see this SO question.

@fonnesbeck
Member

fonnesbeck commented Mar 15, 2017

This generally works for me, so I do use it with NUTS. By "works" I mean that I get the same answer as when I switch over to a pure Metropolis sampling scheme. I typically only use it when there are only one or two non-continuous parameters. As to when it stops working, I have no idea. It would be an interesting research question.

I don't think it would even run at all with ADVI, would it? It certainly would not happen automatically, since ADVI does not use sample.

@aseyboldt
Member Author

No, initialization using advi seems to be disabled silently.

@aseyboldt
Member Author

The problem I see with this is that it is very easy for a new user to put a discrete parameter in a model, which may really mess up convergence. And if you don't know much about how the samplers work, it is not easy to see why this is a problem. It doesn't help that a lot of valid models contain observed discrete RVs.
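To illustrate why *observed* discrete RVs are harmless for gradient-based samplers, here is a stdlib-only sketch (the data, the single-predictor logistic model, and the function name are made up for illustration, not PyMC3 API). The outcomes are discrete, but they are fixed data, so the log-likelihood is still a smooth function of the continuous parameter and its gradient exists:

```python
import math

# Hypothetical observed binary outcomes and a single predictor.
y = [0, 1, 1, 0, 1]
x = [-1.2, 0.4, 2.0, -0.5, 1.1]

def loglike(beta):
    """Bernoulli log-likelihood of the *observed* y under
    p_i = sigmoid(beta * x_i). The data are discrete, but this is
    a smooth function of the continuous parameter beta, so a
    gradient-based sampler like NUTS has no problem with it."""
    total = 0.0
    for yi, xi in zip(y, x):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Central finite difference: the gradient exists and is stable.
eps = 1e-6
grad = (loglike(1.0 + eps) - loglike(1.0 - eps)) / (2 * eps)
print(grad)
```

The trouble only starts when the discrete quantity is a *latent* parameter that the sampler has to move.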

@twiecki
Member

twiecki commented Mar 15, 2017

@aseyboldt What would be the particular problem of being moved outside the typical set? Wouldn't that also happen if you mix other samplers?

@fonnesbeck
Member

I think it's an open question. I mentioned it to the Stan guys once and they did not seem to have an intuition about it either.

@aseyboldt
Member Author

aseyboldt commented Mar 30, 2017

Sorry, I forgot about this issue.

I certainly can't claim to have a good understanding of this, but here is what I had in mind when I mentioned the typical sets. Let's say we have a distribution with one binary parameter $\alpha$ and a couple of continuous parameters $\theta$. Also, to keep it simple, $P(\alpha=0) = P(\alpha=1)$. If the typical sets of $\theta | \alpha=0$ and $\theta | \alpha=1$ don't overlap much, then the sampler can't switch between the states of $\alpha$, and gets stuck at one of the values. My understanding is that this could happen pretty quickly if a number of values in $\theta$ depend on $\alpha$ – even if all the typical sets of $\theta_i | \alpha=0, \theta_{\ne i}$ and $\theta_i | \alpha=1, \theta_{\ne i}$ overlap. For example:

import pymc3 as pm

N = 100
with pm.Model() as model:
    alpha = pm.Bernoulli('alpha', p=0.5)
    pm.Normal('theta', mu=alpha, sd=1, shape=N)

If N is 1, we can sample from this easily. But for N=100 alpha is stuck at 0 for the whole trace.
(Of course we could reparameterize theta as N(0, 1) + alpha and avoid that problem, but still...)
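The sticking behaviour can be quantified without running a sampler at all. Below is a stdlib-only sketch (the function name and simulation setup are my own, not PyMC3 API) that estimates the Metropolis acceptance probability for proposing a flip of $\alpha$ from 0 to 1 while $\theta$ sits at a typical draw from $\theta | \alpha=0$:

```python
import math
import random

def mean_flip_acceptance(n, draws=5000, seed=42):
    """Average Metropolis acceptance probability for proposing
    alpha: 0 -> 1 while theta stays fixed at a draw from
    theta | alpha = 0, i.e. theta_i ~ Normal(0, 1).

    With the symmetric prior P(alpha=0) = P(alpha=1), the log
    acceptance ratio is
        sum_i [log N(theta_i; 1, 1) - log N(theta_i; 0, 1)]
      = sum_i (theta_i - 1/2),
    which has mean -n/2: it shrinks exponentially with n.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        log_ratio = sum(rng.gauss(0.0, 1.0) - 0.5 for _ in range(n))
        # min(1, exp(x)) computed overflow-safely as exp(min(x, 0)).
        total += math.exp(min(log_ratio, 0.0))
    return total / draws

print(mean_flip_acceptance(1))    # roughly 0.6: the flip is often accepted
print(mean_flip_acceptance(100))  # essentially zero: alpha never moves
```

This is also why the reparameterization mentioned above helps: with theta_i = alpha + z_i and z_i ~ N(0, 1) kept fixed, flipping alpha does not change the density of z at all, so the log acceptance ratio is exactly 0 for any N.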

I don't know how much of a problem this is in real world applications. I just remember running into some trouble like this some time ago, but can't even find the model anymore.

Edit: Ahhhh, no latex on github...
