Strange samples from multilevel model with bounded distributions #5668

Closed
finncatling opened this issue Mar 29, 2022 · 4 comments

@finncatling

I'm getting strange samples from this simple multilevel model with bounded priors:

import pymc3 as pm

with pm.Model(coords={'plate_dim': [0] * 5}):
    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
    parent = BoundedNormal("parent", mu=1.0, sigma=0.4)
    child = BoundedNormal("child", mu=parent, sigma=0.05, dims='plate_dim')

    prior_samples = pm.sample_prior_predictive(10000, random_seed=1)

Inspecting the bivariate plots for the child prior samples:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.PairGrid(
    pd.DataFrame(prior_samples['child']),
    height=1.0,
    diag_sharey=False,
    corner=True
)
g = g.map_lower(plt.scatter, alpha=0.03, s=1)
g = g.map_diag(plt.hist, bins=20)
plt.show()

[Figure_1: pair plot of the child prior samples]

We see two distributions superimposed in each subplot:

  1. The correlated samples you would expect
  2. Some uncorrelated Gaussian samples. I assume these are generated when a sample from parent falls outside the bound (below 0). I'm not sure whether this is expected behaviour, but it is alluded to in "Future of pm.Bound?" (#4800).

The puzzling thing is that there are many more of the uncorrelated samples in some child variables (e.g. see towards bottom left of plot). I doubt this is expected?

NB. I'm bounding Gaussians here just for illustration purposes. I'm aware of pm.TruncatedNormal, and this issue doesn't occur when it's used instead.

Thanks for your work on this great library!

Versions and main components

  • PyMC/PyMC3 Version: 3.11.4
  • Theano Version: 1.1.2
  • Python Version: 3.9.10
  • Operating system: Ubuntu 20.04
  • How did you install PyMC/PyMC3: conda
@ricardoV94
Member

ricardoV94 commented Mar 29, 2022

This is probably the same problem described in #4643

The multiple children and the parent are likely inducing some extreme correlated rejection sampling. Plotting the parent variable together with those pair plots might help to understand exactly how.

Edit: Removed wild conjecture.

@ricardoV94
Member

Anyway, it's better not to use prior predictive sampling with Bounded variables, because their forward draws do not respect the hierarchical conditional independence that correct forward sampling requires. This was in fact disabled in the next major release of PyMC.
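
To make "correct forward draws" concrete, here is a minimal sketch in plain Python. It is not PyMC's implementation, and it assumes the intended semantics are those of a normal truncated at the lower bound, which is only one reading of what the bounded prior should mean. The helper names `truncated_gauss` and `forward_draws` are hypothetical. The point is that any redraw of a child happens around the parent value from the same draw, so the parent-child link is never lost.

```python
import random


def truncated_gauss(rng, mu, sigma, lower):
    """Draw from Normal(mu, sigma) restricted to [lower, inf) by rejection."""
    while True:
        x = rng.gauss(mu, sigma)
        if x >= lower:
            return x


def forward_draws(n, seed=1, lower=0.0):
    """Ancestral sampling: each child is conditioned on its own parent draw."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        parent = truncated_gauss(rng, 1.0, 0.4, lower)
        # Any rejected child is redrawn around the *same* parent value.
        child = truncated_gauss(rng, parent, 0.05, lower)
        draws.append((parent, child))
    return draws


draws = forward_draws(10_000)
# Largest distance between a child and its own parent across all draws.
max_gap = max(abs(child - parent) for parent, child in draws)
```

Because every child stays within a few multiples of sigma=0.05 of its own parent, a scatter plot of these pairs should show only the correlated band and no separate blob.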

@finncatling
Author

> The multiple children and the parent are likely inducing some extreme correlated rejection sampling. Plotting the parent variable together with those pair plots might help to understand exactly how.

Here's the plot:

[prior_samples_with_parent: pair plot including the parent variable]

@ricardoV94
Member

ricardoV94 commented Mar 30, 2022

I checked, and it has nothing to do with the multiple children. I think the random method of pm.Bound was simply broken:

import arviz as az
import pymc3 as pm

with pm.Model():
    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
    parent = BoundedNormal("parent", mu=1.0, sigma=0.4)
    child = BoundedNormal("child", mu=parent, sigma=0.05)
    prior_samples = pm.sample_prior_predictive(10000, random_seed=1)

az.plot_pair(prior_samples, var_names=["parent", "child"])

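With lower = 0.0:

[image: pair plot of parent and child]

If I set the lower bound to smaller and smaller values, it gets better:

With lower = -0.5:

[image: pair plot of parent and child]

With lower = -1.0:

[image: pair plot of parent and child]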
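
One way to see why lowering the bound helps: the fraction of unbounded Normal(1.0, 0.4) parent draws that land below the bound, and so would need redrawing, shrinks quickly. The sketch below uses only the standard library and counts the parent alone. It ignores children that fall below the bound when their parent sits just above it, so treat it as a rough lower estimate of how many draws are affected.

```python
from statistics import NormalDist

# Unbounded version of the parent prior used in the model above.
parent_prior = NormalDist(mu=1.0, sigma=0.4)

# Probability that a parent draw falls below each lower bound tried above.
below = {lower: parent_prior.cdf(lower) for lower in (0.0, -0.5, -1.0)}

for lower, p in below.items():
    print(f"lower = {lower:+.1f}: P(parent < lower) = {p:.2e}, "
          f"about {p * 10_000:.2f} of 10,000 draws")
```

At lower = 0.0 the bound is 2.5 standard deviations below the mean, which is roughly 0.6% of draws. At -0.5 and -1.0 it is 3.75 and 5 standard deviations away, so fewer than one draw in 10,000 is expected to need resampling.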


My best guess at the moment is that when the Bounded distribution has to resample values, it "forgets" the hierarchical dependency, leading to that uncorrelated blob. Somehow, this dependency is still maintained in the first set of draws.
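
The guess above can be mimicked with a toy simulation in plain Python. This is a hypothetical model of the suspected behaviour, not the actual code of pm.Bound. It assumes that any out-of-bound value is redrawn independently, with the child's mean taken from an unrelated parent draw. The names `pearson` and `draw_at_least` are made up for this sketch, and it uses 100,000 draws rather than 10,000 so that the redrawn group is large enough for a stable correlation estimate.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def draw_at_least(rng, mu, sigma, lower):
    """Redraw from Normal(mu, sigma) until the value is >= lower."""
    while True:
        x = rng.gauss(mu, sigma)
        if x >= lower:
            return x


rng = random.Random(1)
lower = 0.0
kept, redrawn = [], []

for _ in range(100_000):
    # First pass: the child is drawn around its own parent.
    parent = rng.gauss(1.0, 0.4)
    child = rng.gauss(parent, 0.05)
    if parent >= lower and child >= lower:
        kept.append((parent, child))
        continue
    # Hypothetical "forgetful" redraw: each out-of-bound value is replaced
    # on its own, and the child's mean comes from an unrelated parent draw.
    if parent < lower:
        parent = draw_at_least(rng, 1.0, 0.4, lower)
    if child < lower:
        unrelated_parent = draw_at_least(rng, 1.0, 0.4, lower)
        child = draw_at_least(rng, unrelated_parent, 0.05, lower)
    redrawn.append((parent, child))

r_kept = pearson(*zip(*kept))
r_redrawn = pearson(*zip(*redrawn))
print(f"kept: n={len(kept)}, r={r_kept:.3f}")
print(f"redrawn: n={len(redrawn)}, r={r_redrawn:.3f}")
```

Under these assumptions the kept pairs stay tightly correlated while the redrawn pairs show almost none, which is the qualitative pattern in the plots. The toy does not explain why the real blob is as large as it is, so it supports the guess without confirming it.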
