
Update posterior_predictive notebook #2853


Merged
merged 2 commits into pymc-devs:master on Feb 13, 2018

Conversation

ColCarroll
Member

Reran with @junpenglao's styles from #2834, and also updated grammar, style, and links a bit (we quote extensively from an Edward tutorial that has since been removed).

@junpenglao
Member

Looks good, nitpicks:

  • print('Running on PyMC3 v%s' % pm.__version__) to have nice output
  • It always bothers me that in the last figure the error bar goes outside of [0, 1]; could you use a confidence interval instead?

@ColCarroll
Member Author

Can do. That's a funny example, because the ppc is just sampling outcomes, not probabilities. I would almost rather put a disc, scaled by the number of samples, at y=0 and y=1 to view these samples.
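A minimal sketch of that disc idea (the ppc key 'outcome' and the predictor array names are assumed from the notebook, so treat this as illustrative):

import numpy as np
import matplotlib.pyplot as plt

# Fraction of ppc samples equal to 1 at each predictor value; 'outcome' is a
# hypothetical key standing in for the notebook's observed RV name.
frac_ones = ppc['outcome'].mean(axis=0)

# Discs at y=1 and y=0 whose area scales with the share of samples on each outcome.
plt.scatter(predictors_oos, np.ones_like(frac_ones), s=500 * frac_ones, alpha=0.5)
plt.scatter(predictors_oos, np.zeros_like(frac_ones), s=500 * (1 - frac_ones), alpha=0.5)
plt.xlabel('predictor')
plt.ylabel('outcome')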

@junpenglao
Member

Agreed. For discrete observed RVs I also prefer to sample the latent continuous parameters. Currently that is not straightforward to do: you either need to write the forward function yourself or use draw_value (which is pretty slow). I am hoping it can be addressed when somebody picks up our 2018 GSoC project to implement likelihood-free inference.
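For the logistic model in this notebook, "writing the forward function yourself" amounts to something like the sketch below (variable names assumed from the notebook; junpenglao's snippet later in the thread spells out the same idea in full):

import numpy as np

def invlogit(x):
    # Numpy inverse logit, replacing a per-draw forward pass through the model.
    return 1.0 / (1.0 + np.exp(-x))

# One latent probability per (posterior draw, observation) pair.
p_latent = invlogit(trace['coeff'][:, np.newaxis] * predictors_oos)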

@aloctavodia
Member

Just a detail, but we can avoid importing seaborn if we use

ax.hist([n.mean() for n in ppc['n']], alpha=0.5)

instead of

sns.distplot([n.mean() for n in ppc['n']], kde=False, ax=ax)

@fonnesbeck
Member

Seaborn plots look a lot nicer than MPL's, though.

@aloctavodia
Member

But we will still use plt.style.use('seaborn-darkgrid'), so we get seaborn's look.

with

ax.hist([n.mean() for n in ppc['n']], alpha=0.5)

[figure: histogram produced with ax.hist]

and with

sns.distplot([n.mean() for n in ppc['n']], kde=False, ax=ax)

[figure: histogram produced with sns.distplot]

Not exactly the same (mainly due to the different number of bins), but pretty similar.
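If matching matters, passing an explicit bins argument brings the matplotlib version much closer (a sketch; plt.hist defaults to 10 bins while distplot chooses its own):

ax.hist([n.mean() for n in ppc['n']], bins='auto', alpha=0.5)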

@fonnesbeck
Member

Fair enough, in this case. In general seaborn generates nicer plots with less code, but here I agree there is no real difference.

@ColCarroll
Member Author

Any strong opinions on the last image, then? I prefer the version below, along with a more descriptive discussion noting that this is one measure of centrality you could compute with sample_ppc.

[figure: proposed final plot]

@junpenglao
Member

I'd prefer some error bars, though.

@ColCarroll
Member Author

There are two problems with the error bar, at different levels of difficulty:

1. I don't think there's a numpy built-in for Beta error bars, and I am lazy. Found it, though: scipy.stats.beta.interval! It is undocumented, but passing 0.9 appears to give a 90% confidence interval.

2. The error bar is recording the MCMC error, right? I am not sure where else there is uncertainty in this model. Here's an example with a 90% confidence interval, and the "true" value of p:

[figure: 90% confidence interval with the true value of p]
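For reference, the scipy.stats.beta.interval call from point 1 in action (the Beta parameters here are illustrative):

from scipy import stats

# Central 90% interval of a Beta(2, 5) posterior: returns (lower, upper).
lower, upper = stats.beta.interval(0.9, 2, 5)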

@junpenglao
Member

“The error bar is recording the MCMC error, right?”

If you are plotting the error of the mean, yes. But you can also show the [2.5, 97.5] quantile to represent uncertainty, right?

@ColCarroll
Member Author

Hrm... I don't think I understand where you would calculate that. It seems like there is only uncertainty in the Binomial variable here. How would you calculate the [2.5, 97.5] quantile?

@junpenglao
Member

What I had in mind is:

import numpy as np
import matplotlib.pyplot as plt

# trace, tinvlogit, predictors_oos, and outcomes_oos are defined earlier in the notebook.
samples = 500
post_coeff = trace['coeff']
nsample = len(post_coeff)
# Push a random subset of posterior coefficient draws through the inverse logit
# to get one latent probability per (draw, observation) pair.
postp = tinvlogit(post_coeff[np.random.randint(
    0, nsample, samples)][:, np.newaxis] * predictors_oos).eval()

# Use the [2.5, 97.5] percentiles of the latent probabilities as error bars.
ymean = postp.mean(axis=0)
yerr = np.percentile(postp, [2.5, 97.5], axis=0)
yerr[0, :] = ymean - yerr[0, :]
yerr[1, :] = yerr[1, :] - ymean

plt.errorbar(
    x=predictors_oos,
    y=ymean,
    yerr=yerr,
    linestyle='',
    marker='o')
plt.plot(predictors_oos, outcomes_oos, 'o')
plt.ylim(-.05, 1.05)
plt.xlabel('predictor')
plt.ylabel('outcome')

But you can barely see any uncertainty... I guess in this simple example there is not much point in doing this.
[figure: error bars on the latent probabilities]

Last nitpick: there is an unnecessary n_init in the last pm.sample()

@ColCarroll
Member Author

Just updated. I think I did the equivalent of your suggestion (at least in expectation!). The big difference is that I lowered the number of ppc samples to 100, so the error bars are more prominent.

[figure: updated plot with 100 ppc samples and error bars]
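Roughly what the update does, as a sketch (assuming the notebook's model, trace, and out-of-sample data are in scope, and with 'outcome' standing in for the observed RV's name):

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

# 100 posterior predictive samples instead of the default, so the error bars stand out.
ppc = pm.sample_ppc(trace, samples=100, model=model)
outcome_samples = ppc['outcome']

# Mean and [2.5, 97.5] percentiles of the sampled outcomes as point and error bars.
ymean = outcome_samples.mean(axis=0)
lo, hi = np.percentile(outcome_samples, [2.5, 97.5], axis=0)
plt.errorbar(x=predictors_oos, y=ymean, yerr=[ymean - lo, hi - ymean],
             linestyle='', marker='o')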

junpenglao merged commit ffc0608 into pymc-devs:master on Feb 13, 2018