WIP: Implement opvi #1694
Conversation
There is no support for AEVB yet; it is going to be implemented soon.
```python
elbo = mf.elbo(i)
elbos = theano.function([i], elbo)
self.assertEqual(elbos(1).shape[0], 1)
self.assertEqual(elbos(10).shape[0], 10)
```
Why is this not ready yet?
I've copied this file from #1600; the interface is different, so the tests will fail. I'll refactor them later once the interface is settled.
Ok I get you :)
pymc3/theanof.py
Outdated
```python
@change_flags(compute_test_value='off')
def launch_rng(rng):
    """Helper function for safe launch of rng.
```
Maybe `rng` needs to be expanded here in the docstring.
""" | ||
vars_ = [var for var in model.vars if not isinstance(var, pm.model.ObservedRV)] | ||
if any([var.dtype in pm.discrete_types for var in vars_]): | ||
raise ValueError('Model should not include discrete RVs') |
This is good :)
In the paper, the authors suggest a way to deal with discrete variables; the approach is described in the appendix. See https://arxiv.org/abs/1610.09033
I think it could be implemented in the future.
I'll need to review the papers to really get on top of this, which will be difficult. Is anyone else au fait with this? Maybe @taku-y or @ColCarroll or @twiecki
Thanks for working on this!
pymc3/theanof.py
Outdated
```python
def tt_rng():
    """
    Get the package-level random number generator.
    Returns
```
I think we need blank lines in-between.
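For reference, a sketch of the layout being suggested, with a blank line before the Returns section. The `MRG_RandomStreams` return type and the module-level setup are my assumptions to make the sketch self-contained, not details taken from this PR:

```python
from theano.sandbox.rng_mrg import MRG_RandomStreams

_tt_rng = MRG_RandomStreams(seed=42)  # module-level RNG (assumed setup)


def tt_rng():
    """
    Get the package-level random number generator.

    Returns
    -------
    MRG_RandomStreams
        The package-level random number generator.
    """
    return _tt_rng
```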
pymc3/distributions/dist_math.py
Outdated
```python
Det = tt.nlinalg.det(S)
delta = x - mean
k = S.shape[0]
result = k * tt.log(2 * np.pi) + tt.log(Det)
```
It would be better to use logdet(S), which is a more numerically stable version of log(det(S)), especially for high-dimensional variables. See #1012.
That PR in Theano is not merged yet. I could copy and paste it into theanof or our math module, maybe.
#1777 is merged now.
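To illustrate the numerical point behind the logdet suggestion above, a small NumPy sketch (illustrative only; not code from this PR or from Theano's logdet):

```python
import numpy as np

k = 500
A = np.random.randn(k, k)
S = A @ A.T + k * np.eye(k)  # a large symmetric positive definite matrix

# Naive: det(S) overflows float64 long before k = 500, so log(det(S)) is inf.
naive = np.log(np.linalg.det(S))

# Stable: factor S = L L^T and use log det(S) = 2 * sum(log(diag(L))),
# which never forms the overflowing determinant explicitly.
L = np.linalg.cholesky(S)
stable = 2.0 * np.log(np.diag(L)).sum()
```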
pymc3/variational/opvi.py
Outdated
```python
class ObjectiveFunction(object):
    def __init__(self, op, tf):
        self.random = lambda: op.approx.random()
```
It would be better to use def instead of lambda, per PEP 8.
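For reference, a minimal sketch of the PEP 8-preferred form, assuming `op` is kept on the instance (names follow the snippet above):

```python
class ObjectiveFunction(object):
    def __init__(self, op, tf):
        self.op = op  # keep a reference so the method below can reach it
        self.tf = tf

    def random(self):
        # named method instead of a lambda bound to an attribute (PEP 8)
        return self.op.approx.random()
```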
pymc3/variational/opvi.py
Outdated
```python
def __call__(self, z):
    return self.obj(z)

def updates(self, z, obj_optimizer=None, test_optimizer=None, more_params=None):
```
updates() looks good to me.
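For orientation, a hypothetical usage sketch of the signature shown above. The names `obj`, `z`, and `my_optimizer` are placeholders, and treating the return value as a Theano updates mapping is an assumption, not something verified against this PR:

```python
import theano

# Placeholders: obj is an ObjectiveFunction instance and z a symbolic sample
# from the approximation, as in the diff above; my_optimizer is hypothetical.
step_updates = obj.updates(z, obj_optimizer=my_optimizer)

# Compile a single training step that evaluates the objective and applies
# the parameter updates produced above.
step = theano.function([], obj(z), updates=step_updates)

for _ in range(1000):  # one optimization step per call
    step()
```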
Rebased
This looks good. Are there changes that need to be made to Theano for this to work?
pymc3/variational/opvi.py
Outdated
```python
return self.logq(z) - self.logp(z)


class LS(Operator):
```
I suppose this is still not functional?
Yes, this does not work yet. I'm on vacation and will resume work in 14 days :)
Feedback is welcome :)
@springcoil no changes to Theano are required. The only serious problem I know of is with the Langevin-Stein operator; I need theoretical help with it. Here I'm trying to resolve it.
@ferrine I think we should forge ahead here without Langevin-Stein.
@twiecki I've thought about it. Good idea.
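For readers following the thread, a sketch of the operator in question as I read it from the OPVI paper (notation approximate; see https://arxiv.org/abs/1610.09033 for the authoritative form). It acts on a vector-valued test function f as

```latex
(O^{p,\mathrm{LS}} f)(z) = \nabla_z \log p(x, z)^{\top} f(z) + \nabla_z \cdot f(z)
```

whose expectation under the true posterior is zero, which is what lets it serve as a variational objective without knowing the normalizing constant.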
I've found problems with minibatch training; I can't figure out why that happens.
Rebased
Finally

Usage

```python
# assuming neural_network, sigmoid_output, input_var are defined above
with neural_network:
    inference = ADVI()
    # BTW inference.approx is already created
    approx = inference.fit(10000)

new_input = tt.matrix()
proba = approx.apply_replacements(
    sigmoid_output,
    deterministic=False,  # the default, just for clarity
    more_replacements={input_var: new_input}
)
predict_proba = theano.function([new_input], proba)
```

Features

Benchmarks

I've checked that this works with http://twiecki.github.io/blog/2016/06/01/bayesian-deep-learning/
TODO:
Further ideas
What about merging?
It might be better to update the example notebooks with the new interface, while commenting out the old one.
```python
else:
    std = tau**(-1)
std += eps
return c - tt.log(tt.abs_(std)) - (x - mean) ** 2 / (2 * std ** 2)
```
Should `c` be scaled by the number of elements in `x`, i.e. `c` -> `c * x.ravel().shape[0]`?
These are elementwise operations, so that's OK here.
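To spell out the reasoning, an illustrative NumPy sketch (not code from this PR): because the density is computed elementwise, the constant already enters once per element, so no explicit scaling by the number of elements is needed:

```python
import numpy as np

x = np.zeros(5)
c = -0.5 * np.log(2 * np.pi)

# c broadcasts across all elements of x, so each element carries the constant.
elemwise_logp = c - x ** 2 / 2.0      # shape (5,)

# Summing the elementwise terms contributes c * x.size automatically.
total_logp = elemwise_logp.sum()      # equals c * x.size here, since x == 0
```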
pymc3/model.py
Outdated
```python
Returns
-------
order, flat_view, view
```
This seems to conflict with the code: order, flat_view, view -> flat_view, order, view.
This looks really good. My only recommendation is a tweak to the API to make it more consistent with the rest of the PyMC fitting methods. At the moment it is identical to the scikit-learn interface, where you instantiate a class and call its fit method. Making it closer to what we do for MCMC would be better; even better might be something that mirrors sample. I think a consistent interface is important, so I'd like to hear what others think. If we like a scikit-learn-like API better, then I'd suggest making changes across the board to accommodate it.
@fonnesbeck Yes, it is a good idea. I think I'll still keep the OOP interface for advanced users (for training a model sequentially) while adding a simple shortcut.
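For illustration, a hedged sketch of the two styles under discussion; the `pm.fit` shortcut and its arguments are assumptions here, not code from this PR:

```python
import pymc3 as pm

with model:  # assumes a model defined elsewhere
    # scikit-learn style: instantiate an inference object, then call fit
    inference = pm.ADVI()
    approx = inference.fit(10000)

    # hypothetical shortcut in the spirit of pm.sample(...)
    approx = pm.fit(n=10000, method='advi')
```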
Also, we probably want to modify one or more of the ADVI examples to use this interface.
Rebased.
Congrats @ferrine, this is a huge one.
Yes, well done! I will be putting this interface to use right away.
I don't see a … Edit: never mind, I see it now.
* migrate useful functions from previous PR (cherry picked from commit 9f61ab4)
* opvi draft (cherry picked from commit d0997ff)
* made some test work (cherry picked from commit b1a87d5)
* refactored approximation to support aevb (without test)
* refactor opvi: delete unnecessary methods from operator, change method order
* change log_q_local computation
* add full rank approximation
* add more_params argument to ObjectiveFunction.updates (aevb case)
* refactor density computation in full rank approximation
* typo: cast dict values to list
* typo: cast dict values to list
* typo: undefined T in dist_math
* refactor gradient scaling as suggested in approximateinference.org/accepted/RoederEtAl2016.pdf
* implement Langevin-Stein (LS) operator
* fix docstring
* add blank line in docs
* refactor ObjectiveFunction
* add not working LS Op test
* experiments with not working LS Op
* change activations
* refactor networks
* add step_function
* remove Langevin Stein, done refactoring
* remove Langevin Stein, done refactoring
* change optimizers
* refactor init params
* implement tests
* implement Inference
* code style
* test fix
* add minibatch test (fails now)
* add more tests for minibatch training
* add logdet to FullRank approximation
* add conversion of arrays to floatX
* tiny changes
* change number of iterations
* fix test and pylint check
* memoize functions in Objective function
* Optimize code a lot
* a bit more efficient pickling
* add docs
* Add MeanField -> FullRank parameter transfer
* refactor MeanField and FullRank a bit
* fix FullRank bug with shapes in random
* refactor Model.flatten (CC @taku-y)
* add `approximate` to inference
* rename approximate->fit
* change abbreviations
* Fix bug with scaling input variable in aevb
* fix theano bottleneck in graph
* more efficient scaling for local vars
* fix typo in local Q
* add aevb test
* refactor memoize to work with my objects
* add tests for numpy view usage
* pickle-hash fix
* pickle-hash fix again
* add node sampling + make up some code
* add notebook with example
* sample_proba explained
This PR suggests a new approach to variational inference via the OPVI framework.
A convenient interface is still an open question; I'm open to suggestions.