Alright! It's time to seriously take care of AdvancedVI :D
Here are some of the things we talked about in the meeting back in October:
- There should be two distinct methods of optimization depending on whether the variational distribution is given as a function (like `update_q`) or as a distribution whose parameters are updated (a rough dispatch sketch follows the code block below).
- Hyperparameter optimization should be nicely implemented; one proposition was:
```julia
# Dispatch on `hyperparams`: if `nothing`, use the log joint as-is;
# otherwise `logπ` is a function of the hyperparameters.
makelogπ(logπ, ::Nothing) = logπ
makelogπ(logπ, hyperparams) = logπ(hyperparams)

function vi(..., logπ; hyperparams = nothing)
    ...
    while not_converged
        logjoint = makelogπ(logπ, hyperparams)
        for i in 1:n_inner
            ...
        end
    end
end
```
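Regarding the first point above, one way to keep the two cases distinct could be plain multiple dispatch. A minimal sketch, where `FunctionalQ`, `ParametricQ`, and `optimize!` are made-up names for illustration, not anything existing in AdvancedVI:

```julia
# Hypothetical wrapper types distinguishing the two ways of specifying q
# (purely illustrative, these do not exist in AdvancedVI).
struct FunctionalQ{F}
    update_q::F      # q is rebuilt from the parameters at every step
end

struct ParametricQ{Q,P}
    q::Q             # a distribution whose parameters get updated/replaced
    params::P
end

# Two distinct optimization paths selected by multiple dispatch.
function optimize!(alg, objective, q::FunctionalQ, logπ)
    # rebuild the distribution via q.update_q(θ) at every iteration
end

function optimize!(alg, objective, q::ParametricQ, logπ)
    # update q.params directly (e.g. via the reparametrization trick)
end
```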
- We should consolidate the updates of the variational parameters into a more "atomic" `step!` function (see the sketch right after this list).
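A minimal sketch of what such an atomic `step!` could look like, assuming the objective already returns a stochastic estimate of the negative ELBO and Zygote as the AD backend; the name and signature are assumptions, not the actual API:

```julia
using Zygote  # assumed AD backend for this sketch

# Hypothetical atomic update: one gradient step on the variational parameters θ.
# `objective(θ)` is assumed to return a stochastic estimate of the negative ELBO.
function step!(objective, θ::AbstractVector, η::Real)
    grad, = Zygote.gradient(objective, θ)   # ∇θ of the objective
    θ .-= η .* grad                         # plain in-place gradient-descent update
    return θ
end
```

The outer `vi` loop would then reduce to calling `step!` until convergence.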
And here are some more personal points (disclaimer: I will be happy to take care of these myself):
- I don't think the current `ELBO` approach is good: the ELBO can always be split into an entropy term (depending only on the variational distribution) and an expectation of the log joint. Most VI methods take advantage of this by computing the entropy gradient analytically (and smartly!); see "Doubly Stochastic Variational Inference" by Titsias, for instance. My proposition would be to split the gradient into two parts (`grad_entropy` + `grad_expeclog`), each of which can then be specialized for the problem at hand (a sketch follows this list).
- I would personally argue that `update_q` only makes sense with the current obsolete implementation using distributions with immutable fields like `TuringMvNormal`. See again Titsias using the reparametrization trick.
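To illustrate the gradient split, here is a rough sketch for a full-rank Gaussian q = N(μ, LLᵀ) using the reparametrization trick; `grad_entropy`, `grad_expeclog`, and `grad_elbo` are made-up names and this is just the textbook construction, not existing AdvancedVI code:

```julia
using LinearAlgebra, Zygote

# Entropy of N(μ, L*L') in closed form:
# H(q) = d/2 * log(2πe) + 0.5 * logdet(L*L'), so ∂H/∂L = inv(L)' and ∂H/∂μ = 0.
grad_entropy(μ, L) = (zero(μ), inv(L)')

# Reparametrized Monte Carlo gradient of E_q[logπ(z)] with z = μ + L*ε, ε ~ N(0, I).
function grad_expeclog(logπ, μ, L; n_samples = 10)
    gμ, gL = zero(μ), zero(L)
    for _ in 1:n_samples
        ε = randn(length(μ))
        gz, = Zygote.gradient(logπ, μ .+ L * ε)  # ∇_z logπ at the sample
        gμ .+= gz
        gL .+= gz * ε'                           # chain rule through z = μ + L*ε
    end
    return gμ ./ n_samples, gL ./ n_samples
end

# The ELBO gradient is then the sum of the two specialized parts.
function grad_elbo(logπ, μ, L; kwargs...)
    gμ_H, gL_H = grad_entropy(μ, L)
    gμ_E, gL_E = grad_expeclog(logπ, μ, L; kwargs...)
    return gμ_H .+ gμ_E, gL_H .+ gL_E
end
```

The entropy part never touches `logπ`, so it stays exact and cheap, while only the expectation part needs Monte Carlo samples and AD.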