
boost minibatches #2171


Merged: 36 commits merged into pymc-devs:master on Jun 12, 2017

Conversation

@ferrine (Member) commented May 10, 2017

@twiecki (Member) commented May 12, 2017

@ferrine is this good from your end?

@ferrine (Member, Author) commented May 13, 2017

I need tests for the minibatch class.
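Such tests might check that a random slice has the requested shape and only contains elements of the original data. A minimal numpy-only sketch of that kind of property test (the helper random_slice is hypothetical, not the PR's API):

```python
import numpy as np

def random_slice(data, batch_sizes, rng):
    """Hypothetical stand-in for Minibatch: draw an independent
    random index vector per dimension and return the sub-array."""
    idx = [rng.randint(0, dim, size=b)
           for dim, b in zip(data.shape, batch_sizes)]
    return data[np.ix_(*idx)]

def test_random_slice_shape_and_bounds():
    rng = np.random.RandomState(42)
    data = np.arange(101 * 202).reshape(101, 202)
    batch = random_slice(data, (10, 11), rng)
    assert batch.shape == (10, 11)       # requested minibatch shape
    assert np.all(np.isin(batch, data))  # every element comes from data

test_random_slice_shape_and_bounds()
```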

@fonnesbeck (Member):

Would be great to convert the notebooks that use the old minibatch interface over to this one as part of the PR.

@ferrine (Member, Author) commented May 27, 2017

Once the tests pass I'll cherry-pick #2097

@fonnesbeck (Member):

So, does this replace DataSampler?

@ferrine (Member, Author) commented May 27, 2017

Yes. This implementation is much faster at runtime.
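The speed difference comes from the design: instead of feeding batches through a Python generator like DataSampler, Minibatch keeps the full dataset in one shared storage and yields a fresh random slice on every graph evaluation. A rough numpy-only sketch of that idea (MinibatchSketch is illustrative, not the actual Theano-backed implementation):

```python
import numpy as np

class MinibatchSketch:
    """Numpy-only sketch of the Minibatch idea: data lives in one
    storage array; each evaluation draws a new random sub-tensor."""
    def __init__(self, data, batch_size, seed=None):
        self.storage = np.asarray(data)  # analogue of the shared variable
        self.batch_size = batch_size
        self.rng = np.random.RandomState(seed)

    def set_value(self, data):
        # analogue of x.shared.set_value(...): swap the storage in place
        self.storage = np.asarray(data)

    def eval(self):
        # one random 1d slice of size `batch_size` along axis 0
        idx = self.rng.randint(0, self.storage.shape[0], size=self.batch_size)
        return self.storage[idx]

x = MinibatchSketch(np.random.rand(1000, 10), batch_size=100)
assert x.eval().shape == (100, 10)
```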

@ferrine (Member, Author) commented May 27, 2017

DataSampler is used only in tests. I think we are free to remove it before #2223

@ferrine (Member, Author) commented May 28, 2017

Previously it was suggested that I change OOP inference to pm.fit. There is OOP inference in the quick-start API, and I see it as suitable for the VI notebooks, so I am a bit confused about why we should hide it in the examples. It is very handy.
Thus I see the further work as introducing the new minibatches. Any other thoughts?

@fonnesbeck (Member):

@ferrine it's a matter of having consistent primary interfaces across PyMC3. I'm fine with having convenient methods for advanced users, but I think we need to be showing the same approach to fitting models among the suite of notebooks that we maintain. If we think OO is the way to go, we can have that discussion and change it across the board, including for the MCMC classes.

That said, I'm not opposed to having a notebook of advanced features that could show off a comprehensive example that uses all the handy shortcuts that you have made available.

@ferrine (Member, Author) commented May 30, 2017

Hm, it seems like a very comprehensive __init__ and a clone method can go together.

pymc3/data.py Outdated
the same thing would be but less convenient
>>> x.shared.set_value(pm.floatX(np.random.laplace(size=(100, 100))))

programmatic way to change storage is as following
Suggestion: as follows

pymc3/data.py Outdated
if we want 1d slice of size 10 we do
>>> x = Minibatch(data, batch_size=10)

Note, that your data is casted to `floatX` if it is not integer type
Suggestion: is cast

pymc3/data.py Outdated
>>> x = Minibatch(datagen(), batch_size=100, update_shared_f=datagen)
>>> x.update_shared()

To be more precise of how we get minibatch, here is a demo
Suggestion: To be more concrete about how we get ...

pymc3/data.py Outdated
That's done. So if you'll need some replacements in the graph
>>> testdata = pm.floatX(np.random.laplace(size=(1000, 10)))

you are free to use a kind of this one as `x` is regular Theano Tensor
Comment: (I don't quite understand this sentence)

pymc3/data.py Outdated
>>> moredata = np.random.rand(10, 20, 30, 40, 50)

default total_size is then (10, 20, 30, 40, 50) but
can be less verbose in sove cases
Suggestion: some cases

@fonnesbeck (Member) commented Jun 1, 2017

The downside to having the batches as tensors is that we cannot iterate over them, which is sometimes necessary. You can't even pass them to scan as a sequence.

@ferrine (Member, Author) commented Jun 1, 2017

@fonnesbeck thanks for reviewing this :)
The case of iterating over minibatches is not covered, of course. But scan is fine: you can apply arbitrary Theano operations to this minibatch. It is sampled once per graph execution.
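The "sampled once per graph execution" behaviour can be illustrated outside Theano: within one evaluation every downstream op sees the same random slice, and a new slice is drawn only on the next evaluation. A numpy sketch of that assumed behaviour (not the Theano mechanics):

```python
import numpy as np

def evaluate(data, batch_size, rng):
    """One 'graph execution': the random slice is drawn once and
    every downstream op consumes that same slice."""
    idx = rng.randint(0, data.shape[0], size=batch_size)
    batch = data[idx]
    return batch.sum(), batch.mean()  # both ops see the identical sample

rng = np.random.RandomState(0)
data = np.random.rand(1000)
s, m = evaluate(data, 10, rng)
assert np.isclose(s / 10, m)  # consistent within one evaluation
```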

@ferrine (Member, Author) commented Jun 6, 2017

I think that's done after the rebase; feel free to leave feedback on the refactored notebooks.

@ferrine (Member, Author) commented Jun 7, 2017

Need review

@taku-y (Contributor) commented Jun 9, 2017

In convolutional_vae_keras_advi.ipynb, the import "from pymc3.variational import advi_minibatch" is unused.

@taku-y (Contributor) commented Jun 9, 2017

The LDA notebook still explains that a variable "is automatically exponentiated (thus bounded to be positive) in advi_minibatch(), the estimation function." The reference to advi_minibatch() should be removed, or readers might be confused.

@junpenglao (Member):

Did you also update test_advi.py? It's still testing the old ADVI API.

@junpenglao (Member):

Oh I see, there is test_variational_inference.py for the new API.


def __next__(self):
idx = (self.rng
.uniform(size=self.n,
Contributor:

You didn't use randint() because Theano doesn't support randint(), right?

Member Author:

Yes, that is it

@ferrine (Member, Author) commented Jun 12, 2017

But here I've left it just for testing purposes; I use NumPy for this test.

Contributor:

It might be a reasonable option with respect to consistency between NumPy and Theano.
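The uniform() trick under discussion works because scaling uniform draws on [0, 1) by the data length and truncating yields integers distributed like randint(0, high). A numpy sketch of the equivalence (uniform_randint is an illustrative name, not an API function):

```python
import numpy as np

def uniform_randint(rng, high, size):
    # floor(uniform[0, 1) * high) is distributed like randint(0, high)
    return (rng.uniform(size=size) * high).astype(np.int64)

rng = np.random.RandomState(0)
idx = uniform_randint(rng, high=1000, size=100)
assert idx.dtype == np.int64
assert idx.min() >= 0 and idx.max() < 1000  # always valid indices
```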

@taku-y (Contributor) commented Jun 12, 2017

I had a close look at this PR and I believe it's ready to merge. Some tests failed, but those don't seem to be related to this PR; in fact, test_minibatches.py passed 100%. I will merge it unless I hear otherwise.

@taku-y taku-y merged commit e8da421 into pymc-devs:master Jun 12, 2017
@pwl commented Jun 12, 2017

I'm getting dimension mismatch with the following model:

import numpy as np
import pymc3 as pm

data = np.random.rand(101, 202)
data_t = pm.Minibatch(data, [10, 11])
with pm.Model() as model:
    U = pm.Normal('U', 0, 1, shape=(101, 3))
    V = pm.Normal('V', 0, 1, shape=(202, 3))
    reads = pm.Normal('reads', mu=pm.math.dot(U,V.T), observed=data_t,
                      total_size=data.shape)
ValueError: Input dimension mis-match. (input[0].shape[0] = 10, input[1].shape[0] = 101)

but it works when I replace data_t with data.
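The error is a pure shape problem: with full-size U and V, pm.math.dot(U, V.T) has the full shape (101, 202), while the observed minibatch is only (10, 11). The numpy equivalent of the shapes involved:

```python
import numpy as np

U = np.zeros((101, 3))      # full-size factor
V = np.zeros((202, 3))      # full-size factor
mu = U @ V.T                # shape (101, 202): the full matrix
batch = np.zeros((10, 11))  # minibatch-sized observed data

assert mu.shape == (101, 202)
assert batch.shape == (10, 11)
assert mu.shape != batch.shape  # hence the dimension mismatch
```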

@junpenglao (Member):

@pwl I think the correct usage goes something like this?

data = np.random.rand(101, 202)
data_t = pm.Minibatch(data, [10,11])
with pm.Model() as model:
    U = pm.Normal('U', 0, 1, shape=(10, 3), total_size=(101, 3))
    V = pm.Normal('V', 0, 1, shape=(11, 3), total_size=(202, 3))
    reads = pm.Normal('reads', mu=pm.math.dot(U,V.T), observed=data_t,
                      total_size=data.shape)

@pwl commented Jun 12, 2017

@junpenglao thanks! I was just testing that; it seems to work. How can I extract the full extent of U and V after fitting the model? sample returns just the minibatch-sized chunk of U.

EDIT: just to clarify, I'm doing

data = np.random.rand(101, 202)
data_t = pm.Minibatch(data, [10,11])
with pm.Model() as model:
    U = pm.Normal('U', 0, 1, shape=(10, 3), total_size=(101, 3))
    V = pm.Normal('V', 0, 1, shape=(11, 3), total_size=(202, 3))
    reads = pm.Normal('reads', mu=pm.math.dot(U,V.T), observed=data_t,
                      total_size=data.shape)
    advi = pm.ADVI()
    approx = advi.fit(1)
    trace = approx.sample(1)
trace['U'].shape # gives (1, 10, 3)

@junpenglao (Member):

@pwl I am not sure; this is a brand new feature and I haven't had a chance to play with it yet. @ferrine?

@pwl commented Jun 12, 2017

Ok, I think I misunderstood the interface and somehow overlooked the need to add the indices by hand, as in

n_i = 101
n_j = 202
data = np.random.rand(n_i, n_j)
data_t = pm.Minibatch(data, [10,11])
data_i_idx = pm.Minibatch(range(n_i),10)
data_j_idx = pm.Minibatch(range(n_j),11)
with pm.Model() as model:
    U = pm.Normal('U', 0, 1, shape=(n_i, 3))[data_i_idx,:]
    V = pm.Normal('V', 0, 1, shape=(n_j, 3))[data_j_idx,:]
    reads = pm.Normal('reads', mu=pm.math.dot(U,V.T), observed=data_t,
                      total_size=data.shape)
    advi = pm.ADVI()
    approx = advi.fit(10000)
    trace = approx.sample(100)
trace['U'].shape # works fine now: (100, 101, 3)

I was under the impression that the indexing would be done automatically under the hood.
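The manual-index pattern above can be mimicked in plain numpy: draw one index vector per dimension, then use the same vectors to slice both the parameters and the data, so everything stays aligned. A sketch (the variable names rng_i, rng_j, U_full, V_full are illustrative):

```python
import numpy as np

rng_i = np.random.RandomState(1)  # separate stream per dimension
rng_j = np.random.RandomState(2)

data = np.random.rand(101, 202)
U_full = np.random.rand(101, 3)
V_full = np.random.rand(202, 3)

i = rng_i.randint(0, 101, size=10)  # row indices, like data_i_idx
j = rng_j.randint(0, 202, size=11)  # column indices, like data_j_idx

U = U_full[i, :]                    # (10, 3)
V = V_full[j, :]                    # (11, 3)
batch = data[np.ix_(i, j)]          # (10, 11), same rows and columns

assert (U @ V.T).shape == batch.shape  # now the shapes line up
```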

@pwl commented Jun 12, 2017

@ferrine thanks for this great PR, this really simplifies and unifies the minibatch training!

@junpenglao junpenglao mentioned this pull request Jun 12, 2017
@ferrine (Member, Author) commented Jun 12, 2017

@pwl I would also recommend setting different random seeds for the different dimensions; otherwise you get correlated observations. You can find a reference in the docstring.
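The seed advice matters because two index streams started from the same seed produce identical underlying draws, so row and column selections become perfectly correlated. A small numpy illustration:

```python
import numpy as np

same_a = np.random.RandomState(42).uniform(size=10)
same_b = np.random.RandomState(42).uniform(size=10)
diff = np.random.RandomState(7).uniform(size=10)

# identical seeds -> identical draws -> correlated row/column indices
assert np.array_equal(same_a, same_b)
# distinct seeds -> independent draws
assert not np.array_equal(same_a, diff)
```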

@ferrine ferrine deleted the boost_minibatches branch June 12, 2017 20:08
@ferrine (Member, Author) commented Jun 12, 2017

@junpenglao I haven't played with multi-dimensional training yet either, but for 1d I found this very convenient.

@junpenglao (Member) commented Jun 12, 2017

@ferrine We should extend the probabilistic matrix factorization example to demonstrate multi-dimensional minibatch training.

@ferrine (Member, Author) commented Jun 12, 2017

Yes, that's a good idea.

@mbabadi commented Sep 19, 2017

@ferrine thanks for the minibatch implementation -- great PR. I am not yet familiar with the guts of PyMC3, but looking at @pwl's matrix factorization model, I wonder whether there is a guarantee that data_t, data_i_idx, and data_j_idx are evaluated only once during each round of gradient/ELBO evaluation? If one of them falls behind (or moves ahead), the game is ruined.

@ferrine (Member, Author) commented Sep 22, 2017

Hi,
Random indices are evaluated once per function evaluation, by design of Theano. As for a manual minibatch.eval() or independent usage within several functions, that is something I don't know how to control; the user has to take care of it. Personally, I prefer passing minibatches to inference as replacements, which is less error-prone.

@mbabadi commented Sep 25, 2017

Thank you very much for your comment @ferrine.

I wonder whether it would be even more error-proof if it were possible to extract the slicing indices from the data minibatch directly. @pwl's solution involving multiple minibatches over linear ranges for each dimension leaves room for strange bugs down the road (what if a future version of Theano decides to evaluate some tensors twice?).

Is there an elegant way to keep a tuple of integer 1d tensors inside the minibatch class, one for each dimension, that would always be in sync with the slice minibatch.eval() would yield, and that the user could use or evaluate without the side effect of changing the state of the RNG?
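One way such an object could work: draw one integer index vector per dimension on each evaluation and expose both the indices and the resulting slice, so they can never fall out of sync. A hypothetical numpy sketch (SyncedMinibatch is not part of PyMC3):

```python
import numpy as np

class SyncedMinibatch:
    """Hypothetical sketch: the per-dimension index vectors and the
    data slice come from the same draw, so they stay in sync."""
    def __init__(self, data, batch_sizes, seed=None):
        self.data = np.asarray(data)
        self.batch_sizes = batch_sizes
        self.rng = np.random.RandomState(seed)
        self.indices = None  # tuple of 1d integer arrays, one per dim

    def eval(self):
        # one draw per evaluation; indices are kept for the caller
        self.indices = tuple(
            self.rng.randint(0, dim, size=b)
            for dim, b in zip(self.data.shape, self.batch_sizes))
        return self.data[np.ix_(*self.indices)]

mb = SyncedMinibatch(np.random.rand(101, 202), (10, 11), seed=0)
batch = mb.eval()
assert batch.shape == (10, 11)
# the exposed indices reproduce exactly the slice that was returned
assert np.array_equal(batch, mb.data[np.ix_(*mb.indices)])
```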

@ferrine (Member, Author) commented Sep 25, 2017

Having that at the instance level is not a solution, since you have different minibatches for different tensors; a class-level solution is needed here. I see it is not easy to implement, and since you have a workaround, it is not worth it at all.

7 participants