Skip to content

User model #1525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 52 commits into from
Nov 28, 2016
Merged

User model #1525

merged 52 commits into from
Nov 28, 2016

Conversation

ferrine
Copy link
Member

@ferrine ferrine commented Nov 12, 2016

I started preparation for implementing GLM model and I think good base class is necessary for that purpose. Also refer to #1524 and #1517.

@ferrine
Copy link
Member Author

ferrine commented Nov 12, 2016

Do we really need compatible sklearn api? I think if well be a bit hard to make them friends
Are there any ideas of basic functionality of UserModel?

@twiecki
Copy link
Member

twiecki commented Nov 14, 2016

@ferrine I don't think it's a priority to keep sklearn API. Can just be a starting point but if it's not helpful there's no reason to be beholden to it.

@ferrine
Copy link
Member Author

ferrine commented Nov 15, 2016

Please review LinearComponent API and base class usage example

@ferrine
Copy link
Member Author

ferrine commented Nov 15, 2016

Hmm, what's wrong with travis?

init : dict - test_vals for coefficients
rvars : dict - random variables instead of creating new ones
"""
def __init__(self, name, x, y, intercept=True, labels=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name is only required if you have multiple LinearComponents, right? Not sure it should be a required arg then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

"""Shortcut to model"""
return modelcontext(None)

def new_var(self, name, dist, data=None, test_val=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this is a useful abstraction but it makes transcribing a PyMC3 model from the usual syntax to be a class a bit harder as it gives cognitive overhead. After all, the code below isn't doing so much that it warrants the indirection. Is there another way we can achieve that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, there is no easier way to rename lots of variables in the model.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought about a method like this for base class, but i'm not sure if it will be good for readability. Here it exists to avoid code duplication

@ferrine
Copy link
Member Author

ferrine commented Nov 15, 2016

travis was failing because of that, waiting for merge this PR

@springcoil
Copy link
Contributor

Are you ready to accept this @twiecki. I like the general idea.

@ferrine
Copy link
Member Author

ferrine commented Nov 26, 2016

Though I don't think model specific plots are a part of this PR.

@ferrine
Copy link
Member Author

ferrine commented Nov 26, 2016

Not in WIP. It's done.

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented
@ferrine
Copy link
Member Author

ferrine commented Nov 26, 2016

Travis plays bad tricks, definitely. I relied on his tests but in my new projects environment (python 2.7.12) model crushed.

Copy link
Member

@ColCarroll ColCarroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! I'll have to look at the examples more, but seems fine to merge to me. I left some nitpicks that are probably better off as followup PRs.

return list.__contains__(self, item)

def __setitem__(self, key, value):
raise NotImplementedError('Not able to determine '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can exception message be more specific? Something like "cannot set item value for treelist (tried to set {key} = {value})"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot set value at all. It inserts a value by some key, but in parent list there can be something else. I'm not sure if we need that, but this will probably cause unexpected behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just meant the exception message could be more clear!

import theano.tensor as tt


def any_to_tensor_and_labels(x, labels=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any chance for breaking up this logic a bit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just saw awesome test suite, but still comment still holds :) )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like splitting out _input_to_array, _verify_labels, then you can rewrite this as

x = _input_to_array(x)
labels = _verify_labels(labels)
assert_shape_labels(x.shape, labels)
return x, labels

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels can come from x, so they go together

-------
(x, labels) - tensor and labels for it's columns
"""
def assert_shape_labels(s, l):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is only used once

raise ValueError('Cannot use scalars')
elif x.ndim == 1:
x = x[:, None]
# else: trust input
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would love an explicit

else:
    pass

here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and below)

@ColCarroll
Copy link
Member

Also, there are a few uses of "it's" (="it is") in the docs that should change to "its" (possessive).

@ColCarroll ColCarroll removed the WIP label Nov 27, 2016
@ferrine
Copy link
Member Author

ferrine commented Nov 28, 2016

BTW, gelato is comming. PyMC3 becomes really powerful

@twiecki twiecki changed the base branch from master to 3.1 November 28, 2016 04:37
@twiecki
Copy link
Member

twiecki commented Nov 28, 2016

Alright, I'll go ahead and merge this into the 3.1 branch. We'll merge it into master once we shipped 3.0. Thanks for the great contribution @ferrine.

@twiecki twiecki merged commit 55c8ce6 into pymc-devs:3.1 Nov 28, 2016
@twiecki
Copy link
Member

twiecki commented Nov 28, 2016

Just checking out Gelato, looks really interesting!

ColCarroll pushed a commit to ColCarroll/pymc3 that referenced this pull request Dec 2, 2016
* Started to write Base class for pymc3.models

* mode `add_var` to public api

* Added some docstrings

* Added some docstrings

* added getitem and fixed a typo

* added assertion check

* added resolve var method

* decided not to add resolve method

* Added linear component

* Docs fix

* patsy's intercept is inited properly now

* refactored code

* updated docs

* added possibility to init coefficients with random variables

* added glm

* refactored api, fixed formula init

* refactored linear model, extended acceptable types

* moved useful matrix and labels creation to utils file

* code style

* removed redundant evaluation of shape

* refactored resolver for constructing matrix and labels

* changed error message

* changed signature of init

* simplified utils any_to_tensor_and_labels code

* tests for `any_to_tensor_and_labels`

* added docstring for `any_to_tensor_and_labels` util

* forgot to document return type in `any_to_tensor_and_labels`

* refactored code for dict

* dict tests fix(do not check labels there)

* added access to random vars of model

* added a shortcut for all variables so there is a unified way to get them

* added default priors for linear model

* update docs for linear

* refactored UserModel api, made it more similar to pm.Model class

* Lots of refactoring, tests for base class, more plain api design

* deleted unused module variable

* fixed some typos in docstring

* Refactored pm.Model class, now it is ready for inheritance

* Added documentation for Model class

* Small typo in docstring

* nested contains for treedict (needed for add_random_variable)

* More accurate duplicate implementation of treedict/treelist

* refactored treedict/treelist

* changed `__imul__` of treelist

* added `root` property and `isroot` indicator for base model

* protect `parent` and `model` attributes from violation

* travis' python2 did not fail on bad syntax(maybe it's too new), fixed

* decided not to use functools wrapper

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented

* Added models package to setup script

* Refactor utils

* Fix some typos in pm.model
twiecki pushed a commit that referenced this pull request Dec 5, 2016
* Started to write Base class for pymc3.models

* mode `add_var` to public api

* Added some docstrings

* Added some docstrings

* added getitem and fixed a typo

* added assertion check

* added resolve var method

* decided not to add resolve method

* Added linear component

* Docs fix

* patsy's intercept is inited properly now

* refactored code

* updated docs

* added possibility to init coefficients with random variables

* added glm

* refactored api, fixed formula init

* refactored linear model, extended acceptable types

* moved useful matrix and labels creation to utils file

* code style

* removed redundant evaluation of shape

* refactored resolver for constructing matrix and labels

* changed error message

* changed signature of init

* simplified utils any_to_tensor_and_labels code

* tests for `any_to_tensor_and_labels`

* added docstring for `any_to_tensor_and_labels` util

* forgot to document return type in `any_to_tensor_and_labels`

* refactored code for dict

* dict tests fix(do not check labels there)

* added access to random vars of model

* added a shortcut for all variables so there is a unified way to get them

* added default priors for linear model

* update docs for linear

* refactored UserModel api, made it more similar to pm.Model class

* Lots of refactoring, tests for base class, more plain api design

* deleted unused module variable

* fixed some typos in docstring

* Refactored pm.Model class, now it is ready for inheritance

* Added documentation for Model class

* Small typo in docstring

* nested contains for treedict (needed for add_random_variable)

* More accurate duplicate implementation of treedict/treelist

* refactored treedict/treelist

* changed `__imul__` of treelist

* added `root` property and `isroot` indicator for base model

* protect `parent` and `model` attributes from violation

* travis' python2 did not fail on bad syntax(maybe it's too new), fixed

* decided not to use functools wrapper

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented

* Added models package to setup script

* Refactor utils

* Fix some typos in pm.model
twiecki pushed a commit that referenced this pull request Dec 5, 2016
* Started to write Base class for pymc3.models

* mode `add_var` to public api

* Added some docstrings

* Added some docstrings

* added getitem and fixed a typo

* added assertion check

* added resolve var method

* decided not to add resolve method

* Added linear component

* Docs fix

* patsy's intercept is inited properly now

* refactored code

* updated docs

* added possibility to init coefficients with random variables

* added glm

* refactored api, fixed formula init

* refactored linear model, extended acceptable types

* moved useful matrix and labels creation to utils file

* code style

* removed redundant evaluation of shape

* refactored resolver for constructing matrix and labels

* changed error message

* changed signature of init

* simplified utils any_to_tensor_and_labels code

* tests for `any_to_tensor_and_labels`

* added docstring for `any_to_tensor_and_labels` util

* forgot to document return type in `any_to_tensor_and_labels`

* refactored code for dict

* dict tests fix(do not check labels there)

* added access to random vars of model

* added a shortcut for all variables so there is a unified way to get them

* added default priors for linear model

* update docs for linear

* refactored UserModel api, made it more similar to pm.Model class

* Lots of refactoring, tests for base class, more plain api design

* deleted unused module variable

* fixed some typos in docstring

* Refactored pm.Model class, now it is ready for inheritance

* Added documentation for Model class

* Small typo in docstring

* nested contains for treedict (needed for add_random_variable)

* More accurate duplicate implementation of treedict/treelist

* refactored treedict/treelist

* changed `__imul__` of treelist

* added `root` property and `isroot` indicator for base model

* protect `parent` and `model` attributes from violation

* travis' python2 did not fail on bad syntax(maybe it's too new), fixed

* decided not to use functools wrapper

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented

* Added models package to setup script

* Refactor utils

* Fix some typos in pm.model
@ferrine ferrine deleted the user_model branch December 9, 2016 20:32
twiecki pushed a commit that referenced this pull request Jan 9, 2017
* Started to write Base class for pymc3.models

* mode `add_var` to public api

* Added some docstrings

* Added some docstrings

* added getitem and fixed a typo

* added assertion check

* added resolve var method

* decided not to add resolve method

* Added linear component

* Docs fix

* patsy's intercept is inited properly now

* refactored code

* updated docs

* added possibility to init coefficients with random variables

* added glm

* refactored api, fixed formula init

* refactored linear model, extended acceptable types

* moved useful matrix and labels creation to utils file

* code style

* removed redundant evaluation of shape

* refactored resolver for constructing matrix and labels

* changed error message

* changed signature of init

* simplified utils any_to_tensor_and_labels code

* tests for `any_to_tensor_and_labels`

* added docstring for `any_to_tensor_and_labels` util

* forgot to document return type in `any_to_tensor_and_labels`

* refactored code for dict

* dict tests fix(do not check labels there)

* added access to random vars of model

* added a shortcut for all variables so there is a unified way to get them

* added default priors for linear model

* update docs for linear

* refactored UserModel api, made it more similar to pm.Model class

* Lots of refactoring, tests for base class, more plain api design

* deleted unused module variable

* fixed some typos in docstring

* Refactored pm.Model class, now it is ready for inheritance

* Added documentation for Model class

* Small typo in docstring

* nested contains for treedict (needed for add_random_variable)

* More accurate duplicate implementation of treedict/treelist

* refactored treedict/treelist

* changed `__imul__` of treelist

* added `root` property and `isroot` indicator for base model

* protect `parent` and `model` attributes from violation

* travis' python2 did not fail on bad syntax(maybe it's too new), fixed

* decided not to use functools wrapper

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented

* Added models package to setup script

* Refactor utils

* Fix some typos in pm.model
twiecki pushed a commit that referenced this pull request Jan 9, 2017
* Started to write Base class for pymc3.models

* mode `add_var` to public api

* Added some docstrings

* Added some docstrings

* added getitem and fixed a typo

* added assertion check

* added resolve var method

* decided not to add resolve method

* Added linear component

* Docs fix

* patsy's intercept is inited properly now

* refactored code

* updated docs

* added possibility to init coefficients with random variables

* added glm

* refactored api, fixed formula init

* refactored linear model, extended acceptable types

* moved useful matrix and labels creation to utils file

* code style

* removed redundant evaluation of shape

* refactored resolver for constructing matrix and labels

* changed error message

* changed signature of init

* simplified utils any_to_tensor_and_labels code

* tests for `any_to_tensor_and_labels`

* added docstring for `any_to_tensor_and_labels` util

* forgot to document return type in `any_to_tensor_and_labels`

* refactored code for dict

* dict tests fix(do not check labels there)

* added access to random vars of model

* added a shortcut for all variables so there is a unified way to get them

* added default priors for linear model

* update docs for linear

* refactored UserModel api, made it more similar to pm.Model class

* Lots of refactoring, tests for base class, more plain api design

* deleted unused module variable

* fixed some typos in docstring

* Refactored pm.Model class, now it is ready for inheritance

* Added documentation for Model class

* Small typo in docstring

* nested contains for treedict (needed for add_random_variable)

* More accurate duplicate implementation of treedict/treelist

* refactored treedict/treelist

* changed `__imul__` of treelist

* added `root` property and `isroot` indicator for base model

* protect `parent` and `model` attributes from violation

* travis' python2 did not fail on bad syntax(maybe it's too new), fixed

* decided not to use functools wrapper

 Unfortunately functools wrapper fails
 when decorating built-in methods so I
 need to fix that improper behaviour.
 Some bad but needed tricks were implemented

* Added models package to setup script

* Refactor utils

* Fix some typos in pm.model
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants