-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
User model #1525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User model #1525
Conversation
Do we really need compatible sklearn api? I think if well be a bit hard to make them friends |
@ferrine I don't think it's a priority to keep sklearn API. Can just be a starting point but if it's not helpful there's no reason to be beholden to it. |
Please review LinearComponent API and base class usage example |
Hmm, what's wrong with travis? |
init : dict - test_vals for coefficients | ||
rvars : dict - random variables instead of creating new ones | ||
""" | ||
def __init__(self, name, x, y, intercept=True, labels=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name
is only required if you have multiple LinearComponent
s, right? Not sure it should be a required arg then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
"""Shortcut to model""" | ||
return modelcontext(None) | ||
|
||
def new_var(self, name, dist, data=None, test_val=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see how this is a useful abstraction but it makes transcribing a PyMC3 model from the usual syntax to be a class a bit harder as it gives cognitive overhead. After all, the code below isn't doing so much that it warrants the indirection. Is there another way we can achieve that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, there is no easier way to rename lots of variables in the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought about a method like this for base class, but i'm not sure if it will be good for readability. Here it exists to avoid code duplication
Are you ready to accept this @twiecki. I like the general idea. |
Though I don't think model specific plots are a part of this PR. |
Not in WIP. It's done. |
Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented
Travis plays bad tricks, definitely. I relied on his tests but in my new projects environment (python 2.7.12) model crushed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good! I'll have to look at the examples more, but seems fine to merge to me. I left some nitpicks that are probably better off as followup PRs.
return list.__contains__(self, item) | ||
|
||
def __setitem__(self, key, value): | ||
raise NotImplementedError('Not able to determine ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can exception message be more specific? Something like "cannot set item value for treelist
(tried to set {key} = {value})"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot set value at all. It inserts a value by some key, but in parent list there can be something else. I'm not sure if we need that, but this will probably cause unexpected behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I just meant the exception message could be more clear!
import theano.tensor as tt | ||
|
||
|
||
def any_to_tensor_and_labels(x, labels=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any chance for breaking up this logic a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(just saw awesome test suite, but still comment still holds :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like splitting out _input_to_array
, _verify_labels
, then you can rewrite this as
x = _input_to_array(x)
labels = _verify_labels(labels)
assert_shape_labels(x.shape, labels)
return x, labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labels can come from x, so they go together
------- | ||
(x, labels) - tensor and labels for it's columns | ||
""" | ||
def assert_shape_labels(s, l): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this is only used once
raise ValueError('Cannot use scalars') | ||
elif x.ndim == 1: | ||
x = x[:, None] | ||
# else: trust input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would love an explicit
else:
pass
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and below)
Also, there are a few uses of "it's" (="it is") in the docs that should change to "its" (possessive). |
BTW, gelato is comming. PyMC3 becomes really powerful |
Alright, I'll go ahead and merge this into the |
Just checking out Gelato, looks really interesting! |
* Started to write Base class for pymc3.models * mode `add_var` to public api * Added some docstrings * Added some docstrings * added getitem and fixed a typo * added assertion check * added resolve var method * decided not to add resolve method * Added linear component * Docs fix * patsy's intercept is inited properly now * refactored code * updated docs * added possibility to init coefficients with random variables * added glm * refactored api, fixed formula init * refactored linear model, extended acceptable types * moved useful matrix and labels creation to utils file * code style * removed redundant evaluation of shape * refactored resolver for constructing matrix and labels * changed error message * changed signature of init * simplified utils any_to_tensor_and_labels code * tests for `any_to_tensor_and_labels` * added docstring for `any_to_tensor_and_labels` util * forgot to document return type in `any_to_tensor_and_labels` * refactored code for dict * dict tests fix(do not check labels there) * added access to random vars of model * added a shortcut for all variables so there is a unified way to get them * added default priors for linear model * update docs for linear * refactored UserModel api, made it more similar to pm.Model class * Lots of refactoring, tests for base class, more plain api design * deleted unused module variable * fixed some typos in docstring * Refactored pm.Model class, now it is ready for inheritance * Added documentation for Model class * Small typo in docstring * nested contains for treedict (needed for add_random_variable) * More accurate duplicate implementation of treedict/treelist * refactored treedict/treelist * changed `__imul__` of treelist * added `root` property and `isroot` indicator for base model * protect `parent` and `model` attributes from violation * travis' python2 did not fail on bad syntax(maybe it's too new), fixed * decided not to use functools wrapper Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented * Added models package to setup script * Refactor utils * Fix some typos in pm.model
* Started to write Base class for pymc3.models * mode `add_var` to public api * Added some docstrings * Added some docstrings * added getitem and fixed a typo * added assertion check * added resolve var method * decided not to add resolve method * Added linear component * Docs fix * patsy's intercept is inited properly now * refactored code * updated docs * added possibility to init coefficients with random variables * added glm * refactored api, fixed formula init * refactored linear model, extended acceptable types * moved useful matrix and labels creation to utils file * code style * removed redundant evaluation of shape * refactored resolver for constructing matrix and labels * changed error message * changed signature of init * simplified utils any_to_tensor_and_labels code * tests for `any_to_tensor_and_labels` * added docstring for `any_to_tensor_and_labels` util * forgot to document return type in `any_to_tensor_and_labels` * refactored code for dict * dict tests fix(do not check labels there) * added access to random vars of model * added a shortcut for all variables so there is a unified way to get them * added default priors for linear model * update docs for linear * refactored UserModel api, made it more similar to pm.Model class * Lots of refactoring, tests for base class, more plain api design * deleted unused module variable * fixed some typos in docstring * Refactored pm.Model class, now it is ready for inheritance * Added documentation for Model class * Small typo in docstring * nested contains for treedict (needed for add_random_variable) * More accurate duplicate implementation of treedict/treelist * refactored treedict/treelist * changed `__imul__` of treelist * added `root` property and `isroot` indicator for base model * protect `parent` and `model` attributes from violation * travis' python2 did not fail on bad syntax(maybe it's too new), fixed * decided not to use functools wrapper Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented * Added models package to setup script * Refactor utils * Fix some typos in pm.model
* Started to write Base class for pymc3.models * mode `add_var` to public api * Added some docstrings * Added some docstrings * added getitem and fixed a typo * added assertion check * added resolve var method * decided not to add resolve method * Added linear component * Docs fix * patsy's intercept is inited properly now * refactored code * updated docs * added possibility to init coefficients with random variables * added glm * refactored api, fixed formula init * refactored linear model, extended acceptable types * moved useful matrix and labels creation to utils file * code style * removed redundant evaluation of shape * refactored resolver for constructing matrix and labels * changed error message * changed signature of init * simplified utils any_to_tensor_and_labels code * tests for `any_to_tensor_and_labels` * added docstring for `any_to_tensor_and_labels` util * forgot to document return type in `any_to_tensor_and_labels` * refactored code for dict * dict tests fix(do not check labels there) * added access to random vars of model * added a shortcut for all variables so there is a unified way to get them * added default priors for linear model * update docs for linear * refactored UserModel api, made it more similar to pm.Model class * Lots of refactoring, tests for base class, more plain api design * deleted unused module variable * fixed some typos in docstring * Refactored pm.Model class, now it is ready for inheritance * Added documentation for Model class * Small typo in docstring * nested contains for treedict (needed for add_random_variable) * More accurate duplicate implementation of treedict/treelist * refactored treedict/treelist * changed `__imul__` of treelist * added `root` property and `isroot` indicator for base model * protect `parent` and `model` attributes from violation * travis' python2 did not fail on bad syntax(maybe it's too new), fixed * decided not to use functools wrapper Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented * Added models package to setup script * Refactor utils * Fix some typos in pm.model
* Started to write Base class for pymc3.models * mode `add_var` to public api * Added some docstrings * Added some docstrings * added getitem and fixed a typo * added assertion check * added resolve var method * decided not to add resolve method * Added linear component * Docs fix * patsy's intercept is inited properly now * refactored code * updated docs * added possibility to init coefficients with random variables * added glm * refactored api, fixed formula init * refactored linear model, extended acceptable types * moved useful matrix and labels creation to utils file * code style * removed redundant evaluation of shape * refactored resolver for constructing matrix and labels * changed error message * changed signature of init * simplified utils any_to_tensor_and_labels code * tests for `any_to_tensor_and_labels` * added docstring for `any_to_tensor_and_labels` util * forgot to document return type in `any_to_tensor_and_labels` * refactored code for dict * dict tests fix(do not check labels there) * added access to random vars of model * added a shortcut for all variables so there is a unified way to get them * added default priors for linear model * update docs for linear * refactored UserModel api, made it more similar to pm.Model class * Lots of refactoring, tests for base class, more plain api design * deleted unused module variable * fixed some typos in docstring * Refactored pm.Model class, now it is ready for inheritance * Added documentation for Model class * Small typo in docstring * nested contains for treedict (needed for add_random_variable) * More accurate duplicate implementation of treedict/treelist * refactored treedict/treelist * changed `__imul__` of treelist * added `root` property and `isroot` indicator for base model * protect `parent` and `model` attributes from violation * travis' python2 did not fail on bad syntax(maybe it's too new), fixed * decided not to use functools wrapper Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented * Added models package to setup script * Refactor utils * Fix some typos in pm.model
* Started to write Base class for pymc3.models * mode `add_var` to public api * Added some docstrings * Added some docstrings * added getitem and fixed a typo * added assertion check * added resolve var method * decided not to add resolve method * Added linear component * Docs fix * patsy's intercept is inited properly now * refactored code * updated docs * added possibility to init coefficients with random variables * added glm * refactored api, fixed formula init * refactored linear model, extended acceptable types * moved useful matrix and labels creation to utils file * code style * removed redundant evaluation of shape * refactored resolver for constructing matrix and labels * changed error message * changed signature of init * simplified utils any_to_tensor_and_labels code * tests for `any_to_tensor_and_labels` * added docstring for `any_to_tensor_and_labels` util * forgot to document return type in `any_to_tensor_and_labels` * refactored code for dict * dict tests fix(do not check labels there) * added access to random vars of model * added a shortcut for all variables so there is a unified way to get them * added default priors for linear model * update docs for linear * refactored UserModel api, made it more similar to pm.Model class * Lots of refactoring, tests for base class, more plain api design * deleted unused module variable * fixed some typos in docstring * Refactored pm.Model class, now it is ready for inheritance * Added documentation for Model class * Small typo in docstring * nested contains for treedict (needed for add_random_variable) * More accurate duplicate implementation of treedict/treelist * refactored treedict/treelist * changed `__imul__` of treelist * added `root` property and `isroot` indicator for base model * protect `parent` and `model` attributes from violation * travis' python2 did not fail on bad syntax(maybe it's too new), fixed * decided not to use functools wrapper Unfortunately functools wrapper fails when decorating built-in methods so I need to fix that improper behaviour. Some bad but needed tricks were implemented * Added models package to setup script * Refactor utils * Fix some typos in pm.model
I started preparation for implementing GLM model and I think good base class is necessary for that purpose. Also refer to #1524 and #1517.