Skip to content

GLM poisson #86

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
OriolAbril opened this issue Mar 30, 2021 · 15 comments
Open

GLM poisson #86

OriolAbril opened this issue Mar 30, 2021 · 15 comments
Labels
tracker id Issues used as trackers in the notebook update project, do not close!

Comments

@OriolAbril
Copy link
Member

OriolAbril commented Mar 30, 2021

File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/generalized_linear_models/GLM-poisson-regression.ipynb
Reviewers:

The sections below may still be pending. If so, the issue is still available, it simply doesn't
have specific guidance yet. Please refer to this overview of updates

Known changes needed

Changes listed in this section should all be done at some point in order to get this
notebook to a "Best Practices" state. However, these are probably not enough!
Make sure to thoroughly review the notebook and search for other updates.

General updates

  • Use numpy generator
  • ⚠️ code cells 15 and 19 are plain wrong, we are doing np.exp(np.mean()) instead of np.mean(np.exp()).

ArviZ related

Notes

Exotic dependencies

None

Computing requirements

Models sample in less than a minute

@OriolAbril OriolAbril added the tracker id Issues used as trackers in the notebook update project, do not close! label Mar 30, 2021
@jessicakzhang
Copy link

Hi! I'd like to try working on this.

@OriolAbril
Copy link
Member Author

That would be great @jessicakzhang! I have added a couple of suggestions based on a quick look over the notebook, I'll review more carefully once you submit a PR.

Let us know if you have any doubt while working on this

@chiral-carbon
Copy link
Collaborator

Hi @jessicakzhang , are you still working on this issue? @OriolAbril would it be okay if I were to submit a PR for this issue, considering the fact that this issue has already been assigned?

@OriolAbril
Copy link
Member Author

Hi, yes, as it has been more that two weeks with no activity, as indicated in the contributing guide, I'll assign the issue to you so you can submit a PR.

@chiral-carbon
Copy link
Collaborator

thanks a lot!

@chiral-carbon
Copy link
Collaborator

chiral-carbon commented May 5, 2021

@OriolAbril had a doubt regarding updating np.exp(np.mean()) to np.mean(np.exp()) in cells 15 and 19. as far as I understood, az.summary returns certain statistics of which mean is one, and we are computing np.exp() for this summary dataframe.
should this be changed to compute np.mean() of az.summary? I am unclear as to exactly where we should be applying np.exp() here.

also, while updating az.summary to use args rather than manually subsetting the dataframe, should we show all the default stats (mean, sd, hdi_3%, hdi_97%) for the variables or only a subset of mean, hdi_3%, hdi_97% as is being shown currently?

@OriolAbril
Copy link
Member Author

I used mean as a placeholder, summary is acting as mean (as well as acting as hdi). Exponentiating should come first, then calling summary on the exponentiated data, not the other way around.

I think the default stats is good enough and it's simple, there is no need to overly complicate the notebook only to exclude sd from summary.

@chiral-carbon
Copy link
Collaborator

chiral-carbon commented May 5, 2021

oh okay, thanks for clarifying.
so when I do the exponentiation on inf_fish which is an InferenceData object, I should first convert it to a data frame and exponentiate that data frame, and only then create a summary for it. have I understood this correctly? but az.summary takes an InferenceData object, so what would be the best way to exponentiate inf_fish here?

@OriolAbril
Copy link
Member Author

You should exponentiate the posterior samples, which are a group in inferencedata, in the form of an xarray dataset. It should look something like: az.summary(np.exp(idata.posterior), ...)

@chiral-carbon
Copy link
Collaborator

yes, that works, thanks a lot!

@chiral-carbon
Copy link
Collaborator

chiral-carbon commented May 5, 2021

in cell 19, I think there's a typing error. the code in cells 15 and 19 are identical and call a summary for the same variable inf_fish, which was generated using the manual model. cell 19 should be displaying the summary for the model results created with glm.from_formula and which are stored in the variable inf_fish_alt. can you confirm this?

also, should the data in the markdown cell after cell 15 be altered to match the new mean and hdi that we see in the summary? the values are only very slightly off
Screenshot from 2021-05-05 20-58-23

@OriolAbril
Copy link
Member Author

in cell 19, I think there's a typing error. the code in cells 15 and 19 are identical and call a summary for the same variable inf_fish, which was generated using the manual model. cell 19 should be displaying the summary for the model results created with glm.from_formula and which are stored in the variable inf_fish_alt. can you confirm this?

yes, it is definitely a typo.

also, should the data in the markdown cell after cell 15 be altered to match the new mean and hdi that we see in the summary?

I would update it to avoid confusing readers

@OriolAbril
Copy link
Member Author

Needs to be updated to use bambi instead of glm module

@chiral-carbon
Copy link
Collaborator

will be working on it @OriolAbril

@drbenvincent
Copy link
Contributor

I'm about to update this to v4

@twiecki twiecki moved this to v4 (auto) in Notebook state tracker Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tracker id Issues used as trackers in the notebook update project, do not close!
Projects
Status: v4 (auto)
Development

No branches or pull requests

4 participants