Skip to content

Updates to pymc.Hypergeometric docstrings #5488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 3, 2022
Merged

Updates to pymc.Hypergeometric docstrings #5488

merged 6 commits into from
Mar 3, 2022

Conversation

cuchoi
Copy link
Member

@cuchoi cuchoi commented Feb 18, 2022

As part of the work of #5459:

  • Added the bounds to the parameters N, k, and n in the parameters section of pymc.Hypergeometric.
  • Added docstrings for parameters "good" and "bad" in logp and logcdf. These parameters are different than the ones used in __init__, which is odd to me. They transform the N and k parameters to bad (which is N - k) and good (which is k), but this transformation could happen inside logp and logcdf. Would it make sense to always use N, k?

#DataUmbrellaPyMCSprint

@cuchoi cuchoi closed this Feb 18, 2022
@cuchoi cuchoi reopened this Feb 18, 2022
@cuchoi cuchoi closed this Feb 18, 2022
@cuchoi cuchoi changed the title Added bounds to the docstring of the Hypergeometric dist Updates to pymc.Hypergeometric docstrings Feb 18, 2022
@@ -972,11 +972,11 @@ class HyperGeometric(Discrete):
Parameters
----------
N : integer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are scalars the only valid type for all 3 arguments? cc @ricardoV94 ? (no idea who knows, feel free to tag someone else)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but maybe an array of integers is also supported? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logcdf does say:

        # logcdf can only handle scalar values at the moment
        if np.ndim(value):
            raise TypeError(
                f"HyperGeometric.logcdf expects a scalar value but received a {np.ndim(value)}-dimensional object."
            )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maybe logp supports it?

Copy link
Member Author

@cuchoi cuchoi Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I use a different format to specify the array types? Example integer, array_like[integer] or TensorVariable[integer]?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, numpydoc formatting is very clear in the array of type format

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to use tensor_like of integer and we'll add its definition on the glossary so we have a type that covers scalars, arrays, aesara tensorvariable and randomvariable

Copy link
Member Author

@cuchoi cuchoi Feb 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted! Picked integer, array_like or TensorVariable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review. We decided to go with tensor_like of integer which will get rendered as a link to our glossary where we have an explanation on what inputs are accepted for arguments documented like this: https://docs.pymc.io/en/latest/glossary.html#term-tensor_like

@OriolAbril
Copy link
Member

@cuchoi did you mean to close the PR? maybe open another not from main?

@cuchoi
Copy link
Member Author

cuchoi commented Feb 18, 2022

I closed the PR while adding more changes to it to avoid it being reviewed before adding them.

Should I change the PR to be from a branch of my forked repo?

@cuchoi cuchoi reopened this Feb 18, 2022
@cuchoi
Copy link
Member Author

cuchoi commented Feb 18, 2022

Added docstrings for parameters "good" and "bad" in logp and logcdf. These parameters are different than the ones used in init, which is odd to me. They transform the N and k parameters to bad (which is N - k) and good (which is k), but this transformation could happen inside logp and logcdf. Would it make sense to always use N, k?

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@OriolAbril OriolAbril merged commit a92a414 into pymc-devs:main Mar 3, 2022
Comment on lines +1007 to +1010
good : integer, array_like or TensorVariable
Number of successful individuals in the population. Alias for parameter :math:`k`.
bad : integer, array_like or TensorVariable
Number of unsuccessful individuals in the population. Alias for :math:`N-k`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also use "tensor_like of integer" here @OriolAbril ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are in the process of removing the distribution methods from the documentation and leaving only the dist one. The Distribution class is still being refactored, see #5308 for some more details (I don't really understand what is going on very well) and those methods might be removed completely, and because they should not be used themselves already but called via pm.logp() or pm.logcdf

I think it's best to leave them be for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants