Skip to content

GH-95861: Add support for Spearman's rank correlation coefficient #95863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 18, 2022

Conversation

rhettinger
Copy link
Contributor

@rhettinger rhettinger commented Aug 10, 2022

Still need to think about whether by_rank is the best name for the flag.

@rhettinger rhettinger added type-feature A feature request or enhancement stdlib Python modules in the Lib dir 3.12 only security fixes labels Aug 10, 2022
@stevendaprano
Copy link
Member

See my early comment on #95861

I think we should make _rank public, and remove the "by_rank" parameter to correlation, moving that functionality into a thin wrapper function spearman. Otherwise this is good work, thank you.

Copy link
Member

@stevendaprano stevendaprano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you categorically disagree with my suggestion to make rank public and move spearman into its own named function, I will accept this (you've done the hard work writing the code and I do like the feature). But I do strongly request that we keep to the design of separate named functions rather than overloading the one function with a flag.

If you categorically want to keep the single correlation function, then we could future-proof it and make the type of correlation more explicit with an enumerated flag rather than a bool:

def correlation(x, y, /, kind='pearson'):
    if kind == 'spearman':
        x = _rank(x)
        y = _rank(y)
   elif kind != 'pearson':
       raise ValueError('unsupported correlation kind')

If we ever add another sort of correlation (say, Kendall tau) is would be awkward to handle a second bool flag.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@rhettinger
Copy link
Contributor Author

Thank you for responding quickly and thoughtfully.

I added some comments to the issue to help us get to a good meeting of the minds on the API.

@rhettinger rhettinger changed the title GH-95861: Add support for Pearson's correlation coefficient GH-95861: Add support for Spearman's rank correlation coefficient Aug 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants