Show repository name, in the GitHub statistics section #10917

Closed
pradyunsg opened this issue Mar 11, 2022 · 13 comments · Fixed by #16532

Comments

@pradyunsg
Contributor

What's the problem this feature will solve?
Currently, it's not possible to discover the name of the GitHub repository associated with a project's metadata directly from the PyPI page. It is also occasionally not immediately clear which of the URLs associated with a project is a GitHub URL.

This makes it possible for projects to misrepresent what their GitHub statistics are, by using a more popular Python project's URL in their metadata.

Describe the solution you'd like
Change "GitHub statistics:" to instead be "GitHub: {repo-name}". So, for pip, that would be "GitHub: pypa/pip".

This would:

  • provide an "obvious thing" to make into a link to the GitHub repository.
  • surface this information to help identify projects that are misrepresenting themselves (intentionally or otherwise).
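
As a rough illustration of the proposed label, the following Python sketch derives "GitHub: {repo-name}" from a repository URL. The function names `github_repo_name` and `sidebar_heading` are hypothetical and are not taken from Warehouse; this only shows the string handling under the assumption that the URL path starts with owner and repository.

```python
from urllib.parse import urlparse


def github_repo_name(url):
    """Return "owner/repo" for a github.com URL, or None if it is not one."""
    parsed = urlparse(url)
    if parsed.hostname not in ("github.com", "www.github.com"):
        return None
    # Keep only non-empty path segments: /pypa/pip/issues -> ["pypa", "pip", "issues"]
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{owner}/{repo}"


def sidebar_heading(url):
    """Build the heading text proposed in this issue (hypothetical helper)."""
    name = github_repo_name(url)
    return f"GitHub: {name}" if name else None
```

With this sketch, `sidebar_heading("https://github.com/pypa/pip")` gives the "GitHub: pypa/pip" text used as the example above.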

Additional context
See https://discuss.python.org/t/14181/, where this was originally proposed.

@merwok
Contributor

merwok commented Mar 11, 2022

PR opened: #10925

I chose to keep the heading and add a line for repo name with link:
(screenshot: pypi-stats)

(ignore the partial translation, I’ll make another PR to address problems in the French translations, unless that tooling is obsolete and only Weblate should be used)

@miketheman
Member

Isn't this an existing, opt-in behavior?

Poetry allows a package publisher to set a repository key in their pyproject.toml with a URL: https://python-poetry.org/docs/pyproject/#repository which already shows up as a Repository link in the PyPI sidebar.

Example: https://github.com/miketheman/pytest-socket/blob/ee1377a020e9c68a9e73ba8e696dc57587653ef3/pyproject.toml#L9 - view sidebar on pypi: https://pypi.org/project/pytest-socket/

@miketheman
Member

This isn't a new feature, either - setuptools allows publishers to set the project_urls keyword:

An arbitrary map of URL names to hyperlinks, allowing more extensible documentation of where various resources can be found than the simple url and download_url options provide.

A concrete example in setuptools setup.cfg: https://github.com/pypa/setuptools/blob/fc72349c9397f2f03e971d8ebc3e4928957180f1/setup.cfg#L20-L21
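
For readers who use setup.py rather than setup.cfg, a minimal sketch of the same idea follows. The package name, version, and URLs are placeholders invented for illustration; only the `project_urls` keyword itself comes from setuptools.

```python
# Hypothetical metadata for an imaginary package; every value is a placeholder.
SETUP_KWARGS = dict(
    name="example-package",
    version="0.1.0",
    # Arbitrary labels mapped to URLs; the labels are free text.
    project_urls={
        "Documentation": "https://example-package.readthedocs.io/",
        "Source Code": "https://github.com/example-org/example-package",
    },
)

# In a real setup.py these keywords would be passed on to setuptools:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

The labels chosen here ("Documentation", "Source Code") are just examples; any label can be used.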

@pradyunsg
Contributor Author

pradyunsg commented Mar 11, 2022

@miketheman Read the first post again please and see the discussion linked to.

@miketheman
Member

I did, and re-read the thread on Discourse.

I guess I'm not clear how this solves anything other than adding a little more clarity that someone hovering over a link or clicking would readily uncover. Does the warehouse boost search results based on GitHub statistics? (I don't think so, and hope not).

From reading the thread, the original poster had already found and addressed the question, and the maintainer responded (sure, not in a way they wanted, but hey, that's not a requirement).

If anything, I'd opt to redact/remove GitHub statistics from the sidebar so as to remove the potential abuse until it can be governed effectively. But again, that's not the issue the thread poster had - they wanted to find the source in GitHub, and that's not required for PyPI, as far as I can tell.

@merwok
Contributor

merwok commented Mar 14, 2022

The issue solved is exactly the lack of clarity of the GitHub section!

I often find it weird/annoying that I have to use the issues or stars link to go to GitHub and then navigate to the repository main page (there is often no Repository link in the project URLs section). Hover info is not good enough, because people like to see info before they decide to move the pointer to do an action + there isn’t always a mouse to move.

Even though the original post that caused this is now solved (by a direct request on the repo that is linked and does not have source code), this is a little improvement on its own.

(BTW Pradyun, there is no similar extraction for GitLab repos)

@pradyunsg changed the title from "Show repository name, in the GitHub/GitLab statistics section" to "Show repository name, in the GitHub statistics section" on Mar 14, 2022
@pradyunsg
Contributor Author

Issue title duly amended. :)

@di
Member

di commented Mar 14, 2022

I think I agree with @miketheman, I'm not seeing how this resolves the problem listed in https://discuss.python.org/t/14181/.

I'm also slightly concerned with how displaying the GitHub user/org could be confused with the PyPI user/org (ref #201)

Also not sure I understand this:

I often find it weird/annoying that I have to use the issues or stars link to go to github then have to navigate to the repository main page (there is often no Repository link in the project URLs section).

PyPI gets the GitHub repo name from the project URLs section, so there should always be a "Source Code" link in the project urls section for any project that also has GitHub statistics listed. If this isn't the case, it's a bug that we should fix instead.
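
To make the claim concrete: if the repository is taken from the project URLs, then at least one entry in that map must point at github.com, whatever its label. The sketch below is an assumption about how such a lookup could work, not Warehouse's actual code; `github_links` is a hypothetical name and the sample URLs are placeholders.

```python
from urllib.parse import urlparse


def github_links(project_urls):
    """Return the {label: url} entries whose host is github.com."""
    return {
        label: url
        for label, url in project_urls.items()
        if urlparse(url).hostname == "github.com"
    }


# Placeholder metadata: the GitHub link sits under a free-text label.
example_urls = {
    "Homepage": "https://example-package.readthedocs.io/",
    "Repository": "https://github.com/example-org/example-package",
}
matches = github_links(example_urls)
```

Here `matches` contains only the "Repository" entry. A project with GitHub statistics but an empty result from a lookup like this would be the bug described above.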

@merwok
Contributor

merwok commented Mar 14, 2022

About the confusion: in my PR, we would see «Github Stats» heading followed by the word «Repository:», so I hope that would avoid confusion with a PyPI project.

About the link: The link could be Source code or Home page or Repository or any free text label.
I guess I’ve been lazy and used the GitHub section links as shortcuts to get to the repo for sure! So the weirdness/annoyance was of my own making 🙃

@pradyunsg
Contributor Author

I'm also slightly concerned with how displaying the GitHub user/org could be confused with the PyPI user/org (ref #201)

It would clearly show up under the GitHub section and won't be a part of the package name being presented. I don't imagine that we'd put the PyPI organisation in the sidebar either, so... this shouldn't be that bad.

I'm not seeing how this resolves the problem listed in discuss.python.org/t/14181.

Resolving that isn't the intended goal. The goal is to make it clearer which GitHub repository we're pulling the statistics from, and to make it more consistent to get to that repository -- that helps with identifying situations like that. It doesn't do anything to prevent issues like that, which I don't think anyone has ideas for anyway.

I guess I’ve been lazy and used the github section links as shortcuts to get to the repo for sure! So the weirdness/annoyance was of my own making 🙃

Well, I can tell you that you're not the only one who does this. :)

@merwok
Contributor

merwok commented Sep 19, 2022

Can we get a direction for this? Otherwise I’ll close the PR.

@jayaddison
Contributor

(I arrived here after searching for 'GitLab', because I was curious about support for other statistics providers in the PyPI frontend. I'm not a lawyer)

There are currently many cases where it's non-trivial to determine whether a packaged release corresponds to a source code repository. It's generally (although not always) possible by downloading/cloning them both and comparing the contents. I think that improvements on that are possible, but that seems off-topic here.

There's also a question of trust in published statistics - are they genuine? Why does one apparently-authoritative BosqueLanguage repository seem so much more popular than another?

For people scanning the displayed package detail page visually, it does seem like adding the name of the statistics source repository could save some verification time -- or, to consider it another way: it could make it easier for some people to spot discrepancies.

It does also seem possible that there could be some confusion with PyPI's own upcoming representation of organizations (and projects within those). What are the possible confusion cases, and what would the outcomes of those be?

@jayaddison
Contributor

If anything, I'd opt to redact/remove GitHub statistics from the sidebar so as to remove the potential abuse until it can be governed effectively. But again, that's not the issue the thread poster had - they wanted to find the source in GitHub, and that's not required for PyPi, as far as I can tell.

Despite being eager to see additional support for statistics providers, after more thought I'm leaning towards removal of the stats altogether, too.

Rationales:

  • GitLab looks like a potential second-statistics-provider candidate, but as far as I'm aware, they already do a few things to load-shed traffic (keyword searches aren't possible without authentication, for example) -- so adding more inbound unauthenticated requests might not make them happy
  • The traffic from users' own browsers to api.github.com is kinda hidden -- it's visible in the source JavaScript, but it's not super transparent (unlike, for example, EthicalAds where the content's origin is displayed for users to read alongside the ad positions)
  • Links to project URLs are already displayed elsewhere on the package detail page
  • Removing the statistics would reduce the chance for (and incentive for) user confusion about package popularity
  • It could arguably be worthwhile for users to discover project statistics of their own accord separately
  • It reduces PyPI's risk of being accused of placing content that 'belongs' to another entity alongside other content (as far as I'm aware, there's no real trademark issue here, because these are numbers and statistics -- but a) I'm not a lawyer, and b) just because there's no legal requirement doesn't mean it's morally fine not to do anything)
  • It would allow simplifying the site's content security policy

Those are possible benefits that I perceive from removing the stats indicator. No doubt there are downsides too, though (not least: one more click -- although arguably the same number of HTTP requests -- required by users to view project stats).
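
For context on the api.github.com traffic mentioned in the list, the request involved is roughly a lookup of the repository resource in GitHub's REST API. The Python sketch below is only an approximation of that lookup; the PyPI page itself does this in browser JavaScript, and the function names here are hypothetical. It assumes the `stargazers_count`, `forks_count`, and `open_issues_count` fields of the repository response.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def repo_api_url(repo_name):
    """Build the REST API URL for a repository given as "owner/repo"."""
    return f"{API_ROOT}/repos/{repo_name}"


def extract_stats(payload):
    """Pick the sidebar-style numbers out of a repository API response."""
    return {
        "stars": payload["stargazers_count"],
        "forks": payload["forks_count"],
        # GitHub counts open pull requests in this field as well.
        "open_issues": payload["open_issues_count"],
    }


def fetch_stats(repo_name):
    """Unauthenticated request; subject to GitHub's rate limits."""
    request = Request(
        repo_api_url(repo_name),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(request) as response:
        return extract_stats(json.load(response))
```

Calling `fetch_stats("pypa/pip")` would perform one unauthenticated HTTP request, which is the kind of per-visitor traffic the second rationale refers to.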


5 participants