More backports #29

Closed
jayvdb opened this issue Jan 11, 2020 · 3 comments

Comments


jayvdb commented Jan 11, 2020

The list of backports can be partially obtained using pypi/stdlib-list#28

Using that, I found the following are missing from the EXCLUDED_PACKAGES list.

pprint, resource and ast should be excluded IMO, as I am sure their presence among the top packages is due only to sharing a name with a stdlib module. I think they should also be delisted from PyPI, and the remaining stdlib names should be blocked from use (or require an extra permission to control who can use them) to prevent malicious uploads.

DateTime would probably not be on the list if it wasn't for the stdlib name clash. Oddly, the 'used by' count on https://github.com/zopefoundation/DateTime is quite high (7.2k), so I wonder if the GitHub stats are also skewed. The GitHub stats could be correct, though, as this is Zope, and IMO it isn't worth delisting it from this project for that reason. Perhaps intentionally move it to the end of the list instead: its true relevance based on download count is highly suspect, so its appropriate position in the list is not knowable.

fwiw, the other stdlib names in the top list all appear to be 'safe'.
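
The name-clash check described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used for the analysis: `find_stdlib_clashes` is a hypothetical helper, and the two input collections are small hand-written samples. In practice the stdlib names might come from a source such as stdlib-list or `sys.stdlib_module_names` (Python 3.10+). The comparison ignores case, since PyPI names like DateTime and AST differ from the module names only in capitalisation.

```python
def find_stdlib_clashes(package_names, stdlib_names):
    """Return the package names that match a stdlib module name, ignoring case.

    The order of ``package_names`` is preserved in the result.
    """
    stdlib_lower = {name.lower() for name in stdlib_names}
    return [pkg for pkg in package_names if pkg.lower() in stdlib_lower]


# Illustrative inputs only: a few names from this thread plus two
# packages that do not clash with the stdlib.
top_packages = ["requests", "DateTime", "logging", "six", "AST", "Resource"]
stdlib_sample = {"datetime", "logging", "ast", "resource", "pprint"}

clashes = find_stdlib_clashes(top_packages, stdlib_sample)
print(clashes)  # ['DateTime', 'logging', 'AST', 'Resource']
```

A clash only shows that the name is shared; whether the package is a genuine backport still has to be checked by hand.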

Owner

hugovk commented Jan 11, 2020

Thanks for this!

I've only been adding things to the exclusion list when I've seen them show up as one of the 360 here. The most downloaded of these (DateTime) is currently at 632 on Top PyPI Packages, and the next one (logging) is at 791, so at least they're unlikely to be relevant for a while.

6396758 - DateTime
4527726 - logging
4262268 - statistics
4177601 - dataclasses
3097987 - asyncio
2741560 - enum
2070490 - uuid
1458974 - importlib
1291645 - contextvars
1194645 - Resource
905303 - functools
842383 - faulthandler
720260 - pprint
699354 - html
405650 - AST
374841 - readline
331224 - multiprocessing
269055 - email
231243 - wsgiref
230497 - hashlib

I'm not aiming for a definitive list of exclusions, only those that appear in the 360, so this is useful for future reference. If you want, feel free to create a PR to exclude some that are definitely backports, although if they may never show up here I wouldn't put too much effort into that. Thanks again!
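
To illustrate why a package ranked at 632 does not matter yet, here is a rough sketch of taking the first N names from a ranked list while skipping an exclusion list. It is an assumption about the general approach, not this project's actual code: `top_n_excluding` and the sample data are hypothetical, and the real EXCLUDED_PACKAGES contents are not reproduced here.

```python
def top_n_excluding(ranked_packages, excluded, n=360):
    """Return the first ``n`` package names that are not excluded.

    ``ranked_packages`` is assumed to be sorted by download count,
    most downloaded first. Names are compared case-insensitively.
    """
    excluded_lower = {name.lower() for name in excluded}
    kept = [name for name in ranked_packages if name.lower() not in excluded_lower]
    return kept[:n]


# Hypothetical ranking and exclusion set, for illustration only.
ranked = ["urllib3", "dataclasses", "six", "contextvars", "requests"]
excluded_sample = {"dataclasses", "contextvars"}

shortlist = top_n_excluding(ranked, excluded_sample, n=3)
print(shortlist)  # ['urllib3', 'six', 'requests']
```

An excluded name only changes the result if it would otherwise rank high enough to reach the cut-off, which is why names far down the ranking can be left off the exclusion list for now.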

Author

jayvdb commented Jan 11, 2020

Ah, I am using the full list for my current analysis. The problem is still relevant to https://github.com/hugovk/top-pypi-packages I guess.

For completeness, here are some other packages using stdlib names that look questionable / not backports, but I need to recheck them:

  • chunk
  • dis
  • formatter
  • mailbox
  • modulefinder
  • secrets
  • token
  • turtle
  • wave

The following have URLs but no artifacts on PyPI:

  • numbers
  • calendar
  • trace
  • signal
  • shelve
  • select

numbers, select and signal point to GitHub repos which are 404

select was renamed to https://github.com/Jaymon/que

time has no URLs or artifacts
trace links to http://billionuploads.com/ka79h2t4jpi1 which looks dodgy but is 404 atm
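
A check for "URLs but no artifacts" could be sketched like this. It assumes project metadata shaped like the PyPI JSON API response (`https://pypi.org/pypi/<name>/json`), with an `info` object and a `releases` mapping of version to uploaded files. The helper names are hypothetical, and the sample dictionaries are made up so the example runs without network access.

```python
def has_artifacts(metadata):
    """Return True if any release in the metadata lists at least one file."""
    releases = metadata.get("releases") or {}
    return any(files for files in releases.values())


def project_urls(metadata):
    """Collect the URLs a project advertises, keyed by label."""
    info = metadata.get("info") or {}
    # project_urls may be missing or null in the metadata.
    urls = dict(info.get("project_urls") or {})
    home_page = info.get("home_page")
    if home_page:
        urls.setdefault("Homepage", home_page)
    return urls


# Made-up metadata for a name that is registered but has no uploads.
placeholder = {
    "info": {"home_page": "https://example.invalid/repo", "project_urls": None},
    "releases": {},
}
# Made-up metadata for a project with one uploaded file.
published = {
    "info": {"home_page": "", "project_urls": None},
    "releases": {"1.0": [{"filename": "demo-1.0.tar.gz"}]},
}

print(has_artifacts(placeholder), project_urls(placeholder))
print(has_artifacts(published), project_urls(published))
```

Whether an advertised URL still resolves (for example, a repo that now returns 404) would need a separate HTTP request, which is left out of this sketch.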

Owner

hugovk commented Jan 30, 2020

I'm keeping the list at https://github.com/hugovk/top-pypi-packages vanilla with no changes at all, so people can use it as they wish. For example, maybe to find out how popular a backport is to decide whether to continue maintaining it. But thank you!
