Skip to content

torchhub tests are failing on CI #3664

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
fmassa opened this issue Apr 13, 2021 · 11 comments
Closed

torchhub tests are failing on CI #3664

fmassa opened this issue Apr 13, 2021 · 11 comments

Comments

@fmassa
Copy link
Member

fmassa commented Apr 13, 2021

🐛 Bug

For a few days already, torchhub tests have been failing with the following error message (as can be seen from https://app.circleci.com/pipelines/github/pytorch/vision/7408/workflows/5e29e045-7878-489f-8ff7-8bfe18d47e2a/jobs/511637)

ValueError: Cannot find master in https://github.com/pytorch/vision. If it's a commit from a forked repo, please call hub.load() with forked repo directly

We should fix this, this probably happens due to pytorch/pytorch#54451

cc @seemethere @ailzhang

@ailzhang
Copy link
Contributor

Hmmm I can take a look later today, it's running fine in pytorch CI tho. Is it happening in vision CI?

@ailzhang ailzhang self-assigned this Apr 13, 2021
@fmassa
Copy link
Member Author

fmassa commented Apr 14, 2021

Yes, this is happening on all torchvision CIs, see #3662 for a recent example

@willclarktech
Copy link

willclarktech commented Apr 14, 2021

Looks like the GitHub API is limiting results per page to 30 by default, so master doesn't show up here. Compare

@ailzhang
Copy link
Contributor

@willclarktech @fmassa Yea my bad, was only testing the API on a small repo so didn't notice the number limit. I'll revert the PR first and work on a better fix! @willclarktech Good catch on missing tags on the original PR! I need to read github API more carefull to find a better fit :D
pytorch/pytorch#56048

@willclarktech
Copy link

@ailzhang This preview API seems to do exactly the check you want for commit hashes (although I'm not sure why it's important to restrict commits to branch heads): https://docs.github.com/en/rest/reference/repos#list-branches-for-head-commit. It doesn't cover tags obviously, but those don't introduce the same danger as commits.

@ailzhang
Copy link
Contributor

ailzhang commented Apr 16, 2021

@ailzhang This preview API seems to do exactly the check you want for commit hashes (although I'm not sure why it's important to restrict commits to branch heads): docs.github.com/en/rest/reference/repos#list-branches-for-head-commit. It doesn't cover tags obviously, but those don't introduce the same danger as commits.

@willclarktech this API looks great! although it doesn't handle branch_name/tag_name where in those cases we don't have a commit_hash. Currently I'm just thinking about fixing it in the old way that fetching all the branches and tags and then check whether dst is listed in them. Wdyt?

Btw restricting commits to branch heads is not a hard technical requirement, we just feel that it's easier for repo owner to manage if only a smaller set of commits are available to users. We can totally relax that if there's a request in the future.

@willclarktech
Copy link

@ailzhang Fetching all branches and tags seems like the most straightforward option if you want to restrict commit hashes to branch heads (which you can get from the branches request). It definitely prevents commit hashes from forked repos, which is good. The restriction to branch heads seems odd to me: if I specified a commit to load I'd be surprised eg if my scripts all broke because somebody in that repo pushed to the branch.

Git itself has functionality to check which (if any) branches a commit belongs to on a remote (whether at the branch head or not), but I didn't see a GitHub API endpoint that supports this.

@ailzhang
Copy link
Contributor

@willclarktech Re commit hash: we just feel that it's easier for repo owner to manage if only a smaller set of branches/tags are available to users. Actually branch_name/tag_name are used more frequently than commit hash in practice(which won't break if somebody pushed to the branch). We can totally relax that if there's a request in the future.

@ailzhang
Copy link
Contributor

@willclarktech Another reason that we prefer branch/tag head is that:
if user load from a commit hash and the model doesn't work, we don't expect the repo owner to fix it unless the commit hash is a head for any branch/tag (in other words we only expect to head work well but not a random commit in the branch to give some room for repo owners to commit breakages if necessary).

@willclarktech
Copy link

@ailzhang I guess I find the reasoning confusing because in other contexts (eg installing a package via GitHub in poetry/npm/whatever) the use cases are typically something like this:

  • tag: I'm a regular user and I want a stable version
  • branch: I'm a developer working on the code or a power user accessing an unreleased/alternative feature set
  • commit hash: I'm probably a user who has hit up against a bug in a tagged release. I need a temporary fix now, but will switch back to a tag as soon as the fix has been released. In this case I'm aware that there won't be much support from the code maintainers for problems that exist at that commit, but I do expect the same script to work the same way when run at different times because the commit hash is intended to pin the functionality to a specific point in the code's history.

I don't have a specific use case here where I need a commit hash that's not at a branch head, just trying to give you some data on developer expectations - hope it's helpful.

@datumbox
Copy link
Contributor

Closing the issue as this is now resolved on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants