Skip to content

Add TorchVision wheel builds for M1 #5948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 26, 2022
Merged

Add TorchVision wheel builds for M1 #5948

merged 1 commit into from
May 26, 2022

Conversation

malfet
Copy link
Contributor

@malfet malfet commented May 5, 2022

No description provided.

@malfet malfet force-pushed the malfet/build-on-m1 branch 3 times, most recently from fe864ec to 88a92b4 Compare May 5, 2022 01:50
@malfet malfet force-pushed the malfet/build-on-m1 branch 4 times, most recently from ed138b9 to f602d6a Compare May 6, 2022 01:47
@malfet malfet changed the title [WIP] Add Build-on-M1 workflow Add TorchVision wheel build for M1 May 19, 2022
@malfet malfet changed the title Add TorchVision wheel build for M1 Add TorchVision wheel builds for M1 May 20, 2022
@malfet malfet force-pushed the malfet/build-on-m1 branch from a2cd1ef to 911c38c Compare May 20, 2022 21:28
@NicolasHug
Copy link
Member

@malfet Are we waiting for anything specific before we can merge this?

@datumbox
Copy link
Contributor

datumbox commented May 23, 2022

@malfet The linter is complaining for a trailing new line that is missing and there is a segmentation fault on binary_macos_conda_py3.10_cpu.

Also, there are some indications (see #6067, #5893, #5413) that TorchVision might not work as expected on M1 hardware. I think it's worth adding in this PR the CI jobs for unit-tests to be able to debug it and see if it's easy to fix. If not we would need to decide whether TorchVision should release M1 binaries this release or if they are broken and we can't fix them in time, bump it in terms of priority for the next one.

@malfet malfet mentioned this pull request May 26, 2022
@malfet malfet requested a review from atalman May 26, 2022 22:22
@malfet malfet force-pushed the malfet/build-on-m1 branch 2 times, most recently from 1d5752f to 2b8a72f Compare May 26, 2022 23:09
@malfet malfet force-pushed the malfet/build-on-m1 branch from 2b8a72f to c91ad06 Compare May 26, 2022 23:10
jobs:
build_wheels:
name: "Build TorchVision M1 wheels"
runs-on: macos-m1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have only 1 macos-m1 worker currently. I will change the rest of our workers to adhere to the same macos-m1 label

Copy link
Contributor

@atalman atalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@malfet malfet merged commit d592925 into main May 26, 2022
@malfet malfet deleted the malfet/build-on-m1 branch May 26, 2022 23:41
@github-actions
Copy link

Hey @malfet!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@NicolasHug
Copy link
Member

@malfet @atalman

as @datumbox noted above #5948 (comment), we're not certain that torchvision properly works on M1. In order to avoid additional bug reports from users, are you planning to add an M1 CI so we're sure the binaries are working properly?

@malfet
Copy link
Contributor Author

malfet commented May 30, 2022

In order to avoid additional bug reports from users, are you planning to add an M1 CI so we're sure the binaries are working properly?

@NicolasHug we don't have a capacity to run PR CI on M1 right now (there are 4 machines in total for the entire org), but if you can pick a few smoketests we can try running them.

@datumbox
Copy link
Contributor

@malfet Our users are currently reporting that multiple parts of TorchVision are not working on M1: #6067, #5893, #5413

If that's true, it will take more than a couple of smoketests to debug, fix and ensure it works smoothly. If we currently don't have the resources for adding proper support for M1 and testing them across PyTorch, we should revert this PR and consider adding it back once we have the necessary capacity. Otherwise, after release we will have a huge volume of users, reporting that TorchVision is broken on M1 with maintainers unable to know what to do next and how to address this. We are happy to discuss a middle ground solution that allows us to run our tests in one configuration on M1. Please let me know.

@malfet
Copy link
Contributor Author

malfet commented May 30, 2022

@datumbox thank you for sharing. All those sound like a basic packaging issues to me, so I'm not sure how CI testing (where system is configured to have all above mentioned dependencies will help) And even if PR CI would be enabled, those machines would not have rerun with SSH capability.

But we should definitely test for the conditions mentioned in above during packaging.

facebook-github-bot pushed a commit to pytorch/audio that referenced this pull request May 31, 2022
Summary:
This PR adds M1 wheel builds for torchaudio
Based on this PR: pytorch/vision#5948
And this Builder [script](https://github.com/pytorch/builder/blob/main/build_m1_domains.sh)

Pull Request resolved: #2421

Reviewed By: mthrok

Differential Revision: D36767469

Pulled By: atalman

fbshipit-source-id: 9fc3b74b50ee669a230302fd27682702f83f63dc
@datumbox
Copy link
Contributor

I had an offline discussion with @malfet concerning this. I'll summarize here what was discussed:

Our team doesn't have an M1 mac locally but even if we did, we still need to test each PR prior merging on at least one M1/Python configuration. Nikita agreed that since TorchVision is the first to be onboarded on native M1, we can do that now. Nevertheless because PyTorch has only 4 machines, it is not guaranteed that the job will be maintained. To avoid having the job removed, TorchVision should dedicate time to speed up CI execution for its tests.

I will therefore use the approach on this PR to add a unit-test job for a single python configuration. See #6111

YosuaMichael pushed a commit to YosuaMichael/vision that referenced this pull request May 31, 2022
@datumbox datumbox mentioned this pull request May 31, 2022
facebook-github-bot pushed a commit that referenced this pull request Jun 1, 2022
Reviewed By: NicolasHug

Differential Revision: D36760940

fbshipit-source-id: 9fc5198521097148843bae042102b10c38861c7c
atalman pushed a commit to atalman/vision that referenced this pull request Jun 14, 2022
NicolasHug added a commit that referenced this pull request Jun 15, 2022
* Add M1 wheels binary builds (#5948)

* [M1] Set build version and delocate wheels (#6110)

This would package libpng and libjpeg.dylib into wheel files
Add a very simple test step, copied from https://github.com/pytorch/pytorch.github.io/blob/1eaa33a3d3f1b83b64c5031c6dd04dbb238d6105/scripts/test_install.py#L78
Cherry-picked from https://github.com/pytorch/builder/blob/d0bc74cc363a9da5a8b6a40e883d40d25d050036/build_m1_domains.sh#L22

* [BE] Unify version computation (#6117)

* [BE] Unify version computation

Instead of hardcoding dev version in various script, use one from
`version.txt` if `setup_build_version` is called without arguments

Also, pass `--pre` option to M1 build/test pip install commands to build
TorchVision against nightly pytorch

* Pin torchvision dependency to a specific pytorch version

* [M1] Install "jpeg<=9b" rather than OpenJpeg (#6122)

Explicitly set PATH to point to `conda` binary, otherwise libjpeg detection logic does not work
Pin libjpeg to the same version on x86 and m1
Add simple tests that jpeg can be decoded by a generated wheel

* Add unit-tests for M1 (#6132)

* Add M1 testing job

* libjpeg -> jpeg<=9b in test-m1.yml

* Added export PATH=~/miniconda3/bin... from 6122

* Tests were OK, let's see if we can remove the pinning

* GH: Add M1 conda build workflows (#6135)

Clean up Conda build folder before every run
Enable artifact upload to GitHub for every workflow run, but upload to Conda/S3 only on nightly pushes

Test plan: `conda install -c pytorch-nightly torchvision; python -c "import torchvision;print(torchvision.io.read_image('hummingbird.jpg').shape)"`

* Fix `Test M1` workflow

By passing `--pre` option to `pip install`, otherwise torchvision were always tested against last PyTorch release

* Adding tagged builds for M1 (#6140)

* Adding tagged builds

* Testing

* Testing

* Testing

* Testing

* Adding conda builds

* Fix `if` condition for s3/conda uploads (#6146)

Replace `steps.extract_branch.outputs.branch` (which were probably taken from
https://stackoverflow.com/questions/58033366/how-to-get-the-current-branch-within-github-actions ) with straightforward `github.event.ref`

Tested in
https://github.com/malfet/deleteme/runs/6822647720?check_suite_focus=true and https://github.com/malfet/deleteme/runs/6822691158?check_suite_focus=true

* Fix typo in GHA nightly build condition (#6158)

s#ref/heads/#refs/heads/#

I should have noticed it while copy-n-pasting the condition. Unfortunately there are no way to test is other than in prod, but nightly builds are still not getting pushed, see https://github.com/pytorch/vision/runs/6860407007?check_suite_focus=true for example

* Making sure we are building against release

* Testing

Testing

Testing

Testing

testing

Testing

Testing

Testing

Testing

Testing

Testing

Testing

* Testing

* Testing

* Cleanup

* Refactoring logic

Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: Nicolas Hug <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants