[BE] Add sccache to manywheel binary build #1169

huydhn · 2022-10-21T21:41:20Z

The manywheel binary build currently doesn't use a compiler cache, so it takes more than 2 hours just to build PyTorch: https://github.com/pytorch/pytorch/actions/runs/3285968556/jobs/5413580528.

I'll make a similar change to the libtorch binary build later in a separate PR (lower priority, because building libtorch is not that slow).

Testing

Build the new image locally with GPU_ARCH_TYPE=cuda GPU_ARCH_VERSION=11.6 manywheel/build_docker.sh
Build the PyTorch binary locally with the locally built image, i.e. pytorch/manylinux-builder:cuda11.6-e7608179efd287af102e40941fc24abff8d8a5bd. Here are the exact commands I ran inside the container:

export PYTORCH_ROOT=/tmp/pytorch
export BUILDER_ROOT=/tmp/builder
export PACKAGE_TYPE=manywheel
export DESIRED_CUDA=cu116
export GPU_ARCH_VERSION=11.6
export GPU_ARCH_TYPE=cuda
export DESIRED_PYTHON=3.7
export SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2
export SCCACHE_S3_KEY_PREFIX=debug
/tmp/builder/manywheel/build.sh
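
The SCCACHE_* variables above only help if sccache is actually installed in the image. A minimal sketch of a pre-build check, assuming a hypothetical helper named `sccache_ready` (it is not part of this PR or of the builder scripts) and assuming that a non-empty SCCACHE_BUCKET is enough to count as "configured":

```shell
# Hypothetical guard, not the PR's actual code: report whether sccache looks usable.
sccache_ready() {
  # The sccache binary must be on PATH...
  command -v sccache >/dev/null 2>&1 || return 1
  # ...and an S3 bucket must be configured (SCCACHE_BUCKET, as exported above).
  [ -n "${SCCACHE_BUCKET:-}" ]
}

if sccache_ready; then
  echo "sccache enabled (bucket: ${SCCACHE_BUCKET})"
else
  echo "sccache not configured; building without a compiler cache"
fi
```

Running this before build.sh makes it obvious in the log whether a slow build was expected or the cache was silently skipped.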

huydhn · 2022-10-21T23:37:03Z

install_cached.sh is copied as is from https://github.com/pytorch/pytorch/blob/master/.circleci/docker/common/install_cache.sh

huydhn · 2022-10-22T01:54:13Z

The ROCm failure https://github.com/pytorch/builder/actions/runs/3301097112/jobs/5446221931 is due to the recent change in #1160. cc @jataylo Could you help fix that? It looks like 5.1.1 is missing from the list (-eq vs. -ge).
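
To illustrate the -eq vs. -ge difference, here is a minimal sketch. It assumes versions are compared as integers built from MAJOR.MINOR.PATCH; the helper `rocm_version_int` is hypothetical, and the actual check in #1160 is not shown in this thread.

```shell
# Hypothetical helper: turn "MAJOR.MINOR[.PATCH]" into a comparable integer,
# e.g. 5.1.1 becomes 50101. Runs in a subshell so its variables do not leak.
rocm_version_int() (
  IFS=. read -r major minor patch <<EOF
$1
EOF
  echo $(( ${major:-0} * 10000 + ${minor:-0} * 100 + ${patch:-0} ))
)

v=$(rocm_version_int 5.1.1)

# An exact match against 5.1 misses the 5.1.1 patch release.
if [ "$v" -eq "$(rocm_version_int 5.1)" ]; then
  echo "eq matches 5.1.1"
else
  echo "eq misses 5.1.1"
fi

# A lower-bound check covers 5.1 and every later version.
if [ "$v" -ge "$(rocm_version_int 5.1)" ]; then
  echo "ge matches 5.1.1"
fi
```

With an equality test, every supported patch version has to be listed explicitly, while a lower bound covers them all.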

jataylo · 2022-10-24T09:13:54Z

@huydhn Will take a look at this, thank you.

CC @jithunnair-amd

jithunnair-amd · 2022-10-24T19:24:50Z

@huydhn Attempting a fix in PR #1170

huydhn · 2022-10-24T21:48:22Z

Pushed to https://github.com/pytorch/builder/tree/add-sccache-support so that the Docker image can be published for testing.

huydhn · 2022-10-24T23:23:28Z

The new Docker images for cuda11.6 are published for testing.

malfet · 2022-10-25T18:30:57Z

Not using sccache for binary builds is a design decision:

sccache has been known in the past to return stale or invalid objects (it failed to detect that a file needed to be recompiled).
It is also a security risk, as one can submit a PR that populates the cache with malicious data that will then be incorporated into nightly/release builds.

atalman · 2022-10-25T18:37:45Z

I would also add that for binary builds we want predictable dependency versions. From what I saw in the failure logs, it failed to return some dependency: the dependency was either missing or the wrong version. This may be the same use case as @malfet's point 1.

Add sccache to manywheel binary build

02fe95c

huydhn requested review from atalman and a team October 21, 2022 21:41

huydhn self-assigned this Oct 21, 2022

facebook-github-bot added the cla signed label Oct 21, 2022

Handle rocm and cpu and cuda

e760817

huydhn mentioned this pull request Oct 21, 2022

[BE] Use sccache when building manywheel binary pytorch/pytorch#87523

Closed

huydhn marked this pull request as ready for review October 21, 2022 23:16

Fix copy paste duplication

cf7fdd0

jataylo mentioned this pull request Oct 24, 2022

Use ROCm5.3 branch for MIOpen ROCm/builder#12

Merged

jithunnair-amd mentioned this pull request Oct 24, 2022

[ROCm] support for rocm5.3 wheel builds #1160

Merged

huydhn added the ciflow/binaries label Oct 24, 2022

Debug with the new docker cuda images

09fcfc0

huydhn removed the ciflow/binaries label Oct 24, 2022

Uncomment debug

7fe191b

huydhn added 2 commits October 24, 2022 15:34

Testing add-sccache-support

37bd8cc

Remove reference to the debug branch

5934f8b

huydhn added 7 commits October 24, 2022 20:21

Install sccache from source

0de0b28

PyTorch fork of sccache does not work with centos

c58b2ff

Not using sccache if it is not setup properly

ef33b03

Unset sccache if it is not used

10ea871

Quick fix to not use sccache if it's not setup properly

46923db

Merge branch 'main' of github.com:huydhn/builder

765e3f8

Merge branch 'main' into add-sccache-support

c461884

huydhn added 2 commits October 25, 2022 00:44

Add a comment about pytorch/sccache issue on CentOS

2babaea

Pin sccache version to 0.3.0

bed12b4

huydhn mentioned this pull request Oct 25, 2022

Support testing Docker images in PR #1174

Open

huydhn added 2 commits October 25, 2022 11:19

Merge branch 'main' into add-sccache-support

099f522

Merge from main

cb2f598

huydhn marked this pull request as draft October 25, 2022 21:18

huydhn closed this Oct 26, 2022
