manywheel: add _GLIBCXX_USE_CXX11_ABI=1 support for linux cpu wheel #990


Merged
merged 1 commit into pytorch:main on Mar 31, 2022

Conversation

zhuhong61
Contributor

This commit adds support for a new Linux CPU pip wheel built with _GLIBCXX_USE_CXX11_ABI=1.

Currently, Linux wheels are built on a CentOS 7 system with devtoolset7, where the compiler ignores CXX11_ABI; devtoolset8 and devtoolset9 have the same issue. We therefore add a Dockerfile (Dockerfile_cxx11-abi) with Ubuntu 18.04 as the base image to support CXX11_ABI=1, following the Dockerfile for libtorch.

To build the new docker image with CXX11_ABI support, run:
GPU_ARCH_TYPE=cpu-cxx11-abi manywheel/build_docker.sh
or
manywheel/build_all_docker.sh
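
A quick way to sanity-check the resulting image is to ask its compiler which ABI libstdc++ defaults to. This is only a sketch: it assumes a stock Ubuntu 18.04 gcc toolchain is installed inside the image, which this PR does not itself guarantee.

# Print the compiler version, then the ABI macro that the libstdc++ headers
# define; on a CXX11-ABI image this should show "_GLIBCXX_USE_CXX11_ABI 1".
docker run --rm pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi bash -c \
  "gcc --version && echo '#include <string>' | gcc -x c++ -E -dM - | grep _GLIBCXX_USE_CXX11_ABI"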

To build a Linux CPU pip wheel with CXX11_ABI inside this image, run:
# the settings below are specific to this image
export DESIRED_CUDA=cpu-cxx11-abi   # changed from cpu, for the wheel name
export GPU_ARCH_TYPE=cpu-cxx11-abi  # changed from cpu, for build.sh
export DOCKER_IMAGE=pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi
export DESIRED_DEVTOOLSET=cxx11-abi

# the settings below are as usual
export BINARY_ENV_FILE=/tmp/env
export BUILDER_ROOT=/builder
export DESIRED_PYTHON=3.7   # or 3.8, 3.9, etc.
export IS_GHA=1
export PACKAGE_TYPE=manywheel
export PYTORCH_FINAL_PACKAGE_DIR=/artifacts
export PYTORCH_ROOT=/pytorch
export GITHUB_WORKSPACE=/your_path_to_workspace

# '-e DESIRED_DEVTOOLSET' below is newly added for this container;
# the others are as usual
set -x
mkdir -p artifacts/
container_name=$(docker run \
  -e BINARY_ENV_FILE \
  -e BUILDER_ROOT \
  -e DESIRED_CUDA \
  -e DESIRED_PYTHON \
  -e GPU_ARCH_TYPE \
  -e IS_GHA \
  -e PACKAGE_TYPE \
  -e PYTORCH_FINAL_PACKAGE_DIR \
  -e PYTORCH_ROOT \
  -e DOCKER_IMAGE \
  -e DESIRED_DEVTOOLSET \
  --tty \
  --detach \
  -v "${GITHUB_WORKSPACE}/pytorch:/pytorch" \
  -v "${GITHUB_WORKSPACE}/builder:/builder" \
  -v "${RUNNER_TEMP}/artifacts:/artifacts" \
  -w / \
  "${DOCKER_IMAGE}"
)

# build the pip wheel as usual; the built wheel file name looks like:
# torch-1.12.0.dev20220312+cpu.cxx11.abi-cp37-cp37m-linux_x86_64.whl
docker exec -t -w "${PYTORCH_ROOT}" "${container_name}" bash -c "bash .circleci/scripts/binary_populate_env.sh"
docker exec -t "${container_name}" bash -c "source ${BINARY_ENV_FILE} && bash /builder/manywheel/build.sh"

# to verify the built wheel file; we should see 'True'
$ pip install torch-1.12.0.dev20220312+cpu.cxx11.abi-cp37-cp37m-linux_x86_64.whl
$ python -c 'import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI)'
True
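
As an additional spot check, the mangled symbol names of the installed library can be inspected directly; this is a sketch, and the library path is derived from the torch install location rather than stated anywhere in this PR. With _GLIBCXX_USE_CXX11_ABI=1, std::string-related symbols demangle to std::__cxx11::basic_string, so the count below should be well above zero (a pre-CXX11-ABI wheel would yield 0):

# count dynamic symbols that carry the new-ABI std::__cxx11 namespace
nm -D "$(python -c 'import os, torch; print(os.path.join(os.path.dirname(torch.__file__), "lib", "libtorch_cpu.so"))')" \
  | c++filt | grep -c '__cxx11'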

Co-authored-by: Guo Yejun <[email protected]>
Co-authored-by: Zhu Hong <[email protected]>

@facebook-github-bot
Contributor

Hi @zhuhong61!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@guoyejun
Contributor

Any comment? Thanks.

@guoyejun
Contributor

@malfet any comment? Thanks.

@guoyejun
Contributor

guoyejun commented Mar 31, 2022

There is an error in "Build manywheel docker images / build-docker-cuda (11.3) (pull_request)", failing after 1m.

I checked the log, and it looks like this issue is not caused by this PR.

+ docker build -t docker.io/pytorch/manylinux-builder:cuda11.3 --build-arg BASE_CUDA_VERSION=11.3 --build-arg GPU_IMAGE=nvidia/cuda:10.2-devel-centos7 --target cuda_final -f /home/runner/work/builder/builder/manywheel/Dockerfile /home/runner/work/builder/builder
...
#11 [common  2/19] RUN yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm
#11 sha256:6c19de3706b80355181e9c4a16009985be81e232ce9826bbe69ac21eaba9150b
#11 0.447 Loaded plugins: fastestmirror, ovl
#11 0.604 Determining fastest mirrors
#11 1.180  * base: mirror.vacares.com
#11 1.180  * extras: centos.mirror.lstn.net
#11 1.181  * updates: atl.mirrors.clouvider.net
#11 1.635 https://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 404 - Not Found
#11 1.636 Trying other mirror.
#11 1.636 To address this issue please refer to the below wiki article 
#11 1.636 
#11 1.636 https://wiki.centos.org/yum-errors
#11 1.636 
#11 1.636 If above article doesn't help to resolve this issue please use https://bugs.centos.org/.
#11 1.636 
#11 1.638 
#11 1.638 
#11 1.638  One of the configured repositories failed (nvidia-ml),
#11 1.638  and yum doesn't have enough cached data to continue. At this point the only
#11 1.638  safe thing yum can do is fail. There are a few ways to work "fix" this:
#11 1.638 
#11 1.638      1. Contact the upstream for the repository and get them to fix the problem.
#11 1.638 
#11 1.638      2. Reconfigure the baseurl/etc. for the repository, to point to a working
#11 1.638         upstream. This is most often useful if you are using a newer
#11 1.638         distribution release than is supported by the repository (and the
#11 1.638         packages for the previous distribution release still work).
#11 1.638 
#11 1.638      3. Run the command with the repository temporarily disabled
#11 1.638             yum --disablerepo=nvidia-ml ...
#11 1.638 
#11 1.638      4. Disable the repository permanently, so yum won't use it by default. Yum
#11 1.638         will then just ignore the repository until you permanently enable it
#11 1.638         again or use --enablerepo for temporary usage:
#11 1.638 
#11 1.638             yum-config-manager --disable nvidia-ml
#11 1.638         or
#11 1.638             subscription-manager repos --disable=nvidia-ml
#11 1.638 
#11 1.638      5. Configure the failing repository to be skipped, if it is unavailable.
#11 1.638         Note that yum will try to contact the repo. when it runs most commands,
#11 1.638         so will have to try and fail each time (and thus. yum will be be much
#11 1.638         slower). If it is a very temporary problem though, this is often a nice
#11 1.638         compromise:
#11 1.638 
#11 1.638             yum-config-manager --save --setopt=nvidia-ml.skip_if_unavailable=true
#11 1.638 
#11 1.638 failure: repodata/repomd.xml from nvidia-ml: [Errno 256] No more mirrors to try.
#11 1.638 https://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/repodata/repomd.xml: [Errno 14] HTTPS Error 404 - Not Found
#11 ERROR: executor failed running [/bin/sh -c yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm]: exit code: 1

#18 [base 2/8] RUN yum install -y wget curl perl util-linux xz bzip2 git patch which perl zlib-devel
#18 sha256:85b55d7268ad8c9b497852109d67f42ae51b7a367e627ee35635e6da9d5ff657
#18 CANCELED
------
 > [common  2/19] RUN yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm:
------
executor failed running [/bin/sh -c yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm]: exit code: 1
Error: Process completed with exit code 1.

@malfet
Contributor

malfet commented Mar 31, 2022

@guoyejun yeah, NVIDIA is having an outage today, see pytorch/pytorch#74968

@guoyejun
Contributor

Thanks @malfet.

And it looks like we can add a new check (build-docker-cpu-cxx11-abi) to the CI system once this PR is accepted. :)

@malfet malfet merged commit 53b8397 into pytorch:main Mar 31, 2022
@guoyejun
Contributor

guoyejun commented Apr 8, 2022

@malfet would it be possible to add new checks to the CI system for github/pytorch/pytorch to verify the built Linux pip wheel with cxx11-abi? Thanks.

And we also need to build the cxx11-abi wheel file nightly and at release time. Taking nightly as an example: provide the pip cxx11 wheel file at https://download.pytorch.org/whl/nightly/cpu, and also mention the install commands at https://pytorch.org/get-started/locally/ (Preview (Nightly) → Linux → Pip → Python → CPU).
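
For context, installing a CPU nightly from that index currently looks like the command below; whether a cxx11-abi variant would be selectable from this same index or need its own path is exactly what this comment proposes, so treat the index URL here as the existing cpu nightly channel, not a confirmed home for the cxx11-abi wheel:

# install the current CPU nightly wheel (pre-release channel)
pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu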

@guoyejun
Contributor

It looks like the glibc version is a bit high with Ubuntu 18.04 as the base image in this PR. We'll try to see whether CentOS 8 (with a lower glibc version) works as the base image; we'd like to know your opinion, @malfet, and we may create the change if you do not object. Thanks.

@malfet
Contributor

malfet commented Apr 20, 2022

@malfet would it be possible to add new checks to the CI system for github/pytorch/pytorch to verify the built Linux pip wheel with cxx11-abi? Thanks.

@guoyejun feel free to propose the PR that does that (the binary build matrix is defined in https://github.com/pytorch/pytorch/blob/master/.github/scripts/generate_binary_build_matrix.py)

@guoyejun
Contributor

@malfet would it be possible to add new checks to the CI system for github/pytorch/pytorch to verify the built Linux pip wheel with cxx11-abi? Thanks.

@guoyejun feel free to propose the PR that does that (the binary build matrix is defined in https://github.com/pytorch/pytorch/blob/master/.github/scripts/generate_binary_build_matrix.py)

Got it, thanks. We'll look at it after the base image is done.

@zhuhong61
Contributor Author

zhuhong61 commented Jun 18, 2022

Hi @malfet, we added new checks for build-docker-cpu-cxx11-abi in the CI system, and the workflow checks in our PR https://github.com/pytorch/pytorch/pull/79409 have regenerated .github/workflows/generated-linux-binary-manywheel-nightly.yml. How can we make sure the jobs in generated-linux-binary-manywheel-nightly.yml have actually been triggered, and that the cpu-cxx11-abi docker image has been built? Is there any other work needed? Thanks!
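
One way to spot-check the second question from outside CI (a sketch; it assumes the CI job pushes the image to Docker Hub under the tag introduced in this PR):

# if the pull succeeds, the cpu-cxx11-abi builder image has been published
docker pull pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi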

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Feb 14, 2023
…cpu-cxx11-abi (#79409)

We added the Linux pip wheel with cpu-cxx11-abi in pytorch/builder, see pytorch/builder#990 and pytorch/builder#1023.

The purpose of this PR is to add new checks to the pytorch CI system to verify the Linux pip wheel with cpu-cxx11-abi.

Co-authored-by: Zhu Hong <[email protected]>
Co-authored-by: Guo Yejun <[email protected]>

Pull Request resolved: #79409
Approved by: https://github.com/malfet