Skip to content

Make build environment manylinux2014 compatible #57

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 21, 2022

Conversation

nitins17
Copy link
Member

@nitins17 nitins17 commented Dec 2, 2021

Modifies the build environment to be manylinux2014 compatible

Changes include:

  • Added devtoolset-9 to the Dockerfile
  • Changed to the manylinux2014 toolchain in {cpu,gpu}.bazelrc
  • Added manylinux2014 compatible glibc and libstdc++ libraries in devtoolset-9 (/dt9)
  • Use the new libstdcxx ABI

@nitins17 nitins17 added the build and push to gcr.io for staging Create a staging release on gcr.io label Dec 2, 2021
@github-actions github-actions bot removed build and push to gcr.io for staging Create a staging release on gcr.io labels Dec 2, 2021
@github-actions
Copy link

github-actions bot commented Dec 2, 2021

I pushed these containers:

  • gcr.io/tensorflow-sigs/build:57-python3.9
  • gcr.io/tensorflow-sigs/build:57-python3.8
  • gcr.io/tensorflow-sigs/build:57-python3.7

Re-apply the build and push to gcr.io for staging label to
rebuild and push again. This comment will only be posted once.

@nitins17 nitins17 added build and push to gcr.io for staging Create a staging release on gcr.io and removed build and push to gcr.io for staging Create a staging release on gcr.io labels Dec 3, 2021
@nitins17 nitins17 added the build and push to gcr.io for staging Create a staging release on gcr.io label Jan 7, 2022
@github-actions
Copy link

github-actions bot commented Jan 7, 2022

I pushed these containers:

  • gcr.io/tensorflow-sigs/build:57-python3.10
  • gcr.io/tensorflow-sigs/build:57-python3.9
  • gcr.io/tensorflow-sigs/build:57-python3.8
  • gcr.io/tensorflow-sigs/build:57-python3.7

Re-apply the build and push to gcr.io for staging label to rebuild and push again. This comment will only be posted once.

@nitins17 nitins17 removed the build and push to gcr.io for staging Create a staging release on gcr.io label Jan 7, 2022
@nitins17 nitins17 added build and push to gcr.io for staging Create a staging release on gcr.io and removed build and push to gcr.io for staging Create a staging release on gcr.io labels Jan 29, 2022
@nitins17 nitins17 added build and push to gcr.io for staging Create a staging release on gcr.io and removed build and push to gcr.io for staging Create a staging release on gcr.io labels Feb 14, 2022
@guizili0
Copy link

Hi did this PR has relationships with -cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1" and do you plan enable it in TensorFlow release binary? Thanks!

@bhack
Copy link
Contributor

bhack commented Feb 15, 2022

@nitins17 nitins17 force-pushed the nitin/manylinux2014 branch from 8044782 to df6cd72 Compare February 16, 2022 00:18
@nitins17
Copy link
Member Author

Hi did this PR has relationships with -cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1" and do you plan enable it in TensorFlow release binary? Thanks!

Hi,

This PR is for updating the TensorFlow's Linux wheel to be manylinux2014 compliant. As far as I can tell, D_GLIBCXX_USE_CXX11_ABI would still be equal to 0 because we set --with-default-libstdcxx-abi="gcc4-compatible"; see with-default-libstdcxx-abi in https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html

@bhack
Copy link
Contributor

bhack commented Feb 16, 2022

@nitins17 Do you know what was the reason for including --with-default-libstdcxx-abi="gcc4-compatible" flag?

@nitins17
Copy link
Member Author

nitins17 commented Mar 1, 2022

@nitins17 Do you know what was the reason for including --with-default-libstdcxx-abi="gcc4-compatible" flag?

I'm not sure. @mihaimaruseac wdyt?

@nitins17 nitins17 force-pushed the nitin/manylinux2014 branch 2 times, most recently from 219a8dc to 6773a82 Compare March 2, 2022 11:56
@bhack
Copy link
Contributor

bhack commented Mar 7, 2022

@nitins17 Do you know what was the reason for including --with-default-libstdcxx-abi="gcc4-compatible" flag?

I'm not sure. @mihaimaruseac wdyt?

What was the conclusion about this point in the last SIG build meeting?

@nitins17 nitins17 force-pushed the nitin/manylinux2014 branch from 87dca1d to ff4320f Compare March 15, 2022 21:43
# Note that the installation path for libstdc++ here is /${TARGET}/usr/lib64/
mv "/${TARGET}/usr/lib64/libstdc++.so.${LIBSTDCXX_VERSION}" \
"/${TARGET}/usr/lib64/libstdc++.so.${LIBSTDCXX_VERSION}.backup"
echo -e "OUTPUT_FORMAT(elf64-x86-64)\nINPUT ( libstdc++.so.6.0.18 -lstdc++_nonshared44 )" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@perfinion Here's the part we didn't really understand where nonshared44 works but nonshared48 doesn't

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would obviously prefer we use 48, since that looks like it should be the correct version. but the .so is used first and only uses the nonshared for symbols that dont exist in the .so.6.0.18. so if somehow there are symbols in both of them they just wont get used so i dont think there will be any issues.

the critical part in this toolchain is libstdc++.so.6.0.18, that is fine so im good with rolling this out. lets try and figure it out later and maybe update again tho

Set up symlink for devtoolset-8

Combine Docker GCR presubmits and also push main to gcr

Commit missed files

Log in to GCR

Fix conditional, hopefully

Clarify

Add Python 3.10 support (#58)

Adds Python 3.10 support to the containers. Python 3.10 changes some library behavior and, for now, needs an alternative installation method to work.

Upgrade gcrpio for fast build and cleanup setup

Add utilities for running release tests (#56)

This adds the dependencies and notably bazelrc config options to run TensorFlow's Nightly and Release tests, which I've been working on replicating on internal CI. I still have documentation and migration work to do, but the major portion of the support work is here.

add gdb to the system packages

change to gcc 8.3.1 from centos7 for devtoolset8

fix libstdc++ symlink in devtoolset-8 environment

Undo ignoring other xml files

Update README

Deduplicate repeated messages

Squash long runfiles paths

Lock nvidia driver to 460

libtensorflow work

Fix libtensorflow script and start prelim check

Update Test Requirements to have same versions as  tf_sig_build_dockerfiles/devel.requirements.txt (#65)

* Add additional gitignore files

* Update requirements with same versions

Keep versions consistent with  tf_sig_build_dockerfiles/devel.requirements.txt

Cleanup

Fix Build issue from `python_include` (#67)

* Remove Python 3.10 pip special handling

* Link usr/include to usr/local/include

* Update location of python include

* Update setup.python.sh

Assorted changes -- see details

- Remove installation of nvidia-profiler, which depends on libcuda1,
which ultimately installs an nvidia driver package, which we don't want
because we're running in docker, in which the drivers are mounted. I
hope nvidia-profiler isn't necessary for anything important; otherwise
we'll need to synchronize driver versions between the containers and VM
images.
- Add less, colordiff and a newer version of clang-format
- Add code_check_changed_files, which is intended to replace the
"incremental" parts of ci_sanity. Still a work in progress because we
need to decide on valuable configurations (clang-format and pylint
cannot be run the same way as we have them configured internally and
currently have a lot of findings)
- Add code_check_full, which is intended to replace the "across entire
code base" parts of ci_sanity. I rewrote many of the clunkier tests.
Still a work in progress because we must verify that the changed tests
will still fail.
- Fix bad "bazel test " expansion for libtensorflow
- Fix bad chmod for libtensoflow repacker

Change libtensorflow config values to fix target selection

Fix a typo in venv installation

(Thanks to reedwm)

Remove extra lines

(Thanks again to reedwm)

Clarify ctrl-s warning

Correctly remove extra test filters

Make it possible to run isolated pip tests

More work on code checks

Fix a typo

Clean up code check full

Remove clang-format

Cleanup changed_files and move one to full

Add a missing test

Clean up and fix code_check_full

Update docs and create experimental RBE configs

Update docs and create experimental RBE configs

Update dependencies to 2.9.0.dev

Update Go API installation guide for TensorFlow 2.8.0 (#74)

Clarify usage of nightly commit

Fix mistaken 'test' command

Update docs and create experimental RBE configs

Update docs and create experimental RBE configs

Update dependencies to 2.9.0.dev

Update Go API installation guide for TensorFlow 2.8.0 (#74)

Clarify usage of nightly commit

Fix mistaken 'test' command

change to devtoolset-9 and gcc 9.3.1 for manylinux2014

change cachebuster value for ml2014 remote cache

change to new libstdcxx abi for devtoolset-9

change cachebuster value to use the new libstdcxx abi

link against nonshared44 in devtoolset-9

update the cachebuster value

change CACHEBUSTER value for gpu builds

remove redudant commands during build environment setup

change cachbuster variable name for gpu builds

store manylinux2014 cache in a different location

amend comment for accuracy
@nitins17 nitins17 force-pushed the nitin/manylinux2014 branch from ff4320f to b1ae52e Compare March 15, 2022 23:21
Copy link
Member

@perfinion perfinion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. lower version is fine with manylinux. I'd prefer the correct version obviously but lets work on that later as an unrelated change

# Note that the installation path for libstdc++ here is /${TARGET}/usr/lib64/
mv "/${TARGET}/usr/lib64/libstdc++.so.${LIBSTDCXX_VERSION}" \
"/${TARGET}/usr/lib64/libstdc++.so.${LIBSTDCXX_VERSION}.backup"
echo -e "OUTPUT_FORMAT(elf64-x86-64)\nINPUT ( libstdc++.so.6.0.18 -lstdc++_nonshared44 )" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would obviously prefer we use 48, since that looks like it should be the correct version. but the .so is used first and only uses the nonshared for symbols that dont exist in the .so.6.0.18. so if somehow there are symbols in both of them they just wont get used so i dont think there will be any issues.

the critical part in this toolchain is libstdc++.so.6.0.18, that is fine so im good with rolling this out. lets try and figure it out later and maybe update again tho

@nitins17
Copy link
Member Author

Hi did this PR has relationships with -cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1" and do you plan enable it in TensorFlow release binary? Thanks!

Hi,

This PR is for updating the TensorFlow's Linux wheel to be manylinux2014 compliant. As far as I can tell, D_GLIBCXX_USE_CXX11_ABI would still be equal to 0 because we set --with-default-libstdcxx-abi="gcc4-compatible"; see with-default-libstdcxx-abi in https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html

On this, after the discussion in the March 2022 edition of the TF Sig Build meeting, the manylinux2014 upgrade will also include the change to new libstdcxx ABI. For more information on this and the manylinux2014 rollout, please take a look at this community thread.

Thanks!

@guizili0
Copy link

Hi did this PR has relationships with -cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1" and do you plan enable it in TensorFlow release binary? Thanks!

Hi,
This PR is for updating the TensorFlow's Linux wheel to be manylinux2014 compliant. As far as I can tell, D_GLIBCXX_USE_CXX11_ABI would still be equal to 0 because we set --with-default-libstdcxx-abi="gcc4-compatible"; see with-default-libstdcxx-abi in https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html

On this, after the discussion in the March 2022 edition of the TF Sig Build meeting, the manylinux2014 upgrade will also include the change to new libstdcxx ABI. For more information on this and the manylinux2014 rollout, please take a look at this community thread.

Thanks!

That's great, thanks for this!!

@angerson
Copy link
Contributor

Note that the images, when not cached, take quite a long time to build both dt7 and dt9 -- eventually we'll remove dt7, tentatively "after the TF 2.9 release."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build and push to gcr.io for staging Create a staging release on gcr.io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants