import fails on Linux due to missing library libcrypto.so.10 #418

Closed
pmeier opened this issue May 18, 2022 · 15 comments

@pmeier
Contributor

pmeier commented May 18, 2022

torchvision is seeing CI failures on Linux because a shared library required by an extension that torchdata loads is missing. I can reproduce with docker:

docker run python:3.7 bash -c \
    'pip install --pre torch torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu && python -c "import torchdata"'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/torchdata/__init__.py", line 7, in <module>
    from torchdata import _extension  # noqa: F401
  File "/usr/local/lib/python3.7/site-packages/torchdata/_extension.py", line 34, in <module>
    _init_extension()
  File "/usr/local/lib/python3.7/site-packages/torchdata/_extension.py", line 31, in _init_extension
    from torchdata import _torchdata as _torchdata
ImportError: libcrypto.so.10: cannot open shared object file: No such file or directory

#399 might be the offender. #415 did not make it into yesterday's nightly.

@NicolasHug
Member

I think @ejguan is on it: pytorch/text#1729 (comment)

@pmeier
Contributor Author

pmeier commented May 18, 2022

I think I found the issue: the Linux wheel build is running in a specific container

container: ${{ startsWith( matrix.os, 'ubuntu' ) && 'quay.io/pypa/manylinux2014_x86_64' || null }}

which has a different version of libcrypto than Ubuntu:

$ docker run quay.io/pypa/manylinux2014_x86_64 bash -c 'ls -l /usr/lib64 | grep libcrypto'
lrwxrwxrwx  1 root root      19 Apr 18 12:29 libcrypto.so.10 -> libcrypto.so.1.0.2k
-rwxr-xr-x  1 root root 2520920 Mar 28 15:42 libcrypto.so.1.0.2k
$ docker run ubuntu:22.04 bash -c 'ls -l /usr/lib/x86_64-linux-gnu | grep libcrypto'
-rw-r--r--  1 root root 4447536 Mar 16 08:35 libcrypto.so.3

I was unable to install libcrypto.so.10 on Ubuntu through vanilla apt.

Thus, if we depend on this specific version, we probably need to ship it with the wheel.
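
Whether a given machine can satisfy this dependency can be checked without installing torchdata. The following is a minimal sketch, assuming only the Python standard library; the helper name `can_load` is hypothetical and not part of torchdata. It asks the dynamic loader to open a library by its soname, which is roughly what happens when the compiled extension is imported.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can find and open `soname`.

    `can_load` is a hypothetical helper for illustration only.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be located or opened.
        return False
    return True


if __name__ == "__main__":
    # libcrypto.so.10 is the soname from the traceback above;
    # libcrypto.so.3 is the one found in the ubuntu:22.04 image.
    for name in ("libcrypto.so.10", "libcrypto.so.3"):
        status = "found" if can_load(name) else "missing"
        print(f"{name}: {status}")
```

The result depends entirely on the system it runs on, so no expected output is given here.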

@ejguan
Contributor

ejguan commented May 18, 2022

@pmeier Thanks for reporting this. I will take a deeper look.

@ejguan
Contributor

ejguan commented May 18, 2022

I think openssl-static in

yum -y install openssl-devel openssl-static curl-devel zlib-devel
is the culprit. Let me give it a try.

@pmeier
Contributor Author

pmeier commented May 18, 2022

> I think openssl-static in [...] is culprit

Nope. As you can see above, libcrypto.so.10 is already present in the vanilla image without installing anything else.

$ docker run -it quay.io/pypa/manylinux2014_x86_64 bash
$ ls -l /usr/lib64 | grep libcrypto
lrwxrwxrwx  1 root root      19 Apr 18 12:29 libcrypto.so.10 -> libcrypto.so.1.0.2k
-rwxr-xr-x  1 root root 2520920 Mar 28 15:42 libcrypto.so.1.0.2k
$ yum -y install openssl-devel openssl-static curl-devel zlib-devel
[...]
$ ls -l /usr/lib64 | grep libcrypto
-rw-r--r--  1 root root 4697014 Mar 28 15:42 libcrypto.a
lrwxrwxrwx  1 root root      19 May 18 13:57 libcrypto.so -> libcrypto.so.1.0.2k
lrwxrwxrwx  1 root root      19 Apr 18 12:29 libcrypto.so.10 -> libcrypto.so.1.0.2k
-rwxr-xr-x  1 root root 2520920 Mar 28 15:42 libcrypto.so.1.0.2k

@datumbox
Contributor

@ejguan Thanks for looking into this.

This issue is currently breaking TorchVision's CI for new PRs, which can confuse contributors. As Philip clarified, the issue only appears on Linux. Do you have any idea when the fix will land? Do you recommend that we turn off the test until this is done? Thanks!

@pmeier
Contributor Author

pmeier commented May 19, 2022

The problem is that the AWS package pulls in libcrypto, but we don't statically link against it. Thus, when the package is imported, the dynamic loader looks for libcrypto.so.10 on the system and errors out if it doesn't find it.

If we want to depend on the compiled AWS package, we need to ship libcrypto as well. Note that this of course has some license implications. Furthermore, this is only needed for wheels, since on conda we can simply depend on another package that provides libcrypto.
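
To see which shared libraries would have to be shipped with a wheel, one option is to run `ldd` on the compiled extension and look for unresolved entries. The sketch below only parses such output; the function name `missing_libraries` and the `SAMPLE` text are hypothetical illustrations, not output captured from the torchdata build.

```python
import re
from typing import List

# Matches ldd lines of the form "<soname> => not found".
_NOT_FOUND = re.compile(r"^\s*(\S+)\s+=>\s+not found\s*$")


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the sonames that ldd reported as unresolved."""
    missing = []
    for line in ldd_output.splitlines():
        match = _NOT_FOUND.match(line)
        if match:
            missing.append(match.group(1))
    return missing


# Hypothetical ldd output, written by hand for illustration only.
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007ffd2a5f2000)
\tlibcrypto.so.10 => not found
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c3a400000)
"""

if __name__ == "__main__":
    print(missing_libraries(SAMPLE))  # prints ['libcrypto.so.10']
```

In practice the input would come from running `ldd` on the extension's `.so` file inside the target image.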

Looking at the CMakeLists.txt of the AWS package, libcrypto is mentioned two times:

@ejguan
Contributor

ejguan commented May 19, 2022

I am working on a PR to compile AWS with a static libcrypto and ship it with TorchData.

@ejguan
Contributor

ejguan commented May 19, 2022

Let me update the TorchData binary for Linux without AWS first.

@datumbox
Contributor

@ejguan Thanks for the update. Can you provide a very rough estimate of when you believe this will become available? We are trying to figure out whether we should disable the specific CI job or leave it be. A rough estimate will make that decision easier.

@ejguan
Contributor

ejguan commented May 19, 2022

I don't think you need to disable CI. I can re-upload a new TorchData binary without AWSSDK. This should resolve the problem for TorchVision CI.
I will open a separate PR to support AWSSDK.

@ejguan
Contributor

ejguan commented May 19, 2022

All nightlies should be updated: https://github.com/pytorch/data/actions/runs/2353439246

@datumbox
Contributor

@ejguan Thanks a lot for the help. As far as I can tell, this fixed the problem. If no additional action is required on your side, I think we can close the issue.

@ejguan
Contributor

ejguan commented May 19, 2022

> @ejguan Thanks a lot for the help. As far as I can tell, this fixed the problem. If no additional action is required on your side, I think we can close the issue.

@datumbox Thank you for the confirmation. Let's keep this issue open until I fix the AWS binary.

@ejguan
Contributor

ejguan commented Jun 10, 2022

Closing this issue as #421 has landed.

ejguan closed this as completed Jun 10, 2022