OMPI master & 5.0.x branches fail to compile when CUDA is enabled. #8764
Comments
Thanks - this is already an open issue here: |
I am seeing this issue as well, and it is blocking my testing of #8762 against newer code (master and 5.0.x). |
Right, it's a different signature. May be the same root cause though. #8736 reported the same signature. |
Yes, I suspect that's likely. |
NVIDIA -- please have a look. |
@Akshay-Venkatesh Yes, there have been infrastructure changes recently. It would probably be best to try to compile again and see if you run into the same failures that others are describing. |
It passed CI, and I built and ran fine with CUDA enabled with device transfers. I'm wondering if this is a clean build: I moved a lot of code from the cuda datatype file to common_cuda, which could be the source of the duplicate symbols. I haven't had a chance to take a look at the missing symbols issue yet, though. |
@wckzhang rather than rely on CI (which does not include any images with CUDA pre-installed), please test locally. Please get @mwheinz's environment, set up a duplicate environment, and test these changes in that environment. Same with #8656. George is also reporting that |
I used a completely clean git clone of origin/master,
Looks like the machine has 2 versions of CUDA installed from NVIDIA's repo:
|
Hi, |
No worries - we've all been there. Well, I mean, I've been there. Thanks for connecting the issues. |
@mwheinz Did this get resolved? |
I am not able to verify that right now - our entire lab is currently traveling down the highway in the back of a truck. |
Sorry - yes. I just checked. I can build 5.0.x and master without problems. |
Thanks @mwheinz! |
Working from the master branch and/or the 5.0.x branch, I get duplicate symbol problems when compiling OMPI with CUDA enabled.
Configuration options: