OMPI master & 5.0.x branches fail to compile when CUDA is enabled. #8764

Closed
mwheinz opened this issue Apr 5, 2021 · 19 comments
@mwheinz

mwheinz commented Apr 5, 2021

Working from the master branch and/or the 5.0.x branch, I get duplicate symbol problems when compiling OMPI with CUDA enabled.

Configuration options:

./configure --prefix=/usr/mpi/gcc/openmpi-expr --with-hwloc=internal --with-libevent=internal --with-pmix=internal --with-psm2 --with-cuda
make -j 32 2>&1 | tee -a make.log
...
...
...
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_free':
common_cuda.c:(.text+0x1b0): multiple definition of `mca_common_cuda_free'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1b0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_memcpy':
common_cuda.c:(.text+0x1f0): multiple definition of `opal_cuda_memcpy'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1f0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x0): multiple definition of `opal_cuda_verbose'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_malloc':
common_cuda.c:(.text+0x4e0): multiple definition of `mca_common_cuda_malloc'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4e0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x40): multiple definition of `mca_common_cuda_enabled'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x40): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data+0x4): multiple definition of `cuda_event_max'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data+0x4): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x38): multiple definition of `cuda_event_ipc_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x38): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x20): multiple definition of `cuda_event_ipc_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x20): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x10): multiple definition of `cuda_event_htod_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x10): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x30): multiple definition of `cuda_event_dtoh_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x30): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x18): multiple definition of `cuda_event_dtoh_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x18): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x28): multiple definition of `cuda_event_htod_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x28): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_register_mca_variables':
common_cuda.c:(.text+0xf30): multiple definition of `mca_common_cuda_register_mca_variables'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0xf30): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_fini':
common_cuda.c:(.text+0x1140): multiple definition of `mca_common_cuda_fini'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1140): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x8): multiple definition of `libcuda_handle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x8): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_register':
common_cuda.c:(.text+0x15e0): multiple definition of `mca_common_cuda_register'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x15e0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data.rel+0x0): multiple definition of `common_cuda_mem_regs_t_class'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data.rel+0x0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_unregister':
common_cuda.c:(.text+0x1800): multiple definition of `mca_common_cuda_unregister'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1800): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `cuda_getmemhandle':
common_cuda.c:(.text+0x19a0): multiple definition of `cuda_getmemhandle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x19a0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `cuda_ungetmemhandle':
common_cuda.c:(.text+0x1b70): multiple definition of `cuda_ungetmemhandle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1b70): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `cuda_openmemhandle':
common_cuda.c:(.text+0x1bb0): multiple definition of `cuda_openmemhandle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1bb0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `cuda_closememhandle':
common_cuda.c:(.text+0x1cf0): multiple definition of `cuda_closememhandle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1cf0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_construct_event_and_handle':
common_cuda.c:(.text+0x1d80): multiple definition of `mca_common_cuda_construct_event_and_handle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1d80): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_destruct_event':
common_cuda.c:(.text+0x1e20): multiple definition of `mca_common_cuda_destruct_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1e20): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_wait_stream_synchronize':
common_cuda.c:(.text+0x1e70): multiple definition of `mca_common_wait_stream_synchronize'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1e70): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_memcpy':
common_cuda.c:(.text+0x1e80): multiple definition of `mca_common_cuda_memcpy'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x1e80): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_record_dtoh_event':
common_cuda.c:(.text+0x2350): multiple definition of `mca_common_cuda_record_dtoh_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2350): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_record_htod_event':
common_cuda.c:(.text+0x2530): multiple definition of `mca_common_cuda_record_htod_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2530): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_get_dtoh_stream':
common_cuda.c:(.text+0x2700): multiple definition of `mca_common_cuda_get_dtoh_stream'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2700): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_get_htod_stream':
common_cuda.c:(.text+0x2710): multiple definition of `mca_common_cuda_get_htod_stream'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2710): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `progress_one_cuda_ipc_event':
common_cuda.c:(.text+0x2720): multiple definition of `progress_one_cuda_ipc_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2720): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `progress_one_cuda_dtoh_event':
common_cuda.c:(.text+0x2920): multiple definition of `progress_one_cuda_dtoh_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2920): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `progress_one_cuda_htod_event':
common_cuda.c:(.text+0x2b20): multiple definition of `progress_one_cuda_htod_event'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2b20): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_memhandle_matches':
common_cuda.c:(.text+0x2d20): multiple definition of `mca_common_cuda_memhandle_matches'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2d20): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_get_device':
common_cuda.c:(.text+0x2dc0): multiple definition of `mca_common_cuda_get_device'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2dc0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_device_can_access_peer':
common_cuda.c:(.text+0x2e10): multiple definition of `mca_common_cuda_device_can_access_peer'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2e10): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_get_address_range':
common_cuda.c:(.text+0x2e40): multiple definition of `mca_common_cuda_get_address_range'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2e40): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_previously_freed_memory':
common_cuda.c:(.text+0x2f00): multiple definition of `mca_common_cuda_previously_freed_memory'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2f00): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_get_buffer_id':
common_cuda.c:(.text+0x2fd0): multiple definition of `mca_common_cuda_get_buffer_id'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x2fd0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_add_initialization_function':
common_cuda.c:(.text+0x30a0): multiple definition of `opal_cuda_add_initialization_function'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x30a0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_stage_one_init':
common_cuda.c:(.text+0x30b0): multiple definition of `mca_common_cuda_stage_one_init'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x30b0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_cuda_convertor_init':
common_cuda.c:(.text+0x43a0): multiple definition of `mca_cuda_convertor_init'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x43a0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_check_bufs':
common_cuda.c:(.text+0x4400): multiple definition of `opal_cuda_check_bufs'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4400): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_check_one_buf':
common_cuda.c:(.text+0x4470): multiple definition of `opal_cuda_check_one_buf'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4470): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_malloc':
common_cuda.c:(.text+0x44c0): multiple definition of `opal_cuda_malloc'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x44c0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_free':
common_cuda.c:(.text+0x4500): multiple definition of `opal_cuda_free'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4500): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_memcpy_sync':
common_cuda.c:(.text+0x4540): multiple definition of `opal_cuda_memcpy_sync'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4540): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_memmove':
common_cuda.c:(.text+0x4580): multiple definition of `opal_cuda_memmove'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x4580): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_set_copy_function_async':
common_cuda.c:(.text+0x45c0): multiple definition of `opal_cuda_set_copy_function_async'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x45c0): first defined here
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:1737: libopen-pal.la] Error 1
make[2]: Leaving directory '/home/mheinz/ompi/opal'
make[1]: *** [Makefile:1865: all-recursive] Error 1
make[1]: Leaving directory '/home/mheinz/ompi/opal'
make: *** [Makefile:1434: all-recursive] Error 1
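A minimal sketch for condensing a log like the one above. It assumes GNU ld's two-line report format ("multiple definition of" followed by "first defined here") and that the first definition sits in a static archive member. The function name `summarize_duplicates` and both regular expressions are illustrative and not part of Open MPI.

```python
import re
from collections import defaultdict

# Matches the symbol name in: ... multiple definition of `symbol'
DUP_RE = re.compile(r"multiple definition of `([^']+)'")
# Matches the archive member in: path/libfoo.a(member.o):... first defined here
# Assumption: the first definition comes from a static archive (.a) member.
FIRST_RE = re.compile(r"^(\S+?\.a\([^)]+\)):.*first defined here")


def summarize_duplicates(lines):
    """Map each archive member to the sorted symbols it was reported to define first."""
    by_member = defaultdict(set)
    pending = None  # symbol from the most recent "multiple definition" line
    for line in lines:
        dup = DUP_RE.search(line)
        if dup:
            pending = dup.group(1)
            continue
        first = FIRST_RE.match(line)
        if first and pending is not None:
            by_member[first.group(1)].add(pending)
            pending = None
    return {member: sorted(symbols) for member, symbols in by_member.items()}
```

Passing the lines of `make.log` to `summarize_duplicates` should give one entry per archive member. If every symbol maps to the same member that also reports the clash, as the log above suggests, that may point to one archive being pulled into the link twice rather than two different files defining the same symbols.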
@awlauria
Contributor

awlauria commented Apr 5, 2021

Thanks - this is already an open issue here:
#8656

@mwheinz
Author

mwheinz commented Apr 5, 2021

> Thanks - this is already an open issue here:
> #8656

That does not appear to be the same issue - kind of the opposite. #8656 indicates that CUDA symbols aren't defined - this issue shows them being defined multiple times.

@acgoldma
Contributor

acgoldma commented Apr 5, 2021

I am also seeing this issue, and it is blocking my testing of #8762 against newer code (master and 5.0.x).

@awlauria
Contributor

awlauria commented Apr 5, 2021

Right, it's a different signature. May be the same root cause though. #8736 reported the same signature.

@mwheinz
Author

mwheinz commented Apr 5, 2021

> Right, it's a different signature. May be the same root cause though. #8736 reported the same signature.

Yes, I suspect that's likely.

@jsquyres
Member

jsquyres commented Apr 5, 2021

NVIDIA -- please have a look.

@Akshay-Venkatesh
Contributor

@jsquyres @awlauria We haven't touched common_cuda component in a long time. Have there been changes in common infrastructure on master recently?

@jsquyres
Member

jsquyres commented Apr 5, 2021

@Akshay-Venkatesh Yes, there have been infrastructure changes recently. It would probably be best to try to compile again and see if you run into the same failures that others are describing.

@Akshay-Venkatesh
Contributor

@wckzhang Noticed the following changes in common/cuda from you. Just double checking, these commits passed cuda build tests right?

deb37ac
3a72df0
906017c

@wckzhang
Contributor

wckzhang commented Apr 5, 2021

It passed CI, and I built and ran fine with CUDA enabled and device transfers. I'm wondering whether this was a clean build: I moved a lot of code from the cuda datatype file to common_cuda, which could be the source of the duplicate symbols. I haven't had a chance to look at the missing symbols issue yet, though.

@bwbarrett
Member

@wckzhang rather than rely on CI (which does not include any images with CUDA pre-installed), please test locally. Please get @mwheinz's environment, set up a duplicate environment, and test these changes there. Same with #8656.

George is also reporting that --without-cuda is broken on platforms with the CUDA libraries in the default search paths. Please confirm that works as well.

@mwheinz
Author

mwheinz commented Apr 6, 2021

I used a completely clean git clone of origin/master:

./configure --prefix=/usr/mpi/gcc/openmpi-expr --with-hwloc=internal \
--with-libevent=internal --with-pmix=internal --with-psm2 --with-cuda \
 && make -j 32 2>&1 | tee -a make.log

Looks like the machine has 2 versions of CUDA installed from NVIDIA's repo:

[LINUX hds2fnce201 20210406_1044 ompi]# rpm -qa | grep cuda
cuda-visual-tools-11-2-11.2.2-1.x86_64
cuda-cufft-dev-10-2-10.2.89-1.x86_64
cuda-misc-headers-10-2-10.2.89-1.x86_64
cuda-nvml-dev-10-2-10.2.89-1.x86_64
cuda-gdb-10-2-10.2.89-1.x86_64
cuda-nsight-systems-10-2-10.2.89-1.x86_64
cuda-samples-10-2-10.2.89-1.x86_64
libnccl-2.8.4-1+cuda11.2.x86_64
libpsm2-11.2.201-1cuda.x86_64
cuda-nvml-devel-11-2-11.2.152-1.x86_64
cuda-samples-11-2-11.2.152-1.x86_64
cuda-gdb-11-2-11.2.152-1.x86_64
cuda-compiler-11-2-11.2.2-1.x86_64
cuda-nsight-compute-11-0-11.0.3-1.x86_64
cuda-cudart-10-2-10.2.89-1.x86_64
cuda-cusolver-dev-10-2-10.2.89-1.x86_64
cuda-nvgraph-dev-10-2-10.2.89-1.x86_64
cuda-runtime-10-2-10.2.89-1.x86_64
cuda-sanitizer-api-10-2-10.2.89-1.x86_64
cuda-10-2-10.2.89-1.x86_64
cuda-cudart-11-2-11.2.152-1.x86_64
cuda-nvtx-11-2-11.2.152-1.x86_64
cuda-cuxxfilt-11-2-11.2.152-1.x86_64
cuda-command-line-tools-11-2-11.2.2-1.x86_64
cuda-toolkit-11-2-11.2.2-1.x86_64
cuda-nsight-compute-11-1-11.1.1-1.x86_64
cuda-license-10-2-10.2.89-1.x86_64
cuda-cudart-dev-10-2-10.2.89-1.x86_64
cuda-curand-10-2-10.2.89-1.x86_64
cuda-cusparse-10-2-10.2.89-1.x86_64
cuda-npp-10-2-10.2.89-1.x86_64
cuda-nvjpeg-10-2-10.2.89-1.x86_64
cuda-nvrtc-10-2-10.2.89-1.x86_64
cuda-demo-suite-10-2-10.2.89-1.x86_64
cuda-nvprune-10-2-10.2.89-1.x86_64
cuda-command-line-tools-10-2-10.2.89-1.x86_64
cuda-nsight-10-2-10.2.89-1.x86_64
cuda-visual-tools-10-2-10.2.89-1.x86_64
libnccl-static-2.8.4-1+cuda11.2.x86_64
libpsm2-compat-11.2.201-1cuda.x86_64
cuda-driver-devel-11-2-11.2.152-1.x86_64
cuda-nvrtc-devel-11-2-11.2.152-1.x86_64
cuda-nvcc-11-2-11.2.152-1.x86_64
cuda-runtime-11-2-11.2.2-1.x86_64
cuda-sanitizer-11-2-11.2.152-1.x86_64
cuda-nsight-11-2-11.2.152-1.x86_64
cuda-demo-suite-11-2-11.2.152-1.x86_64
cuda-cuobjdump-11-2-11.2.152-1.x86_64
cuda-nsight-compute-11-2-11.2.2-1.x86_64
cuda-11-2-11.2.2-1.x86_64
cuda-repo-rhel8-10-2-local-10.2.89-440.33.01-1.0-1.x86_64
nvidia-driver-cuda-libs-460.32.03-1.el8.x86_64
cuda-driver-dev-10-2-10.2.89-1.x86_64
cuda-cufft-10-2-10.2.89-1.x86_64
cuda-curand-dev-10-2-10.2.89-1.x86_64
cuda-cusparse-dev-10-2-10.2.89-1.x86_64
cuda-npp-dev-10-2-10.2.89-1.x86_64
cuda-nvjpeg-dev-10-2-10.2.89-1.x86_64
cuda-nvrtc-dev-10-2-10.2.89-1.x86_64
cuda-libraries-dev-10-2-10.2.89-1.x86_64
cuda-cupti-10-2-10.2.89-1.x86_64
cuda-compiler-10-2-10.2.89-1.x86_64
cuda-nvvp-10-2-10.2.89-1.x86_64
cuda-tools-10-2-10.2.89-1.x86_64
libnccl-devel-2.8.4-1+cuda11.2.x86_64
kmod-ifs-kernel-updates-4.18.0_193.el8.x86_64-2197cuda.x86_64
libpsm2-devel-11.2.201-1cuda.x86_64
ifs-kernel-updates-devel-4.18.0_193.el8.x86_64-2197cuda.x86_64
mpitests_openmpi_gcc_cuda_hfi-4.1-932.x86_64
cuda-nvrtc-11-2-11.2.152-1.x86_64
cuda-nvdisasm-11-2-11.2.152-1.x86_64
cuda-libraries-11-2-11.2.2-1.x86_64
cuda-nvvp-11-2-11.2.152-1.x86_64
cuda-nsight-systems-11-2-11.2.2-1.x86_64
cuda-documentation-11-2-11.2.154-1.x86_64
cuda-11.2.2-1.x86_64
nvidia-driver-cuda-460.32.03-1.el8.x86_64
cuda-nvdisasm-10-2-10.2.89-1.x86_64
cuda-cusolver-10-2-10.2.89-1.x86_64
cuda-nvgraph-10-2-10.2.89-1.x86_64
cuda-libraries-10-2-10.2.89-1.x86_64
cuda-nvtx-10-2-10.2.89-1.x86_64
cuda-toolkit-10-2-10.2.89-1.x86_64
kmod-ifs-kernel-updates-debuginfo-4.18.0_193.el8.x86_64-2197cuda.x86_64
openmpi_gcc_cuda_hfi-4.0.5-10.el8.x86_64
cuda-cudart-devel-11-2-11.2.152-1.x86_64
cuda-nvprune-11-2-11.2.152-1.x86_64
cuda-cupti-11-2-11.2.152-1.x86_64
cuda-tools-11-2-11.2.2-1.x86_64
cuda-drivers-460.32.03-1.x86_64
cuda-cuobjdump-10-2-10.2.89-1.x86_64
cuda-nvcc-10-2-10.2.89-1.x86_64
cuda-nvprof-10-2-10.2.89-1.x86_64
cuda-memcheck-10-2-10.2.89-1.x86_64
cuda-documentation-10-2-10.2.89-1.x86_64
cuda-nsight-compute-10-2-10.2.89-1.x86_64
cuda-nvprof-11-2-11.2.152-1.x86_64
cuda-libraries-devel-11-2-11.2.2-1.x86_64
cuda-memcheck-11-2-11.2.152-1.x86_64
[LINUX hds2fnce201 20210406_1044 ompi]# 
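A rough sketch for checking which CUDA toolkit versions a package list like the one above covers. It assumes NVIDIA's versioned package names embed the version as `-<major>-<minor>-` (for example `cuda-nvcc-11-2-...`). The function name `cuda_versions` and the pattern are illustrative only, and unversioned names such as `cuda-11.2.2-1.x86_64` are deliberately skipped.

```python
import re
from collections import defaultdict

# Assumption: versioned packages look like "cuda-<component>-<major>-<minor>-<pkgver>",
# with the component part optional (e.g. "cuda-10-2-10.2.89-1.x86_64").
VERSIONED_RE = re.compile(r"^cuda-(?:[a-z-]+-)?(\d+)-(\d+)-\d")


def cuda_versions(package_names):
    """Group versioned cuda-* package names by their "major.minor" toolkit version."""
    found = defaultdict(list)
    for raw in package_names:
        name = raw.strip()
        match = VERSIONED_RE.match(name)
        if match:
            found[f"{match.group(1)}.{match.group(2)}"].append(name)
    return dict(found)
```

Feeding it the output of `rpm -qa | grep cuda`, one package per line, gives a quick view of which toolkit versions are present and which packages belong to each.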

@dbonner

dbonner commented Apr 6, 2021

Hi,
I got this issue and created issue #8772, which is a duplicate, so it was closed. Sorry, I should have searched better to find the pre-existing issue.
The full logs of my build output are available to download from that issue (#8772).
Any help with this would be much appreciated.
Dan

@mwheinz
Author

mwheinz commented Apr 6, 2021

> Hi,
> I got this issue and created issue #8772 which is a duplicate so it was closed. Sorry I should have searched better to find the pre-existing issue.

No worries - we've all been there. Well, I mean, I've been there. Thanks for connecting the issues.

@jsquyres
Member

jsquyres commented May 4, 2021

@mwheinz Did this get resolved?

@mwheinz
Author

mwheinz commented May 4, 2021

> @mwheinz Did this get resolved?

I am not able to verify that right now - our entire lab is currently traveling down the highway in the back of a truck.

@awlauria
Contributor

I think this can be closed based on

master: #8788
v5.0.x: #8809

however, it would be good to verify.

@mwheinz
Author

mwheinz commented Jun 29, 2021

Sorry - yes. I just checked. I can build 5.0.x and master without problems.

mwheinz closed this as completed Jun 29, 2021

@awlauria
Contributor

Thanks @mwheinz !
