CUDA build: make all fails with undefined references on master and v5.0.x #8656

## Background information

### What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

branch: master
hash: d18d3f6172ee06087f9452c79a7f8bfb8732c01a

### Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

The job count was set to each machine's thread count: 256 on machine 1, 36 on machine 2, 12 on machine 3. The commands below are the ones used on machine 1.
git clone --recursive https://github.com/open-mpi/ompi.git -j 256
cd ompi 
export AUTOMAKE_JOBS=256
./autogen.pl
./configure --disable-picky --prefix=/usr/local --with-cuda=/usr/local/cuda-11.2 --with-ucx=/usr/local/ucx
make -j 256 all
---> ERROR
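
For anyone reproducing this on a machine with a different thread count, the steps above can be parameterized. The following is only a sketch: `print_build_steps` is a hypothetical helper (not part of Open MPI) that prints the commands from this report for a given job count without running them, and it assumes `nproc` from GNU coreutils is available when no count is given.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the build steps from this report.
# Argument 1 is the job count; if omitted, fall back to nproc.
print_build_steps() {
    jobs="${1:-$(nproc)}"
    cat <<EOF
git clone --recursive https://github.com/open-mpi/ompi.git -j $jobs
cd ompi
export AUTOMAKE_JOBS=$jobs
./autogen.pl
./configure --disable-picky --prefix=/usr/local --with-cuda=/usr/local/cuda-11.2 --with-ucx=/usr/local/ucx
make -j $jobs all
EOF
}
```

Calling `print_build_steps 36` prints the sequence as it would be run on machine 2. The paths passed to `--with-cuda` and `--with-ucx` are the ones from this report and will differ on other systems.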

### If you are building/installing from a git clone, please copy-n-paste the output from `git submodule status`.

 7145774ed059c9c3eca277805b419a16b8a68ca3 3rd-party/openpmix (v1.1.3-2852-g7145774e)
 284d15d7b9be51c07ae3a3964b1567fde1a106e2 3rd-party/prrte (dev-31005-g284d15d7b9)

### Please describe the system on which you are running

* Operating system/version: 
* Computer hardware: 
* Network type: 

I tried this on bare metal on three machines, and all three showed the same error:
1) Dual AMD EPYC 7742, 8 x NVIDIA A100 40 GB
2) Intel i9-10980XE, NVIDIA GeForce RTX 2080 Ti
3) Intel i7-9750H, NVIDIA GeForce RTX 2080 Max-Q

All machines are set up with the same software:

* Ubuntu 20.10
* gcc-10
* CUDA 11.2 Update 2
* nv_peer_memory, built from latest source
* gdrcopy, built from latest source
* ucx, built from latest source
* mlnx_ofed, latest version

## Details of the problem

```shell
shell$ make -j 256 install
make[2]: Entering directory '/home/daniel/ompi/opal/tools/wrappers'
  CC       opal_wrapper.o
  CCLD     opal_wrapper
/usr/bin/ld: /usr/local/lib/libmca_common_cuda.so.0: undefined reference to `opal_cuda_add_initialization_function'
/usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined reference to `opal_cuda_memmove'
/usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined reference to `opal_cuda_memcpy'
/usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_cuda_convertor_init'
/usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined reference to `opal_cuda_check_bufs'
/usr/bin/ld: ../../../opal/.libs/libopen-pal.so: undefined reference to `opal_cuda_memcpy_sync'
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:1443: opal_wrapper] Error 1
make[2]: Leaving directory '/home/daniel/ompi/opal/tools/wrappers'
make[1]: *** [Makefile:1868: all-recursive] Error 1
make[1]: Leaving directory '/home/daniel/ompi/opal'
make: *** [Makefile:1437: all-recursive] Error 1
Command exited with non-zero status 2
104.88user 28.62system 0:50.56elapsed 264%CPU (0avgtext+0avgdata 22904maxresident)k
3608inputs+327112outputs (0major+7489985minor)pagefaults 0swaps
```
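
To make the failure easier to compare across the three machines, the linker errors can be reduced to a list of (referencing object, missing symbol) pairs. This is a minimal sketch, assuming the build output was saved to a file and that the messages use the GNU ld format shown above; `summarize_undefined` is a hypothetical helper name, not an Open MPI tool.

```shell
#!/bin/sh
# Hypothetical helper: list "<referencing object> <symbol>" pairs from a build log.
# GNU ld quotes the symbol as `name' (backtick, then apostrophe).
summarize_undefined() {
    # $1 is the path to the saved build log.
    sed -n "s/^.*ld: \(.*\): undefined reference to \`\([^']*\)'.*\$/\1 \2/p" "$1" | sort -u
}
```

With the log above saved as, say, `build.log`, running `summarize_undefined build.log` would list five symbols referenced from `../../../opal/.libs/libopen-pal.so` and one from `/usr/local/lib/libmca_common_cuda.so.0`. That last path sits under the install prefix rather than the build tree, which may or may not be relevant here. Whether a given shared library actually defines one of these symbols can be checked with `nm -D --defined-only <library> | grep opal_cuda`.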

