Compatibility of OpenMPI 4.1.8-Compiled Binaries with OpenMPI 5.0.7 #13253

Closed
puneet336 opened this issue May 15, 2025 · 4 comments

Comments

@puneet336

puneet336 commented May 15, 2025

Hi Team,

We have some binary applications compiled with OpenMPI 4.1.8 on Rocky Linux 9.5. When I load the OpenMPI 5.0.7 environment on the same server and run ldd on one of these binaries, I see the following output:

linux-vdso.so.1 (0x00007fffa29d8000)
libmpi_cxx.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libmpi_cxx.so.40 (0x00007fe73f511000)
libmpi.so.40 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi.so.40 (0x00007fe73f000000)
libopen-pal.so.80 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libopen-pal.so.80 (0x00007fe73eb0e000)

The path to OpenMPI 4's libmpi_cxx is not present in LD_LIBRARY_PATH, so the OpenMPI 4 libmpi_cxx.so.40 is probably being picked up through an rpath (see #13252)?
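One way to test the rpath guess is to look at the dynamic section of the binary. The sketch below is only an illustration: `rpath_entries` is a hypothetical helper name, and it assumes the usual `readelf -d` output format, in which such entries are tagged `(RPATH)` or `(RUNPATH)`.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print only
# the RPATH/RUNPATH entries, if there are any.
rpath_entries() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -E '\((RPATH|RUNPATH)\)' || true
}

# Usage (binary name taken from the report above):
#   readelf -d ./application1.bin | rpath_entries
```

If a line pointing at the 4.1.8 install tree shows up, that would explain why libmpi_cxx.so.40 resolves there even though the directory is not in LD_LIBRARY_PATH.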

I understand that libmpi_cxx.so.40 is not part of OpenMPI 5.0.7, and that trying to enable the C++ MPI bindings with --enable-mpi-cxx produces the following warning and error during configuration:

configure: WARNING: The MPI C++ bindings were deprecated in MPI-2.2 (2009), removed in MPI-3.0 (2012), and dropped from OpenMPI v5.0.0 (2022).

configure: error: Build cannot continue.

Despite this, I attempted to run the binary using mpirun from OpenMPI v5 and encountered the following runtime error:

./application1.bin: symbol lookup error: /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libmpi_cxx.so.40: undefined symbol: ompi_mpi_errors_throw_exceptions
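The error names one specific symbol, so it can help to check which of the candidate libraries actually defines it. The following is a minimal sketch, not a definitive tool: `defines_symbol` is a hypothetical helper, and it assumes GNU `nm` output in which the symbol name is the last field on each line (any `@VERSION` suffix is stripped before comparing).

```shell
# Hypothetical helper: read `nm -D --defined-only` output on stdin and
# succeed (exit 0) only if the named symbol is listed as defined.
defines_symbol() {
    awk -v s="$1" '
        { name = $NF; sub(/@.*/, "", name); if (name == s) found = 1 }
        END { exit !found }
    '
}

# Usage (paths taken from the report above):
#   nm -D --defined-only /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi.so.40 \
#       | defines_symbol ompi_mpi_errors_throw_exceptions && echo defined
```

Running the same check against the libmpi.so.40 in each install tree shows which of them can satisfy the lookup.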

My question:
Is there a way to run binaries compiled with OpenMPI 4.1.8 using an OpenMPI 5.0.7 environment? Or is it generally not supported to run such MPI applications across these major version changes?

I understand that recompiling from source against OpenMPI 5.0.7 is the recommended path forward if the source is available. Please advise.

@jsquyres
Member

Generally, binaries compiled with Open MPI v4.1.x should be forward compatible with Open MPI v5.0.x.

However, you have hit a feature that was actually removed in Open MPI v5.0.x: the MPI C++ bindings. As the configure error message indicates, the MPI C++ bindings were removed by the MPI Forum (i.e., the standards body that governs the MPI specification document) well over a decade ago. Hence, there's a whole library (libmpi_cxx) that is not present in Open MPI v5.0.x at all -- even if you attempt to recompile the application with Open MPI v5.0.x, it will likely fail to compile because the MPI C++ bindings classes and declarations are not available. I'm afraid that the only option for MPI applications that use the long-since deprecated and removed MPI C++ bindings is to stay with Open MPI v4.1.x.

A more long-term solution may be to update the application to not use the MPI C++ bindings (e.g., convert the application to use the MPI C bindings). I know that this is additional work, but the MPI C++ bindings won't be coming back in the form that existed back in MPI-2 days. Sorry! ☹

@ggouaillardet
Contributor

@jsquyres is absolutely correct.

That being said, if the application does not explicitly use the MPI C++ bindings, the error may simply be overkill on the linker's part.
You can try building a dummy libmpi_cxx.so.40, installing it in /home/puneethpc_poc1/openmpi-5.0.7/install/lib/, and seeing how it goes. It is definitely not a fix, but a hack that could save you some time.

@puneet336
Author

Thank you for the insights and suggestions, @jsquyres and @ggouaillardet.
Here is the ldd output before placing a dummy libmpi_cxx.so.40 in LD_LIBRARY_PATH:

        linux-vdso.so.1 (0x00007fff319a6000)
        libmpi_cxx.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libmpi_cxx.so.40 (0x00007f2a47b8e000)
        libmpi.so.40 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi.so.40 (0x00007f2a47800000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2a47400000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f2a47725000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2a47b70000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f2a47000000)
        libopen-rte.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libopen-rte.so.40 (0x00007f2a4766c000)
        libopen-pal.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libopen-pal.so.40 (0x00007f2a472f7000)
        libopen-pal.so.80 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libopen-pal.so.80 (0x00007f2a46f0e000)
        libpmix.so.2 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libpmix.so.2 (0x00007f2a46c00000)
        libevent_core-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_core-2.1.so.7 (0x00007f2a47636000)
        libevent_pthreads-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_pthreads-2.1.so.7 (0x00007f2a47b67000)
        libhwloc.so.15 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libhwloc.so.15 (0x00007f2a4729e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2a47bab000)

Using the following code,

// Stub definition of the symbol reported as undefined in the runtime error above.
extern "C" void ompi_mpi_errors_throw_exceptions() {}

I prepared a stub shared library:

g++ -shared -fPIC -o libmpi_cxx.so.40 dummy_mpi_cxx.cpp

Here is the ldd output after building the stub library:

        linux-vdso.so.1 (0x00007ffd105d7000)
        libmpi_cxx.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libmpi_cxx.so.40 (0x00007f794b1d7000)
        libmpi.so.40 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi.so.40 (0x00007f794ae00000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f794aa00000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f794ad25000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f794b1b9000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f794a600000)
        libopen-rte.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libopen-rte.so.40 (0x00007f794ac6c000)
        libopen-pal.so.40 => /home/puneethpc_poc1/openmpi-4.1.8/install/lib/libopen-pal.so.40 (0x00007f794a8f7000)
        libopen-pal.so.80 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libopen-pal.so.80 (0x00007f794a50e000)
        libpmix.so.2 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libpmix.so.2 (0x00007f794a200000)
        libevent_core-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_core-2.1.so.7 (0x00007f794b17f000)
        libevent_pthreads-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_pthreads-2.1.so.7 (0x00007f794b17a000)
        libhwloc.so.15 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libhwloc.so.15 (0x00007f794a89e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f794b1f4000)

It seems that, along with libmpi_cxx, libopen-rte.so is also unavailable in OpenMPI v5, but the application still ran successfully.
I see that there are multiple references to libopen-pal.so (one from v5, another from v4),
so I moved the OpenMPI 4 library directory out of the way:

 mv /home/puneethpc_poc1/openmpi-4.1.8/install/lib/ /home/puneethpc_poc1/openmpi-4.1.8/install/lib1

Now ldd reports:

        linux-vdso.so.1 (0x00007ffc55391000)
        libmpi_cxx.so.40 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi_cxx.so.40 (0x00007f52a4ee3000)
        libmpi.so.40 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libmpi.so.40 (0x00007f52a4a00000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f52a4600000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f52a4e04000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f52a4dea000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f52a4200000)
        libopen-pal.so.80 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libopen-pal.so.80 (0x00007f52a490e000)
        libpmix.so.2 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libpmix.so.2 (0x00007f52a3e00000)
        libevent_core-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_core-2.1.so.7 (0x00007f52a4db2000)
        libevent_pthreads-2.1.so.7 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libevent_pthreads-2.1.so.7 (0x00007f52a4dad000)
        libhwloc.so.15 => /home/puneethpc_poc1/openmpi-5.0.7/install/lib/libhwloc.so.15 (0x00007f52a4d52000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f52a4eea000)

The run is still successful.

In summary, with the dummy/stub mpi_cxx library approach, the binary works as expected.
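To double-check a result like this without reading the whole listing by eye, the ldd output can be filtered by install prefix. This is a rough sketch under stated assumptions: `libs_under_prefix` is a hypothetical helper, and it relies on the `name => path (address)` line format shown in the listings above.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the name of
# every library whose resolved path starts with the given prefix.
libs_under_prefix() {
    prefix=$1
    awk -v p="$prefix" '$2 == "=>" && index($3, p) == 1 { print $1 }'
}

# Usage (paths taken from the listings above):
#   ldd ./application1.bin | libs_under_prefix /home/puneethpc_poc1/openmpi-4.1.8/
```

Empty output for the 4.1.8 prefix would indicate that nothing is being pulled from the old install tree any more.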

@jsquyres
Member

Glad it worked for you!
