singleton mode with v4.x fails when MCA binding policy is "numa" on AMD hardware #11097

Description

Background information

What version of Open MPI are you using?

v4.1.2

Describe how Open MPI was installed

From the Ubuntu 22.04 package manager: libopenmpi-dev 4.1.2-2ubuntu1 and hwloc 2.7.0-2.

Also reproducible when building from source in a Docker container:

dpkg-buildpackage commands:
echo "deb-src http://archive.ubuntu.com/ubuntu/ jammy universe" >> /etc/apt/sources.list
apt-get update
apt-get install -y fakeroot
apt-get source libopenmpi3
apt-get build-dep -y libopenmpi3
cd openmpi-4.1.2/
dpkg-buildpackage -rfakeroot -b
cp -r debian /local/openmpi-debian-patched # copy to mounted folder to use on host machine

Please describe the system on which you are running

  • Operating system/version: Ubuntu 22.04
  • Computer hardware: reproducible on the following CPUs:
    • AMD Ryzen Threadripper 1950X 16-core processor with hyperthreading enabled
    • AMD EPYC 7351P 16-core processor with hyperthreading enabled
  • Network type: not relevant

Details of the problem

On the AMD Ryzen and AMD EPYC machines listed above, the MCA binding policy "numa" fails to set the processor affinity and aborts with a fatal error when the executable runs in singleton mode. Running the same executable with mpiexec -n 1 works around the error.

MWE:

#include <mpi.h>
int main() {
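  // MPI_Init takes the singleton launch path here; the abort below happens
  // while the binding policy is applied during initialization.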
  MPI_Init(NULL, NULL);
  MPI_Finalize();
}

Error message:

$ mpicxx mwe.cc
$ OMPI_MCA_hwloc_base_binding_policy="l3cache" ./a.out ; echo $?
0
$ OMPI_MCA_hwloc_base_binding_policy="none" ./a.out ; echo $?
0
$ OMPI_MCA_hwloc_base_binding_policy="core" ./a.out ; echo $?
0
$ OMPI_MCA_hwloc_base_binding_policy="numa" ./a.out ; echo $?
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  Setting processor affinity failed failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[coyote10:741448] Local abort before MPI_INIT completed completed successfully,
but am not able to aggregate error messages, and not able to guarantee that all
other processes were killed!
1
$ OMPI_MCA_hwloc_base_binding_policy="numa" mpiexec -n 1 ./a.out ; echo $?
0

The issue also existed in v4.0.3, where it could be worked around with a binary patch to libopen-rte.so that replaced the value HWLOC_OBJ_NODE = 0xd with 0xc in the code compiled from orte/mca/ess/base/ess_base_fns.c#L242. This workaround is no longer possible in v4.1.2.
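For reference, and not part of the original report: a minimal hwloc probe along the lines of the sketch below (assuming libhwloc-dev is installed; the file name and build line are only illustrative) can show how hwloc 2.x reports NUMA nodes (HWLOC_OBJ_NUMANODE = 0xd) versus Group objects (HWLOC_OBJ_GROUP = 0xc) on the affected machines.

// check_hwloc.c -- hedged diagnostic sketch, not from the original report
// Build: gcc check_hwloc.c -o check_hwloc -lhwloc
#include <hwloc.h>
#include <stdio.h>

int main(void) {
  hwloc_topology_t topo;
  hwloc_topology_init(&topo);
  hwloc_topology_load(topo);
  // hwloc 2.x reports NUMA nodes at a special virtual depth rather than in
  // the main object tree, which matters for how a "numa" binding target is
  // looked up on these CPUs.
  printf("NUMANODE (0xd): depth %d, count %d\n",
         hwloc_get_type_depth(topo, HWLOC_OBJ_NUMANODE),
         hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE));
  printf("GROUP    (0xc): depth %d, count %d\n",
         hwloc_get_type_depth(topo, HWLOC_OBJ_GROUP),
         hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_GROUP));
  hwloc_topology_destroy(topo);
  return 0;
}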
