Bus error with btl/sm+XPMEM in MPI_Finalize() #9868


Closed
gkatev opened this issue Jan 13, 2022 · 10 comments

gkatev (Contributor) commented Jan 13, 2022

Hi, I've been seeing crashes related to btl/sm and XPMEM during MPI_Finalize().

Environment:

Open MPI 5.0.x (from git, commit b640590)
CentOS 8, aarch64

Example execution:

$(which mpirun) --host localhost:160 --mca coll basic,libnbc --mca pml ob1 --mca btl sm,self --mca smsc xpmem osu_bcast

Backtrace:

Program terminated with signal SIGBUS, Bus error.
(gdb) bt
#0  0x0000ffffae4ed550 in mca_btl_sm_check_fboxes () at ../../../../opal/mca/btl/sm/btl_sm_fbox.h:241
#1  mca_btl_sm_component_progress () at btl_sm_component.c:578
#2  0x0000ffffae4717a8 in opal_progress () at runtime/opal_progress.c:224
#3  0x0000ffffaeaa625c in ompi_mpi_finalize () at runtime/ompi_mpi_finalize.c:299
#4  0x00000000004019f0 in main (argc=<optimized out>, argv=<optimized out>) at osu_bcast.c:119

I suspect it is related to XPMEM, because the crash goes away if I set smsc=cma (and because it is traditionally XPMEM that triggers bus errors?). This is an aarch64 system, but I can also reproduce the error on an x86 one. For what it's worth, I remember a similar (or the same?) bug from before smsc's time, so it might not be directly related to smsc.

hppritcha (Member) commented

@gkatev this is indeed due to XPMEM. Unlike the other shared-memory mechanisms used by Open MPI, the regions a process exports via XPMEM (and which attaching processes subsequently map into their virtual address space) become invalid once the exporting process exits. This is different from typical shared memory based on System V, POSIX, or memory-mapped files.
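
A minimal standalone sketch of this lifetime rule (not Open MPI code; it assumes an XPMEM-enabled kernel and omits error checking): a child attaches to a page exported by its parent, and touching the attachment after the parent has exited raises SIGBUS, which is essentially what the sm fast-box polling runs into during finalize.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <xpmem.h>

int main(void)
{
    size_t len = (size_t) sysconf(_SC_PAGESIZE);
    char *buf = aligned_alloc(len, len);
    strcpy(buf, "hello");

    /* Export the page so other processes may attach (permit mode 0666). */
    xpmem_segid_t segid = xpmem_make(buf, len, XPMEM_PERMIT_MODE, (void *) 0666);

    if (fork() == 0) {
        /* Child: attach to the parent's exported page. */
        xpmem_apid_t apid = xpmem_get(segid, XPMEM_RDWR, XPMEM_PERMIT_MODE, (void *) 0666);
        struct xpmem_addr addr = { .apid = apid, .offset = 0 };
        char *remote = xpmem_attach(addr, len, NULL);

        printf("parent alive: %s\n", remote);  /* works */
        sleep(2);                              /* parent exits in the meantime */
        printf("parent gone:  %s\n", remote);  /* SIGBUS: the attachment is now invalid */
        _exit(0);
    }

    sleep(1);
    return 0;  /* parent exits; its exported segment goes away */
}
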

We could make btl/sm smarter by backing the mailboxes with memory-mapped files rather than XPMEM, even when XPMEM is available.
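
A hedged sketch of that alternative (names and sizes are illustrative, not the actual btl/sm fast-box layout): back each mailbox with a POSIX shared-memory object, whose mapping in an attaching process stays valid even after the creating process exits.

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAILBOX_SIZE 4096  /* illustrative size only */

/* Creator: make the backing object and map it. */
void *mailbox_create(const char *name)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    ftruncate(fd, MAILBOX_SIZE);
    void *box = mmap(NULL, MAILBOX_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping keeps the object alive */
    return box;
}

/* Peer: attach to the same mailbox by name; this mapping survives the creator's exit. */
void *mailbox_attach(const char *name)
{
    int fd = shm_open(name, O_RDWR, 0600);
    void *box = mmap(NULL, MAILBOX_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return box;
}
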

hppritcha self-assigned this Jan 13, 2022
hppritcha (Member) commented

@gkatev how many ranks/nodes are you using when you see this problem?

jsquyres added this to the v5.0.0 milestone Jan 13, 2022
gkatev (Contributor, Author) commented Jan 13, 2022

160 ranks/cores on the arm64 system, and 64 on the x86 one. The similar (or same?) issue I remember from the past occurred on the 64-core system but not (or only rarely?) on another 32-core one, so rank count does sound like it could be a factor -- I will see if I can reproduce it with fewer cores/ranks.

hppritcha (Member) commented

No need. I have access to an aarch64/TX2 system with that kind of core count per node. It's sort of a race condition, so with fewer ranks per node you're less likely to observe this.

bosilca (Member) commented Jan 13, 2022

I think the issue is in ompi_mpi_finalize. We cannot wait for the PMIx barrier while calling opal_progress, because we become subject to exactly the kind of issue raised by the XPMEM support: a remote process releases the memory that the local process polls while we are still actively calling the BTL progress functions.

The simplest fix is to add a second PMIx barrier, one where we wait for completion without calling opal_progress (because we know the first barrier has already drained the network of all MPI-related messages).
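
In outline, and heavily simplified (a sketch of the idea, not the actual ompi_mpi_finalize code; it assumes Open MPI's internal headers and elides error handling and proper synchronization), the pattern would look something like:

#include <unistd.h>
#include <pmix.h>
#include "opal/runtime/opal_progress.h"

static volatile int fence_active;

/* Completion callback invoked when the non-blocking fence finishes. */
static void fence_cb(pmix_status_t status, void *cbdata)
{
    (void) status; (void) cbdata;
    fence_active = 0;
}

static void finalize_barriers_sketch(void)
{
    /* First barrier: keep calling opal_progress() while waiting, so the
     * BTLs drain any in-flight MPI traffic. */
    fence_active = 1;
    PMIx_Fence_nb(NULL, 0, NULL, 0, fence_cb, NULL);
    while (fence_active) {
        opal_progress();
    }

    /* Second barrier: wait *without* calling opal_progress().  The network
     * is already drained, and polling the sm fast boxes here could fault if
     * a peer's XPMEM-exported segment has already gone away. */
    fence_active = 1;
    PMIx_Fence_nb(NULL, 0, NULL, 0, fence_cb, NULL);
    while (fence_active) {
        usleep(100);  /* rely on PMIx's own progress thread */
    }
}
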

hppritcha (Member) commented

I don't like this fix, as it will not work with mpi_session_finalize.

bosilca (Member) commented Jan 13, 2022

I'm not sure what this has to do with sessions; in session_finalize you are not tearing down the BTL infrastructure.

hppritcha added a commit to hppritcha/ompi that referenced this issue Jan 14, 2022
xpmem has different behavior than other shared memory support mechanisms.
In particular, any xpmem-attached regions in a process will become invalid
once the exporting process exits.

Under certain circumstances, this behavior can result in SIGBUS errors
during mpi finalize.

Related to open-mpi#9868

Signed-off-by: Howard Pritchard <[email protected]>
hppritcha (Member) commented

@gkatev could you give #9880 a try?

gkatev (Contributor, Author) commented Jan 17, 2022

Yes, as far as I can tell that fixes the problem; I no longer see it on either of my systems.

hppritcha added a commit to hppritcha/ompi that referenced this issue Feb 1, 2022

hppritcha added a commit to hppritcha/ompi that referenced this issue Feb 2, 2022
(cherry picked from commit 8bac539)
hppritcha (Member) commented

Closed via #9954 and #9880.
