
Deadlock with UCX when performing MPI_Fetch_and_op #6546

Closed
@devreal

Description

I have configured Open MPI 4.0.1 with Open UCX 1.5 and with IB verbs disabled (both Open MPI and Open UCX were compiled with --with-debug). I'm running a benchmark that performs a series of MPI_Fetch_and_op operations, with the target rank waiting for all operations from the other ranks to finish (by waiting on an ibarrier) while randomly performing local updates using MPI_Fetch_and_op. I'm attaching the code; it's the same as the one used in #6536.
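
In outline, the benchmark does roughly the following (a simplified sketch of the pattern; the attached file contains the full code, so the names, iteration count, and datatype here are only illustrative):

/* Sketch of the access pattern: rank 0 is the target and mixes local
 * MPI_Fetch_and_op updates with waiting on an MPI_Ibarrier, while the
 * other ranks issue MPI_Fetch_and_op against rank 0 and then join the
 * same ibarrier. */
#include <mpi.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int NUM_OPS = 1000;   /* illustrative iteration count */
    uint64_t *baseptr;
    MPI_Win win;
    MPI_Win_allocate(sizeof(uint64_t), sizeof(uint64_t),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &baseptr, &win);
    *baseptr = 0;
    MPI_Win_lock_all(0, win);
    MPI_Barrier(MPI_COMM_WORLD);

    uint64_t one = 1, result;

    if (rank == 0) {
        /* Target: wait for the others while performing local updates. */
        MPI_Request req;
        int done = 0;
        MPI_Ibarrier(MPI_COMM_WORLD, &req);
        while (!done) {
            MPI_Fetch_and_op(&one, &result, MPI_UINT64_T, 0 /* self */,
                             0, MPI_SUM, win);
            MPI_Win_flush(0, win);
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        }
    } else {
        /* Origin: issue fetch-and-op against rank 0, then join the ibarrier. */
        for (int i = 0; i < NUM_OPS; i++) {
            MPI_Fetch_and_op(&one, &result, MPI_UINT64_T, 0 /* target */,
                             0, MPI_SUM, win);
            MPI_Win_flush(0, win);
        }
        MPI_Request req;
        MPI_Ibarrier(MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Win_unlock_all(win);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}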

Running with 2 ranks on 2 nodes of our IB cluster, the application gets stuck in the first MPI_Fetch_and_op. DDT reports:

Processes,Threads,Function
1,1,main (mpi_fetch_op_local_remote.c:55)
1,1,  PMPI_Fetch_and_op
1,1,    ompi_osc_ucx_fetch_and_op
1,1,      ucp_worker_progress (ucp_worker.c:1426)
1,1,        uct_worker_progress (uct.h:1677)
1,1,          ucs_callbackq_dispatch (callbackq.h:209)
1,1,            uct_rc_verbs_iface_progress (rc_verbs_iface.c:116)
1,1,              uct_rc_verbs_iface_poll_tx (rc_verbs_iface.c:83)
1,1,                uct_ib_poll_cq (ib_device.h:289)
1,1,                  ibv_poll_cq (verbs.h:2056)
1,1,                    ??
1,1,main (mpi_fetch_op_local_remote.c:74)
1,1,  PMPI_Fetch_and_op
1,1,    ompi_osc_ucx_fetch_and_op
1,1,      ucp_atomic_fetch_nb (amo_send.c:141)
1,1,        ucp_rma_send_request_cb (rma.inl:20)
1,1,          ucp_request_send (ucp_request.inl:201)
1,1,            ucp_request_try_send (ucp_request.inl:168)
1,1,              ucp_amo_sw_progress_fetch (amo_sw.c:84)
1,1,                ucp_amo_sw_progress (amo_sw.c:59)
1,1,                  uct_ep_am_bcopy (uct.h:1892)
1,1,                    uct_self_ep_am_bcopy (self.c:280)
1,1,                      uct_self_iface_sendrecv_am (self.c:130)
1,1,                        uct_iface_invoke_am (uct_iface.h:535)
1,1,                          ucp_atomic_req_handler (amo_sw.c:235)
1,1,                            ucp_request_send (ucp_request.inl:201)
1,1,                              ucp_request_try_send (ucp_request.inl:168)
1,1,                                ucp_progress_atomic_reply (amo_sw.c:121)
1,1,                                  uct_ep_am_bcopy (uct.h:1892)
1,1,                                    uct_self_ep_am_bcopy (self.c:280)
1,1,                                      uct_self_iface_sendrecv_am (self.c:133)
1,1,                                        ucs_mpool_put_inline (mpool.inl:77)
1,1,                                          ucs_mpool_obj_to_elem (mpool.inl:64)
2,2,ucs_async_thread_func (thread.c:93)
2,2,  epoll_wait
2,4,progress_engine
2,4,  opal_libevent2022_event_base_loop (event.c:1630)
2,2,    epoll_dispatch (epoll.c:407)
2,2,      epoll_wait
2,2,    poll_dispatch (poll.c:165)
2,2,      poll

The upper process is the one writing to the target; the second process (mpi_fetch_op_local_remote.c:74) is the target performing local updates.

The example code:
mpi_fetch_op_local_remote.tar.gz

Build with:

$ mpicc mpi_fetch_op_local_remote.c -o mpi_fetch_op_local_remote -g

Run with:

$ mpirun -n 2 -N 1 ./mpi_fetch_op_local_remote

Things work without problems using the OpenIB adapter.
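
(For reference, the two paths can be selected via MCA parameters, e.g. --mca osc ucx for the UCX one-sided component versus --mca osc ^ucx --mca btl openib,self,vader for the openib path; the exact flags used in my openib runs may have differed.)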

Please let me know if I can provide more information. I hope the reproducer is helpful for someone.
