Description
I have configured Open MPI 4.0.1 with Open UCX 1.5 and IB verbs disabled (both Open MPI and Open UCX were compiled with `--with-debug`). I'm running a benchmark that performs a series of `MPI_Fetch_and_op` operations, with the target rank waiting for all operations by the other ranks to finish (by waiting on an ibarrier) while randomly performing local updates using `MPI_Fetch_and_op`. I'm attaching the code; it's the same that is used in #6536.
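For readers without the tarball, a minimal sketch of the access pattern described above (hypothetical reconstruction; the attached `mpi_fetch_op_local_remote.c` is the authoritative reproducer, and the iteration count and window layout here are assumptions):

```c
/* Sketch of the hang pattern: rank 0 is the target and performs local
 * MPI_Fetch_and_op updates while waiting on an MPI_Ibarrier; the other
 * rank issues remote MPI_Fetch_and_op calls against rank 0's window. */
#include <mpi.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    int rank;
    uint64_t *buf, one = 1, result;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win_allocate(sizeof(uint64_t), sizeof(uint64_t), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &buf, &win);
    *buf = 0;
    MPI_Win_lock_all(0, win);

    if (rank != 0) {
        /* remote atomic updates on the target (rank 0) */
        for (int i = 0; i < 1000; ++i) {
            MPI_Fetch_and_op(&one, &result, MPI_UINT64_T, 0, 0, MPI_SUM, win);
            MPI_Win_flush(0, win);
        }
    }

    MPI_Request req;
    int done = 0;
    MPI_Ibarrier(MPI_COMM_WORLD, &req);
    while (!done) {
        if (rank == 0) {
            /* target performs local updates while waiting for the barrier */
            MPI_Fetch_and_op(&one, &result, MPI_UINT64_T, 0, 0, MPI_SUM, win);
            MPI_Win_flush(0, win);
        }
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);
    }

    MPI_Win_unlock_all(win);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```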
Running with 2 ranks on 2 nodes of our IB cluster, the application gets stuck in the first `MPI_Fetch_and_op`. DDT reports:
```
Processes,Threads,Function
1,1,main (mpi_fetch_op_local_remote.c:55)
1,1, PMPI_Fetch_and_op
1,1, ompi_osc_ucx_fetch_and_op
1,1, ucp_worker_progress (ucp_worker.c:1426)
1,1, uct_worker_progress (uct.h:1677)
1,1, ucs_callbackq_dispatch (callbackq.h:209)
1,1, uct_rc_verbs_iface_progress (rc_verbs_iface.c:116)
1,1, uct_rc_verbs_iface_poll_tx (rc_verbs_iface.c:83)
1,1, uct_ib_poll_cq (ib_device.h:289)
1,1, ibv_poll_cq (verbs.h:2056)
1,1, ??
1,1,main (mpi_fetch_op_local_remote.c:74)
1,1, PMPI_Fetch_and_op
1,1, ompi_osc_ucx_fetch_and_op
1,1, ucp_atomic_fetch_nb (amo_send.c:141)
1,1, ucp_rma_send_request_cb (rma.inl:20)
1,1, ucp_request_send (ucp_request.inl:201)
1,1, ucp_request_try_send (ucp_request.inl:168)
1,1, ucp_amo_sw_progress_fetch (amo_sw.c:84)
1,1, ucp_amo_sw_progress (amo_sw.c:59)
1,1, uct_ep_am_bcopy (uct.h:1892)
1,1, uct_self_ep_am_bcopy (self.c:280)
1,1, uct_self_iface_sendrecv_am (self.c:130)
1,1, uct_iface_invoke_am (uct_iface.h:535)
1,1, ucp_atomic_req_handler (amo_sw.c:235)
1,1, ucp_request_send (ucp_request.inl:201)
1,1, ucp_request_try_send (ucp_request.inl:168)
1,1, ucp_progress_atomic_reply (amo_sw.c:121)
1,1, uct_ep_am_bcopy (uct.h:1892)
1,1, uct_self_ep_am_bcopy (self.c:280)
1,1, uct_self_iface_sendrecv_am (self.c:133)
1,1, ucs_mpool_put_inline (mpool.inl:77)
1,1, ucs_mpool_obj_to_elem (mpool.inl:64)
2,2,ucs_async_thread_func (thread.c:93)
2,2, epoll_wait
2,4,progress_engine
2,4, opal_libevent2022_event_base_loop (event.c:1630)
2,2, epoll_dispatch (epoll.c:407)
2,2, epoll_wait
2,2, poll_dispatch (poll.c:165)
2,2, poll
```
The upper stack is the process writing to the target; the second process (at `mpi_fetch_op_local_remote.c:74`) is the target performing local updates.
The example code: mpi_fetch_op_local_remote.tar.gz

Build with:

```
$ mpicc mpi_fetch_op_local_remote.c -o mpi_fetch_op_local_remote -g
```

Run with:

```
$ mpirun -n 2 -N 1 ./mpi_fetch_op_local_remote
```
Things work without problems when using the OpenIB adapter instead.
Please let me know if I can provide more information. I hope the reproducer is helpful.