Skip to content

pml/ob1: fixed local handle sent during PUT control message #6556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

EmmanuelBRELLE
Copy link
Contributor

In case of using a btl_put in ob1, the handle of the locally registered
memory is sent with a PUT control message. In the current master code
the sent handle is necessary the handle in the frag but if the handle
has been successfully registered in the request, the frag structure does
not have any valid handle and all fragments use the request one.

I suggest to check if the handle in the fragment is valid and if not to
send the handle from the request.

Signed-off-by: Brelle Emmanuel [email protected]

In case of using a btl_put in ob1, the handle of the locally registered
memory is sent with a PUT control message. In the current master code
the sent handle is necessary the handle in the frag but if the handle
has been successfully registered in the request, the frag structure does
not have any valid handle and all fragments use the request one.

I suggest to check if the handle in the fragment is valid and if not to
send the handle from the request.

Signed-off-by: Brelle Emmanuel <[email protected]>
@ompiteam-bot
Copy link

Can one of the admins verify this patch?

@hjelmn
Copy link
Member

hjelmn commented Apr 1, 2019

ok to test

@zuzuki
Copy link

zuzuki commented Apr 4, 2019

I think about the following correction for the problem that
the btl_register_mem function is called twice during communication
(#2486).
I will modify it to use the acquired memory if register_mem is already
called in rdma_btls.
My correction copies the acquisition processing of a memory handle
in the mca_pml_ob1_send_request_put_frag function to
the mca_pml_ob1_recv_request_put_frag function as it is.
It not only fixes getting a memory handle, but also avoids calling register_mem
again in the mca_pml_ob1_recv_request_schedule_once function.
It is limited to rdma put.
The code around here is very complex.
I think deep review by the maintainer is needed.

diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.c b/ompi/mca/pml/ob1/pml_ob1_recvreq.c
index ea5fa6849d..3338c79f87 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.c
+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.c
@@ -411,6 +411,7 @@ static int mca_pml_ob1_recv_request_put_frag (mca_pml_ob1_rdma_frag_t *frag)
     mca_pml_ob1_rdma_hdr_t *hdr;
     size_t reg_size;
     int rc;
+    mca_btl_base_registration_handle_t *local_handle = NULL;
 
     reg_size = bml_btl->btl->btl_registration_handle_size;
 
@@ -423,11 +424,33 @@ static int mca_pml_ob1_recv_request_put_frag (mca_pml_ob1_rdma_frag_t *frag)
     }
     ctl->des_cbfunc = mca_pml_ob1_recv_ctl_completion;
 
+    if(bml_btl->btl->btl_register_mem && NULL == frag->local_handle){
+        /* Check if the segment is already registered */
+        for (size_t i = 0 ; i < recvreq->req_rdma_cnt ; ++i) {
+            if (recvreq->req_rdma[i].bml_btl == frag->rdma_bml) {
+                /* do not copy the handle to the fragment to avoid deregistring it twice */
+                local_handle = recvreq->req_rdma[i].btl_reg;
+                break;
+            }
+        }
+        if (NULL == local_handle) {
+            /* Not already registered. Register the region with the BTL. */
+            mca_bml_base_register_mem (bml_btl, frag->local_address, frag->rdma_length, MCA_BTL_REG_FLAG_REMOTE_WRITE,
+                                       &frag->local_handle);
+            if (OPAL_UNLIKELY(NULL == frag->local_handle)) {
+                return OMPI_ERR_OUT_OF_RESOURCE;
+            }
+            local_handle = frag->local_handle;
+        }
+    }else{
+        local_handle = frag->local_handle;
+    }
+
     /* fill in rdma header */
     hdr = (mca_pml_ob1_rdma_hdr_t *) ctl->des_segments->seg_addr.pval;
     mca_pml_ob1_rdma_hdr_prepare (hdr, (!recvreq->req_ack_sent) ? MCA_PML_OB1_HDR_TYPE_ACK : 0,
                                   recvreq->remote_req_send.lval, frag, recvreq, frag->rdma_offset,
-                                  frag->local_address, frag->rdma_length, frag->local_handle,
+                                  frag->local_address, frag->rdma_length, local_handle,
                                   reg_size);
     ob1_hdr_hton(hdr, MCA_PML_OB1_HDR_TYPE_PUT, proc);
 
@@ -1026,15 +1049,6 @@ int mca_pml_ob1_recv_request_schedule_once( mca_pml_ob1_recv_request_t* recvreq,
         opal_convertor_get_current_pointer (&recvreq->req_recv.req_base.req_convertor, &data_ptr);
         OPAL_THREAD_UNLOCK(&recvreq->lock);
 
-        if (btl->btl_register_mem) {
-            mca_bml_base_register_mem (bml_btl, data_ptr, size, MCA_BTL_REG_FLAG_REMOTE_WRITE,
-                                       &frag->local_handle);
-            if (OPAL_UNLIKELY(NULL == frag->local_handle)) {
-                MCA_PML_OB1_RDMA_FRAG_RETURN(frag);
-                continue;
-            }
-        }
-
         /* fill in the minimum information needed to handle the fin message */
         frag->cbfunc        = mca_pml_ob1_put_completion;
         frag->rdma_length   = size;
diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.c b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
index 1626e13e35..2bad26fd7c 100644
--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.c
+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
@@ -1174,7 +1174,7 @@ int mca_pml_ob1_send_request_put_frag( mca_pml_ob1_rdma_frag_t *frag )
             }
         }
 
-        if (NULL == frag->local_handle) {
+        if (NULL == local_handle) {
             /* Not already registered. Register the region with the BTL. */
             mca_bml_base_register_mem (bml_btl, frag->local_address, frag->rdma_length, 0,
                                        &frag->local_handle);
(END)

@hjelmn
Copy link
Member

hjelmn commented Apr 4, 2019

I plan to review this in the next couple of days.

@EmmanuelBRELLE
Copy link
Contributor Author

@zuzuki, if I am not mistaking the mca_pml_ob1_recv_request_put_frag function may also be called after a rget control message (if the btl_get failed with the error OMPI_ERR_NOT_AVAILABLE ). In this case the memory may have already been registered and the handle would be stored in the request.
Your patch would still register the memory twice. Moreover if a handle has been created for the whole request, creating new handle for each fragment would introduce performances degradation.

@jsquyres
Copy link
Member

@hjelmn You said you were going to have a look a look. Do you have the cycles to review?

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 This patch brings the PUT path in sync with the GET path.

@bosilca
Copy link
Member

bosilca commented Apr 27, 2019

This PR and #6621 should go in the all branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants