Skip to content

How to initialize smsc when btl/sm is not used? #10342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gkatev opened this issue Apr 29, 2022 · 8 comments · Fixed by #10897
Closed

How to initialize smsc when btl/sm is not used? #10342

gkatev opened this issue Apr 29, 2022 · 8 comments · Fixed by #10897

Comments

@gkatev
Copy link
Contributor

gkatev commented Apr 29, 2022

Hi, I'm using opal/smsc in a collectives component. When pml/ob1+btl/sm are used, all works correctly. However, if instead ucx is configured as the PML, smsc remains uninitialized. I noticed that btl/sm calls mca_smsc_base_select(), and tried calling that in my code, but even so, later calls to get_endpoint() fail.

So, how should I go about initializing smsc in my code when it's not initialized elsewhere?


In case the the mca_smsc_base_select call is all that is needed, consider this a bug report :-)

I'm on v5.0.0rc6

To reproduce:

diff --git a/ompi/mca/coll/sm/coll_sm_module.c b/ompi/mca/coll/sm/coll_sm_module.c
index ba3c62ce1c..b89f048f51 100644
--- a/ompi/mca/coll/sm/coll_sm_module.c
+++ b/ompi/mca/coll/sm/coll_sm_module.c
@@ -220,6 +220,7 @@ mca_coll_sm_comm_query(struct ompi_communicator_t *comm, int *priority)
     return &(sm_module->super);
 }
 
+#include "opal/mca/smsc/base/base.h"
 
 /*
  * Init module on the communicator
@@ -234,7 +235,21 @@ static int sm_module_enable(mca_coll_base_module_t *module,
                            ompi_comm_print_cid (comm), comm->c_name);
         return OMPI_ERROR;
     }
-
+    
+    if(mca_smsc == NULL) {
+        mca_smsc_base_select();
+        printf("smsc base init %s\n", (mca_smsc ? "success" : "fail"));
+    } else
+        printf("smsc already initialized\n");
+    
+    int rank = ompi_comm_rank(comm);
+    int comm_size = ompi_comm_size(comm);
+    
+    ompi_proc_t *peer = ompi_comm_peer_lookup(comm, (rank + 1) % comm_size);
+    mca_smsc_endpoint_t *smsc_ep = MCA_SMSC_CALL(get_endpoint, &peer->super);
+    
+    printf("smsc_ep = %p\n", smsc_ep);
+    
     /* We do everything lazily in ompi_coll_sm_enable() */
     return OMPI_SUCCESS;
 }
diff --git a/opal/mca/smsc/xpmem/smsc_xpmem_module.c b/opal/mca/smsc/xpmem/smsc_xpmem_module.c
index 6a3444a35d..4bb688f66c 100644
--- a/opal/mca/smsc/xpmem/smsc_xpmem_module.c
+++ b/opal/mca/smsc/xpmem/smsc_xpmem_module.c
@@ -42,6 +42,8 @@ mca_smsc_endpoint_t *mca_smsc_xpmem_get_endpoint(opal_proc_t *peer_proc)
     OPAL_MODEX_RECV_IMMEDIATE(rc, &mca_smsc_xpmem_component.super.smsc_version,
                               &peer_proc->proc_name, (void **) &modex, &modex_size);
     if (OPAL_UNLIKELY(OPAL_SUCCESS != rc)) {
+        printf("OPAL_MODEX_RECV_IMMEDIATE() failed @ smsc/xpmem get_endpoint\n");
+        
         OBJ_RELEASE(endpoint);
         return NULL;
     }
$ mpirun -n 2 --mca coll basic,libnbc,sm --mca coll_sm_priority 100 --mca smsc xpmem --mca pml ucx osu_bcast
smsc base init success
smsc base init success
OPAL_MODEX_RECV_IMMEDIATE() failed @ smsc/xpmem get_endpoint
smsc_ep = (nil)
OPAL_MODEX_RECV_IMMEDIATE() failed @ smsc/xpmem get_endpoint
smsc_ep = (nil)

With pml=ob1, all works ok:

smsc already initialized
smsc_ep = 0x2a4f8270
smsc already initialized
smsc_ep = 0x215581e0
@bosilca
Copy link
Member

bosilca commented May 2, 2022

The issue is that when OB1 is not the PML nobody stores the the modex info you are trying to read. The complete fix would be to also store the modex info in the open part of your component.

@gkatev
Copy link
Contributor Author

gkatev commented May 2, 2022

Oh I see! Do you happen to have any readily available resource about the modex's workings?

I see this modex send inside smsc/xpmem, that sends exactly what is needed in get_endpoint

OPAL_MODEX_SEND(rc, PMIX_LOCAL, &mca_smsc_xpmem_component.super.smsc_version, &modex,

Evidently this is not enough? Could you point to the code in pml/ob1 or btl/sm that does what I would need to do in my component?

I also tried placing the call to mca_smsc_base_select that triggers the above modex send earlier in the initialization chain (init_query, comm_query), but with no success.

@bosilca
Copy link
Member

bosilca commented May 2, 2022

Are you certain that function is called on all processes ?

@gkatev
Copy link
Contributor Author

gkatev commented May 2, 2022

Yes, I see that each process enters it exactly once.
Edit: I also see that the return code of OPAL_MODEX_SEND is successful (0)

@hjelmn
Copy link
Member

hjelmn commented May 3, 2022

Have you tried to call from within the coll/sm component initialization? That is generally where the modex is sent.

Also, might be worth chatting offline about your plans for coll/sm. I have been working on it as well to make it NUMA aware. The first cut is to remove all existing collectives and re-implement starting with barrier. So far it is beating Intel MPI's NUMA-aware barrier on a 2-NUMA node. I haven't had time to work on broadcast (the next obvious one).

@hjelmn
Copy link
Member

hjelmn commented May 3, 2022

Also, smsc was split out of btl/sm specifically for coll/sm so I am happy to help get that integration complete.

@gkatev
Copy link
Contributor Author

gkatev commented May 3, 2022

Where exactly is the component initialization? I have also tried placing the call in mca_coll_sm_init_query and mca_coll_sm_comm_query.

I don't have any plans for coll/sm specifically, here I just used it as a testing base. However, we have developed a new component for topology-aware collectives, and from your description it appears that there is overlap. I believe that the current plan is to open-source it at some point in the coming months.

@gkatev
Copy link
Contributor Author

gkatev commented Oct 6, 2022

I created #10897 to fix this. Please let me know how the proposed solution looks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants