ompi_mpi_instance_init_common needs to be local #11239
Related to #11212.
This patch removes redundant PMIx Fences in the initialization procedure for MPI when using the World Process Model (WPM). See chapter 11 sections 2 and 3 of the MPI-4 standard for a discussion of the WPM and new Sessions model.

The patch does, however, turn what should have been a local operation to support initialization of an MPI session into a global one. Note this does not disable the sessions feature but just restricts when it will work at this point to use cases that are similar to MPI initialization using the WPM.

Refactoring to make ompi_mpi_instance_init_common purely local will require changes that would be too disruptive for the current state of the 5.0.0 release cycle. See issue open-mpi#11239.

This patch also fixes up the timings reported when building using the timing infrastructure:

    mpirun -np 8 ./ring_c
    ------------------ ompi_mpi_init ------------------
     -- [opal_init_core.c:opal_init_util:opal_malloc_init]: 0.000031 / 0.000023 / 0.000043
     -- [opal_init_core.c:opal_init_util:opal_show_help_init]: 0.000094 / 0.000085 / 0.000108
     -- [opal_init_core.c:opal_init_util:opal_var_init]: 0.000002 / 0.000001 / 0.000003
     -- [opal_init_core.c:opal_init_util:opal_var_cache]: 0.000399 / 0.000345 / 0.000442
     -- [opal_init_core.c:opal_init_util:opal_arch_init]: 0.000057 / 0.000054 / 0.000065
     -- [opal_init_core.c:opal_init_util:mca_base_open]: 0.000201 / 0.000178 / 0.000243
     !! [opal_init_core.c:opal_init_util:total]: 0.000784 / 0.000686 / 0.000904
     -- [opal_init.c:opal_init:opal_if_init]: 0.000074 / 0.000062 / 0.000084
     -- [opal_init.c:opal_init:opal_init_psm]: 0.000010 / 0.000009 / 0.000011
     -- [opal_init.c:opal_init:opal_net_init]: 0.000010 / 0.000008 / 0.000012
     -- [opal_init.c:opal_init:opal_datatype_init]: 0.003596 / 0.000519 / 0.012865
     !! [opal_init.c:opal_init:total]: 0.003689 / 0.000598 / 0.012972
     -- [instance.c:ompi_mpi_instance_init_common:initialization]: 0.000991 / 0.000924 / 0.001064
     -- [instance.c:ompi_mpi_instance_init_common:ompi_rte_init]: 0.007519 / 0.004406 / 0.016369
     -- [instance.c:ompi_mpi_instance_init_common:PMIx_Commit]: 0.003164 / 0.002496 / 0.003640
     -- [instance.c:ompi_mpi_instance_init_common:pmix-barrier-1]: 0.007725 / 0.000072 / 0.010423
     -- [instance.c:ompi_mpi_instance_init_common:pmix-barrier-2]: 0.000138 / 0.000068 / 0.000159
     -- [instance.c:ompi_mpi_instance_init_common:modex]: 0.000181 / 0.000115 / 0.000333
     -- [instance.c:ompi_mpi_instance_init_common:modex-barrier]: 0.003143 / 0.002944 / 0.003308
     -- [instance.c:ompi_mpi_instance_init_common:barrier]: 0.000373 / 0.000161 / 0.000618
     !! [instance.c:ompi_mpi_instance_init_common:total]: 0.023234 / 0.011186 / 0.035914
    [ompi_mpi_init.c:ompi_mpi_init:barrier-finish]: 0.023557 / 0.023051 / 0.024240
    [ompi_mpi_init:total] 0.023557 / 0.023051 / 0.024240
    [ompi_mpi_init:overhead]: 0.000240

The timing points can be refined by others depending on their needs.

Related to open-mpi#11166

Signed-off-by: Howard Pritchard <[email protected]>
@hppritcha Just wondering: I have been working on a side-branch to ensure that PMIx_Group_construct properly returns all the job-level info and modex info for all group participants. If that is correctly completed, would that remove this problem? Just trying to understand why "add_procs" would require a fence operation if all the job-level and modex info were known.
I'm not sure that would help. I'd need to double check, but I'm pretty sure one thing that messes things up is the way smsc tries to get itself set up using the IMMEDIATE attribute for a pmix_get operation.
Precisely -
Had a few minutes and took a peek at that component, and you are correct. The way that was done forces a "fence" to be called prior to getting the endpoint. What you want to do is remove the "immediate" attribute. The PMIx library will then call up to the server to see if the value is available there. Since the peer is a local one, the PMIx server will not pass the request up to the host daemon for a "dmodex" operation. Instead, it will wait for the local peer to commit it (the "immediate" attribute causes the server to return "not found" without waiting). You technically don't need a timeout, but if you want to ensure you don't hang in the component (e.g., if the peer never provides its info for some reason), then just replace the "immediate" attribute with a "timeout" and provide some reasonable value (e.g., 2 secs). You'll probably have to just directly call
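The difference between an "immediate" lookup and a lookup that waits with a timeout can be pictured with a small toy model. This is only a sketch built on pthreads, not the PMIx API: toy_slot_t, toy_commit, toy_get_immediate, toy_get_timeout and toy_peer are hypothetical names invented for illustration, and the condition variable merely stands in for the server waiting on a local peer's commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Toy stand-in for one value a local peer publishes. NOT the PMIx API. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  committed_cv;
    bool            committed;
    int             endpoint;
} toy_slot_t;

enum { TOY_SUCCESS = 0, TOY_NOT_FOUND = -1, TOY_TIMEOUT = -2 };

static void toy_slot_init(toy_slot_t *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->committed_cv, NULL);
    s->committed = false;
    s->endpoint = 0;
}

/* Peer side: publish the value and wake any waiting reader. */
static void toy_commit(toy_slot_t *s, int endpoint)
{
    pthread_mutex_lock(&s->lock);
    s->endpoint = endpoint;
    s->committed = true;
    pthread_cond_broadcast(&s->committed_cv);
    pthread_mutex_unlock(&s->lock);
}

/* "Immediate" lookup: report not-found right away if nothing is committed,
 * so the caller needs an earlier fence to be sure the value is there. */
static int toy_get_immediate(toy_slot_t *s, int *out)
{
    int rc = TOY_NOT_FOUND;
    pthread_mutex_lock(&s->lock);
    if (s->committed) {
        *out = s->endpoint;
        rc = TOY_SUCCESS;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

/* Lookup with a timeout: wait for the peer's commit, but give up after
 * timeout_sec seconds so a missing peer cannot hang the caller. */
static int toy_get_timeout(toy_slot_t *s, int timeout_sec, int *out)
{
    struct timespec deadline;
    int rc = TOY_SUCCESS;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&s->lock);
    while (!s->committed) {
        if (pthread_cond_timedwait(&s->committed_cv, &s->lock, &deadline) == ETIMEDOUT) {
            break;
        }
    }
    if (s->committed) {
        *out = s->endpoint;
    } else {
        rc = TOY_TIMEOUT;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

/* Hypothetical local peer that commits its value after a short delay. */
static void *toy_peer(void *arg)
{
    struct timespec delay = { 0, 50 * 1000 * 1000 }; /* 50 ms */
    nanosleep(&delay, NULL);
    toy_commit((toy_slot_t *) arg, 42);
    return NULL;
}
```

In this model a reader that calls toy_get_immediate before the peer has committed gets TOY_NOT_FOUND, which is why a preceding fence is needed, while toy_get_timeout with a 2 second limit simply waits for the commit and needs no fence.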
Thanks for the info. I wonder why that IMMEDIATE option was being used in the first place.
In fact, I think the way that component (and perhaps the others in that framework as well) uses PMIx is incorrect. These are shared memory components, and so they should not be posting their modex information on the "global" scope. They should be putting it solely on |
ompi_mpi_instance_init_common is invoked when an application uses either MPI_Init or MPI_Session_init. The latter function has purely local semantics for the invoking application. As it stands, though, ompi_mpi_instance_init_common has to perform a PMIx fence involving all procs in a "job", because some PMLs require this for the add_procs method to work correctly, even if only the "local" procs are included in the argument array.

The add_procs call and the preceding PMIx Fence need to be moved into the MPI_Init-specific code. Code needs to be added to MPI_Comm_create_from_group and MPI_Intercomm_create_from_groups to do the add_procs at that point. This may require either new PML methods or modifications to add_procs methods that don't check for previously added procs.