PMIx_Fences - remove unneeded ones during MPI initialization #11305
bot:aws-v2:retest
@jsquyres do you know how to retest this PR failure?
It's the new bot command:
bot:aws-v2:retest
Not sure this bot:aws-v2:retest command is actually working.
@hppritcha can you rebase this PR on the head of main? That will fix the CI issue. The long story: when we first rolled out the PR tester, we configured it to rebase each PR on its target branch and test the result. Because the target branch (main in this case) had the Jenkinsfile committed, Jenkins ran the test. The downside is that every commit to the target branch triggers a rebuild of every PR against that branch, which is ugh. So we changed the config to build the PR branch itself, which works fine for PRs opened after the Jenkinsfile was committed. But for PRs like yours that were last rebased before the Jenkinsfile was committed, Jenkins is now in a state where it doesn't know how to build the PR. Rebasing will pull in the Jenkinsfile and the test will run again.
1382a6d to b3ae758
bot:aws:retest
bot:ibm:retest
Signed-off-by: Howard Pritchard <[email protected]>
5b93b3b to 712f0e4
@jjhursey could you review this? I'd like to get this in before the 5.0.0 release.
This patch removes redundant PMIx Fences in the initialization procedure for MPI when using the World Process Model (WPM). See chapter 11 sections 2 and 3 of the MPI-4 standard for a discussion of the WPM and new Sessions model.
The patch does, however, require turning what should have been a local operation, the one that supports initialization of an MPI session, into a global one. Note that this does not disable the sessions feature; for now it just restricts sessions to use cases that are similar to MPI initialization using the WPM.
Refactoring to make ompi_mpi_instance_init_common purely local will require changes that would be too disruptive at this stage of the 5.0.0 release cycle. See issue #11239.
This patch also fixes up the timings reported when building using the timing infrastructure:
mpirun -np 8 ./ring_c
------------------ ompi_mpi_init ------------------
-- [opal_init_core.c:opal_init_util:opal_malloc_init]: 0.000031 / 0.000023 / 0.000043
-- [opal_init_core.c:opal_init_util:opal_show_help_init]: 0.000094 / 0.000085 / 0.000108
-- [opal_init_core.c:opal_init_util:opal_var_init]: 0.000002 / 0.000001 / 0.000003
-- [opal_init_core.c:opal_init_util:opal_var_cache]: 0.000399 / 0.000345 / 0.000442
-- [opal_init_core.c:opal_init_util:opal_arch_init]: 0.000057 / 0.000054 / 0.000065
-- [opal_init_core.c:opal_init_util:mca_base_open]: 0.000201 / 0.000178 / 0.000243
!! [opal_init_core.c:opal_init_util:total]: 0.000784 / 0.000686 / 0.000904
-- [opal_init.c:opal_init:opal_if_init]: 0.000074 / 0.000062 / 0.000084
-- [opal_init.c:opal_init:opal_init_psm]: 0.000010 / 0.000009 / 0.000011
-- [opal_init.c:opal_init:opal_net_init]: 0.000010 / 0.000008 / 0.000012
-- [opal_init.c:opal_init:opal_datatype_init]: 0.003596 / 0.000519 / 0.012865
!! [opal_init.c:opal_init:total]: 0.003689 / 0.000598 / 0.012972
-- [instance.c:ompi_mpi_instance_init_common:initialization]: 0.000991 / 0.000924 / 0.001064
-- [instance.c:ompi_mpi_instance_init_common:ompi_rte_init]: 0.007519 / 0.004406 / 0.016369
-- [instance.c:ompi_mpi_instance_init_common:PMIx_Commit]: 0.003164 / 0.002496 / 0.003640
-- [instance.c:ompi_mpi_instance_init_common:pmix-barrier-1]: 0.007725 / 0.000072 / 0.010423
-- [instance.c:ompi_mpi_instance_init_common:pmix-barrier-2]: 0.000138 / 0.000068 / 0.000159
-- [instance.c:ompi_mpi_instance_init_common:modex]: 0.000181 / 0.000115 / 0.000333
-- [instance.c:ompi_mpi_instance_init_common:modex-barrier]: 0.003143 / 0.002944 / 0.003308
-- [instance.c:ompi_mpi_instance_init_common:barrier]: 0.000373 / 0.000161 / 0.000618
!! [instance.c:ompi_mpi_instance_init_common:total]: 0.023234 / 0.011186 / 0.035914
[ompi_mpi_init.c:ompi_mpi_init:barrier-finish]: 0.023557 / 0.023051 / 0.024240
[ompi_mpi_init:total] 0.023557 / 0.023051 / 0.024240
[ompi_mpi_init:overhead]: 0.000240
The timing points can be refined by others depending on their needs.
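For anyone refining the timing points, a quick sanity check is that the `--` entries of each function add up to its `!!` total line. The Python sketch below is not part of the patch or of Open MPI; the `check_totals` helper and the `SAMPLE` constant are hypothetical names. It assumes the line layout shown in the report above and sums only the first of the three columns, which looks like a per-rank average because it always lies between the other two values.

```python
import re
from collections import defaultdict

# Matches report lines such as
#   -- [opal_init.c:opal_init:opal_if_init]: 0.000074 / 0.000062 / 0.000084
#   !! [opal_init.c:opal_init:total]: 0.003689 / 0.000598 / 0.012972
# The layout is inferred from the report above, not from the Open MPI sources.
LINE_RE = re.compile(
    r"^\s*(?P<mark>--|!!)\s+"
    r"\[(?P<file>[^:\]]+):(?P<func>[^:\]]+):(?P<point>[^\]]+)\]:\s*"
    r"(?P<first>\d+\.\d+)\s*/\s*(?P<second>\d+\.\d+)\s*/\s*(?P<third>\d+\.\d+)\s*$"
)


def check_totals(report, tolerance=5e-6):
    """Compare the sum of '--' entries with the '!!' total, per (file, function).

    Only the first column is summed. The tolerance absorbs the rounding of
    each printed value to six decimal places.
    Returns {(file, func): (summed, reported_total, within_tolerance)}.
    """
    sums = defaultdict(float)
    totals = {}
    for line in report.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # headers, the mpirun command, unmarked summary lines
        key = (m["file"], m["func"])
        value = float(m["first"])
        if m["mark"] == "!!":
            totals[key] = value
        else:
            sums[key] += value
    return {
        key: (sums[key], total, abs(sums[key] - total) <= tolerance)
        for key, total in totals.items()
    }


# A few lines copied from the report above.
SAMPLE = """\
-- [opal_init.c:opal_init:opal_if_init]: 0.000074 / 0.000062 / 0.000084
-- [opal_init.c:opal_init:opal_init_psm]: 0.000010 / 0.000009 / 0.000011
-- [opal_init.c:opal_init:opal_net_init]: 0.000010 / 0.000008 / 0.000012
-- [opal_init.c:opal_init:opal_datatype_init]: 0.003596 / 0.000519 / 0.012865
!! [opal_init.c:opal_init:total]: 0.003689 / 0.000598 / 0.012972
"""

if __name__ == "__main__":
    for (src, func), (summed, total, ok) in check_totals(SAMPLE).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{src}:{func}: sum={summed:.6f} total={total:.6f} {status}")
```

Unmarked lines such as the final ompi_mpi_init summary entries are skipped on purpose, since they do not follow the `--`/`!!` pattern.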
Related to #11166
Signed-off-by: Howard Pritchard [email protected]