remove redundant PMIx fence calls at initialization #11212

Closed
wants to merge 1 commit into from

hppritcha
Member

Related to issue #11166.

It turns out that making these changes revealed some issues with the way MPI_Session_init is implemented. Unless async modex is turned on and collective info is disabled, the path through MPI initialization for Sessions isn't really local. The code in ompi_mpi_instance_init_common will eventually need to be refactored to move synchronizing elements up into ompi_mpi_init for the world process model, and into some delayed-until-comm-creation routine(s) when using the sessions process model.

Also, remove a second wait on a debugger. We should only do this once. At some point, a solution for attaching a debugger to an application which only uses the sessions process model will need to be implemented. Given the semantics of sessions, it's not clear when the right point is to synchronize processes with a debugger.
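To make the intended split concrete, here is a minimal sketch of the idea, not the actual Open MPI code. Every name in it (`mock_fence`, `instance_t`, `sync_once`, `instance_init`, `comm_create`) is hypothetical, and the fence is a counter standing in for a real collective. It assumes the world process model synchronizes once during initialization, while the sessions process model keeps initialization local and defers the single synchronization to the first communicator creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a collective fence: it only counts calls. */
static int fence_calls = 0;
static void mock_fence(void) { fence_calls++; }

/* Hypothetical process models, mirroring the two cases discussed above. */
typedef enum { MODEL_WORLD, MODEL_SESSIONS } process_model_t;

/* Hypothetical per-instance state; 'synchronized' guards the one-time fence. */
typedef struct {
    process_model_t model;
    bool synchronized;
} instance_t;

/* Perform the synchronizing step at most once per instance. */
static void sync_once(instance_t *inst)
{
    assert(inst != NULL);
    if (!inst->synchronized) {
        mock_fence();
        inst->synchronized = true;
    }
}

/* Initialization: only the world model synchronizes here. */
static void instance_init(instance_t *inst, process_model_t model)
{
    assert(inst != NULL);
    inst->model = model;
    inst->synchronized = false;
    if (model == MODEL_WORLD) {
        sync_once(inst);
    }
    /* MODEL_SESSIONS: nothing collective happens, so init stays local. */
}

/* Communicator creation: the sessions model pays for the fence here,
 * on first use; the world model has already synchronized. */
static void comm_create(instance_t *inst)
{
    sync_once(inst);
    /* ... communicator setup would follow here ... */
}
```

With this guard, repeated calls to `comm_create` on the same instance never trigger a second fence, which is the "only do this once" property the change is after.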

Signed-off-by: Howard Pritchard [email protected]

Contributor

@rhc54 rhc54 left a comment

That does indeed remove the duplication - whether the code belongs here or there is beyond my knowledge. However, this should get rid of the extra "fence" calls.

Thanks @hppritcha !

@hppritcha
Member Author

Curious about the Fortran test failures; rerunning.
bot:ompi:test

@hppritcha
Member Author

bot:ompi:retest

@hppritcha
Member Author

Closing this PR; I will open a fresh one with a simpler approach to address issue #11166.

@hppritcha hppritcha closed this Jan 5, 2023

2 participants