-
Notifications
You must be signed in to change notification settings - Fork 900
ompi,opal/progress: preserve the OPAL_EVLOOP_ONCE flag for opal_event_loop #9480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks correct to me
ompi/runtime/ompi_mpi_init.c
Outdated
@@ -999,7 +999,7 @@ int ompi_mpi_init(int argc, char **argv, int requested, int *provided, | |||
CPU utilization for the remainder of MPI_INIT when we are | |||
blocking on RTE-level events, but may greatly reduce non-TCP | |||
latency. */ | |||
opal_progress_set_event_flag(OPAL_EVLOOP_NONBLOCK); | |||
opal_progress_add_event_flag(OPAL_EVLOOP_NONBLOCK); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opal_progress_add_event_flag
returns the old flag. I would prefer not to add a second function but instead do
int old_flag = opal_progress_set_event_flag(OPAL_EVLOOP_NONBLOCK);
if( !(old_flag & OPAL_EVLOOP_NONBLOCK) ) {
opal_progress_set_event_flag( old_flag | OPAL_EVLOOP_NONBLOCK );
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I think this is a great idea! I removed the commit the introduced the new function, and implemented the commit as you suggested (with minor adjustment).
opal/runtime/opal_progress.h
Outdated
* OPAL_EVLOOP_NONBLOCK | OPAL_EVLOOP_ONCE | ||
* | ||
* OPAL_EVLOOP_NONBLOCK means opal_event_loop() should NOT block on | ||
* waiting for active events. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about this on the issue, and this is not exactly what this flag is about, because for as long as there are new events the opal_progress never returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I adjusted the comment in the new revision. The point I wanted to make is that this flag only impact the behavior when there is no active events.
The comment for opal_progress_set_event_flags states that default opal_progress_event_flags is OPAL_EVLOOP_ONELOOP, which is not accurate, because OPAL_EVLOOP_ONELOOP has been removed in commit 33c3b71, and was replaced by OPAL_EVLOOP_NONBLOCK | OPAL_EVLOOP_ONCE. This patch fixed the comment about the default argument. It also fixed a typo in the argument section: "vlags" -> "flags" Signed-off-by: Wei Zhang <[email protected]>
I ran the following test to verify the patch: intel mpi benchmark IMB-MPI1 (with btl/tcp). osu micro benchmark (latency, bw, mbw_mr) applications like lammps (with the LJ, EAM test case) and hpcg. |
ompi/runtime/ompi_mpi_init.c
Outdated
@@ -393,7 +393,7 @@ static void evhandler_reg_callbk(pmix_status_t status, | |||
int ompi_mpi_init(int argc, char **argv, int requested, int *provided, | |||
bool reinit_ok) | |||
{ | |||
int ret; | |||
int ret, old_event_flags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will lead to compiler warnings when OPAL_ENABLE_PROGRESS_THREADS is not 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in new revision.
The comment indicates that this was done with the goal of improving the latency for non TCP cases. The only way for this to affect the latency would be in a setup with multiple recv posted. Did you notice any performance impact with your test ? |
Currently, ompi_mpi_init() call opal_progress_set_event_flag(OPAL_EVLOOP_NONBLOCK), whith the intention to ensure OPAL_EVLOOP_NONBLOCK is set in opal_progress_event_flag. However, this call will remove other existing flags (like OPAL_EVLOOP_ONCE) in opal_progress_event_flag, which can cause deadlock. This patch address the issue by adding OPAL_EVLOOP_NONBLOCK to that flag. Signed-off-by: Wei Zhang <[email protected]>
I tested was using btl/tcp using For For The following is
The following is
The following is
The following is
I will run same tests on btl/ofi (on EFA) to get data for non-tcp cases |
The patch looks good. Let's wait for the performance results with ofi before merging this. |
I ran the same latency, bw, allreduce (of 1152 ranks), allgather (of 1152 ranks) test on mtl/ofi, the conclusion is similiar to that for btl/tcp:
The follow is the data: osu_latency, data show in the table is 3 run average, number in the parenthesis is the standard deviation:
osu_bw, data show in the table is 3 run average, number in the parenthesis is the standard deviation:
data for allreduce (1152 ranks)
Data for allgather (1152 ranks)
|
addresses #9447