v4.0.x: Fix oob_tcp tcp_component_close segfault with active listeners #6800
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
oob_tcp in non-HNP mode shares libevent event_base with oob_base [1].
orte_oob_base_close calls:
(1) oob_tcp component_shutdown, then
(2) opal_progress_thread_finalize, then
(3) oob_tcp tcp_component_close [2].
opal_progress_thread_finalize calls tracker_destructor [3] that frees the
event_base [4]. If any oob_tcp event listeners are active at this time, oob_tcp
will crash trying to delete them at [5] [6].
This change moves oob_tcp event listener cleanup from component_close to
component_shutdown so that it happens before the event_base is freed.
[1] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_listener.c#L160
[2] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/base/oob_base_frame.c#L95
[3] https://github.com/open-mpi/ompi/blob/v4.0.1/opal/runtime/opal_progress_threads.c#L232
[4] https://github.com/open-mpi/ompi/blob/v4.0.1/opal/runtime/opal_progress_threads.c#L65
[5] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_component.c#L192
[6] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_listener.c#L955
Signed-off-by: Orivej Desh [email protected]
(cherry picked from commit 78b7e34)
Refs #6796