v5.0.x: coll/base: Fix the error handling in couple of collectives. #8934

Merged
merged 1 commit into from
May 10, 2021

Conversation

zhngaj
Contributor

@zhngaj zhngaj commented May 6, 2021

In the error handling of ompi_coll_base_alltoallv_intra_basic_linear,
the status of an ompi request could be MPI_SUCCESS, and that value could
be returned instead of a real error code.

Add a check against the ompi request's req_status.MPI_ERROR to ensure
that a real error code is returned.

Other collectives (alltoall, barrier, bcast, gather, reduce, scatter)
had similar error handling, so apply the same change to them as well.

Signed-off-by: Jie Zhang [email protected]
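
The check described above can be illustrated with a minimal, self-contained sketch. It is not the code from this patch: the `sk_` types, the `SK_` constants and their numeric values, and the helper `sk_pick_error` are hypothetical stand-ins so the example compiles without the Open MPI headers. Only the field name `req_status.MPI_ERROR` is taken from the description. Skipping requests that are still pending is an additional assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in constants for this sketch only. They are NOT the Open MPI
 * definitions, and the numeric values are arbitrary. */
enum {
    SK_SUCCESS       = 0,
    SK_ERR_TRUNCATE  = 15,
    SK_ERR_IN_STATUS = 18,
    SK_ERR_PENDING   = 19
};

/* Stand-in request type that mirrors only the field named in the
 * description: req_status.MPI_ERROR. */
typedef struct { int MPI_ERROR; } sk_status_t;
typedef struct { sk_status_t req_status; } sk_request_t;

/* Hypothetical helper: when the wait call only reports "look at the
 * per-request statuses", return the first status that is a real error.
 * Requests that completed with SK_SUCCESS are skipped, so a success
 * code is never reported as the cause of the failure. */
static int sk_pick_error(int err, sk_request_t *const *reqs, size_t nreqs)
{
    assert(reqs != NULL || nreqs == 0);

    if (SK_ERR_IN_STATUS != err) {
        return err;               /* already a specific error code */
    }
    for (size_t i = 0; i < nreqs; i++) {
        if (NULL == reqs[i]) {
            continue;             /* request slot was never filled */
        }
        int req_err = reqs[i]->req_status.MPI_ERROR;
        if (SK_ERR_PENDING == req_err) {
            continue;             /* assumption: pending is not a final error */
        }
        if (SK_SUCCESS != req_err) {
            return req_err;       /* first real error code wins */
        }
    }
    return err;                   /* nothing more specific was found */
}
```

Without the `SK_SUCCESS != req_err` test, a loop that simply took the status of the first non-null request could hand a success code back to the caller even though the collective had failed, which is the situation the description refers to.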

@ompiteam-bot

Can one of the admins verify this patch?

@bwbarrett
Member

ok to test

@bwbarrett
Member

@zhngaj Please update this commit with the cherry-pick -x similar to the 4.1 branch.

In the error handling of ompi_coll_base_alltoallv_intra_basic_linear,
the status of an ompi request could be MPI_SUCCESS, and that value could
be returned instead of a real error code.

Add a check against the ompi request's req_status.MPI_ERROR to ensure
that a real error code is returned.

Other collectives (alltoall, barrier, bcast, gather, reduce, scatter)
had similar error handling, so apply the same change to them as well.

Signed-off-by: Jie Zhang <[email protected]>
(cherry picked from commit a7613cd)
@zhngaj zhngaj force-pushed the v5.0.x-err-handling-fix branch from 19bb15d to 10b29c5 Compare May 6, 2021 20:39
@zhngaj
Contributor Author

zhngaj commented May 6, 2021

> @zhngaj Please update this commit with the cherry-pick -x similar to the 4.1 branch.

Done.

@awlauria awlauria added this to the v5.0.0 milestone May 6, 2021
@awlauria awlauria merged commit e333d4e into open-mpi:v5.0.x May 10, 2021
4 participants