v5.0.x: btl: introduce flag MCA_BTL_FLAGS_RDMA_REMOTE_COMPLETION #9938

Merged: 3 commits, Feb 1, 2022

awlauria
Contributor

This patch introduces a new flag, MCA_BTL_FLAGS_RDMA_REMOTE_COMPLETION,
which indicates whether a BTL's RDMA/atomic operations support
remote completion.

btl/self, btl/ofi and btl/ugni support remote completion, so this patch
sets the flag on them.

Active message RDMA/atomic supports remote completion only under certain
conditions; this patch implements that logic.

Signed-off-by: Wei Zhang [email protected]
(cherry picked from commit 8f4cda3)

@awlauria awlauria added this to the v5.0.0 milestone Jan 28, 2022
@wzamazon
Contributor

Thank you! Can you also add #9695 and #9708?

wzamazon and others added 2 commits January 28, 2022 10:07
Active message RDMA uses btl_send to send the initial
request and the RDMA response.

btl_send returns 0 when the descriptor has been
successfully queued for sending, and returns 1 when
the descriptor has been successfully sent.

Currently, active message RDMA treats the return value
1 as an error, and so either returns that value
to the caller or retries the send.

This patch addresses the issue by correctly handling
the return value 1.

Signed-off-by: Wei Zhang <[email protected]>
(cherry picked from commit 7b177ce)
Static analysis found a few typos in 7b177ce: "&&" should have
been "&".

Fixes CIDs 1494439 and 1494440.

Signed-off-by: Jeff Squyres <[email protected]>
(cherry picked from commit 5ca231c)
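
To illustrate why this class of typo matters, here is a small sketch contrasting the two operators in a flag test. The expressions actually changed in the commit are not shown in this conversation, so the flag names and helpers below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits used only for this illustration. */
#define SKETCH_FLAG_A (UINT32_C(1) << 0)
#define SKETCH_FLAG_B (UINT32_C(1) << 1)

static_assert(SKETCH_FLAG_A != SKETCH_FLAG_B, "flags must be distinct bits");

/* Buggy: logical AND is true whenever both operands are nonzero,
 * regardless of which bits are actually set. */
static bool sketch_has_flag_buggy(uint32_t flags, uint32_t flag)
{
    return flags && flag;
}

/* Correct: bitwise AND isolates the requested bit. */
static bool sketch_has_flag(uint32_t flags, uint32_t flag)
{
    return (flags & flag) != 0;
}
```

With only `SKETCH_FLAG_A` set, the buggy version still reports `SKETCH_FLAG_B` as present, while the bitwise version does not.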
@awlauria
Contributor Author

@wzamazon thanks! I pushed them to this PR.

@awlauria awlauria merged commit 3eadd81 into open-mpi:v5.0.x Feb 1, 2022
@awlauria awlauria deleted the btl_remote_completion_v5.0.x branch February 1, 2022 00:21