osc/rdma: make locking code more robust #3045

Merged: 2 commits merged on Mar 31, 2017

@hjelmn hjelmn commented Feb 27, 2017

Under heavy load the locking code could fail if the underlying btl
module started to return OPAL_ERR_OUT_OF_RESOURCE on atomic
operations. This commit updates the code to gracefully handle btl
errors.

Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit 4707c7c)
Signed-off-by: Nathan Hjelm [email protected]
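
The failure mode described above, an atomic operation that transiently reports a resource shortage, is typically handled by driving progress and retrying rather than treating the error as fatal. The following is a minimal, self-contained sketch of that pattern, not the code from this pull request. Every name in it (`SKETCH_*`, `mock_atomic_cswap`, `mock_progress`, `lock_acquire_cswap`) is hypothetical, and the real BTL atomic interface is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for OPAL_SUCCESS and
 * OPAL_ERR_OUT_OF_RESOURCE; the real values live in Open MPI's headers. */
enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERR_OUT_OF_RESOURCE = -2
};

/* Mock transport state: the first `busy_calls` attempts report a
 * transient resource shortage, mimicking a transport under heavy load. */
static int busy_calls = 3;
static int progress_calls = 0;
static long lock_word = 0;

/* Mock compare-and-swap. On success, *result receives the previous
 * value of *target, and *target is updated only if it matched `compare`. */
static int mock_atomic_cswap(long *target, long compare, long value, long *result)
{
    if (busy_calls > 0) {
        --busy_calls;
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    *result = *target;
    if (*target == compare) {
        *target = value;
    }
    return SKETCH_SUCCESS;
}

/* Stand-in for a progress call that would let the transport complete
 * outstanding operations and free resources. */
static void mock_progress(void)
{
    ++progress_calls;
}

/* Retry the atomic operation while the transport is out of resources.
 * Any other return code is passed back to the caller unchanged. */
static int lock_acquire_cswap(long *target, long compare, long value, long *result)
{
    int rc;

    assert(target != NULL && result != NULL);

    do {
        rc = mock_atomic_cswap(target, compare, value, result);
        if (SKETCH_ERR_OUT_OF_RESOURCE == rc) {
            mock_progress();
        }
    } while (SKETCH_ERR_OUT_OF_RESOURCE == rc);

    return rc;
}
```

Calling `lock_acquire_cswap(&lock_word, 0, 1, &prev)` in this mock retries through the three simulated shortages and then performs the swap. A real implementation might also bound the number of retries; the sketch loops until the transport stops reporting the shortage.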

@hjelmn hjelmn added the bug label Feb 27, 2017
@hjelmn hjelmn added this to the v2.1.0 milestone Feb 27, 2017
@hjelmn hjelmn requested a review from regrant February 27, 2017 15:53

hjelmn commented Feb 27, 2017

@jsquyres Found this bug during stress testing.

@jsquyres jsquyres modified the milestones: v2.1.0, v2.1.1 Feb 27, 2017

@regrant regrant left a comment

This code looks fine, and solves the stated problem.

Signed-off-by: Nathan Hjelm <[email protected]>
(cherry picked from commit 032bcf9)
Signed-off-by: Nathan Hjelm <[email protected]>

hjelmn commented Mar 1, 2017

@jsquyres Added the commit to clean up the warnings. Good to go now.

@hppritcha hppritcha merged commit 88e139f into open-mpi:v2.x Mar 31, 2017
@artpol84 artpol84 mentioned this pull request Apr 1, 2017
4 participants