Skip to content

v3.1.x: UCX osc: properly release exclusive lock to avoid lockup #6972

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 6, 2020

Conversation

devreal
Copy link
Contributor

@devreal devreal commented Sep 11, 2019

Cherry-pick of #6933 to v3.1.x

See #6931 for the issue this fixes.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit a5cc380)

Signed-off-by: Joseph Schuchart <[email protected]>
(cherry picked from commit a5cc380)
@hppritcha
Copy link
Member

cori may need to be taken off line - try again
bot:ompi:retest

@devreal
Copy link
Contributor Author

devreal commented Oct 8, 2019

Can we get this into the next v3.1.x release? Should it cherry-picked to 3.0.x as well?

@jsquyres
Copy link
Member

jsquyres commented Oct 8, 2019

This wasn't labeled or milestoned, so I didn't see it until now...

Needs a review (ASAP).

@jsquyres jsquyres changed the title UCX osc: properly release exclusive lock to avoid lockup v3.1.x: UCX osc: properly release exclusive lock to avoid lockup Oct 8, 2019
@jsquyres jsquyres added this to the v3.1.5 milestone Oct 8, 2019
@devreal
Copy link
Contributor Author

devreal commented Oct 9, 2019

@jsquyres I cannot assign people or tags here.

@artpol84 @janjust can you please review? The PR to 4.0.x was merged a while ago: #6934

@artpol84
Copy link
Contributor

@devreal will do this week.

@jsquyres jsquyres requested review from artpol84 and janjust October 17, 2019 19:07
@gpaulsen
Copy link
Member

@artpol84 please review.

@artpol84
Copy link
Contributor

Will do today, sorry for the delay

Copy link
Contributor

@artpol84 artpol84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
I wonder if after this change we should revisit the shared lock acquisition link.
Instead of looping over

taken = 0;
while(!taken){
    val = atomic_inc(var);
    if( val > EXCLUSIVE ) {
        atomic_dec(var);
    } else {
        taken = 1;
    }
}

we can now

val = atomic_inc(var);
while( val > EXCLUSIVE ) {
    val = atomic_read(val); // or just read
}

Need to see if this will be faster.

@artpol84
Copy link
Contributor

@yosefe, what do you think regarding the next step above?

@bwbarrett bwbarrett modified the milestones: v3.1.5, v3.1.6 Nov 26, 2019
@jsquyres jsquyres merged commit 7f7267c into open-mpi:v3.1.x Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants