
osc/pt2pt: Previous locks must complete in order #2593


Closed

Conversation

jjhursey
Member

To prevent deadlock, force all previously requested locks to complete before starting a new lock. Otherwise this can lead to deadlock if the locks are processed in arbitrary order.

 * To prevent deadlock, force all previously requested locks to complete
   before starting a new lock. Otherwise this can lead to deadlock if
   the locks are processed in arbitrary order.

Signed-off-by: Mark Allen <[email protected]>
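A conceptual sketch of the ordering rule described above (this is not the actual osc/pt2pt code; the struct and helper names below are hypothetical and used only to illustrate the idea of serializing lock acquisition in request order):

    /* Conceptual illustration only -- not the real osc/pt2pt implementation.
     * Idea: a new passive-target lock request is not started until every
     * previously requested lock has completed, so locks are acquired in
     * the order they were requested. */
    struct lock_state {
        int outstanding;   /* locks requested but not yet completed;
                              decremented elsewhere when a lock completes */
    };

    /* Hypothetical stand-ins for the component's progress loop and
     * lock-request machinery. */
    void progress(struct lock_state *s);
    void send_lock_request(struct lock_state *s, int target, int lock_type);

    void start_lock(struct lock_state *s, int target, int lock_type)
    {
        /* Force all previously requested locks to complete before
         * starting a new one. */
        while (s->outstanding > 0) {
            progress(s);
        }
        s->outstanding++;
        send_lock_request(s, target, lock_type);
    }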
@jjhursey jjhursey added this to the v2.0.2 milestone Dec 16, 2016
@jjhursey jjhursey requested a review from hjelmn December 16, 2016 18:34
This was referenced Dec 16, 2016
@jjhursey
Member Author

Per discussion on the teleconf today: this is somewhat an MPI standard interpretation issue. Should the MPI library be responsible for avoiding deadlock in this scenario, or should the application?

@markalle has a small reproducer test case that we can probably share to help the discussion.

Until we reach a decision I'm going to mark this as do-not-merge.

@markalle
Contributor

markalle commented Dec 20, 2016

Here's a gist with a test case that acquires multiple locks in the same order, at least from the lock requester's perspective. If the lock requests are simply sent out simultaneously and can be satisfied in any order, this will deadlock.

https://gist.github.com/markalle/5a1cd53da9a1fa12f06e2c3e42951beb

If the above is declared illegal, I think quite a few smaller cases of overlap would become illegal by implication. Even if no individual rank pair overlapped by more than one lock, you could still deadlock with "0 locks a b, 1 locks b c, 2 locks a c" if 0 gets a first, 1 gets b first, and, due to the non-ordering of the requests, 2 gets c first.

I think you'd end up with some sort of weird looking transitive property if you tried to define the allowed overlap cases.
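A minimal sketch of the pattern being discussed (this is not the linked gist; the rank roles and buffer setup are simplified assumptions, with ranks 0 and 1 as lock targets and ranks 2 and up as requesters):

    /* Every requesting rank takes exclusive locks on the same targets in
     * the same order.  If the simultaneously issued lock requests may be
     * granted in any order, two requesters can each win a different lock
     * first and then wait on each other: deadlock. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 0;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Win_create(&buf, sizeof(buf), sizeof(buf), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        if (rank >= 2) {   /* ranks 2..N-1 are the lock requesters */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);   /* lock "a" */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);   /* lock "b" */
            MPI_Win_unlock(1, win);
            MPI_Win_unlock(0, win);
        }

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }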

@hppritcha
Member

moving to 2.0.3

@hppritcha hppritcha modified the milestones: v2.0.3, v2.0.2 Dec 24, 2016
@hjelmn
Member

hjelmn commented Jan 12, 2017

Will review this later today.

@jjhursey
Member Author

@markalle @hjelmn Where did you all come down on this issue?

@jsquyres
Member

jsquyres commented Feb 6, 2017

@markalle @hjelmn Where are we on this PR?

@hppritcha
Member

Has anyone tried this test with MPICH using either the ch3 or ch4 device? @jjhursey thinks this passed with Platform MPI. @markalle, could you try this test with an MPICH variant?

@jsquyres
Member

@markalle @hjelmn @jjhursey 🏓

@jjhursey jjhursey removed this from the v2.0.3 milestone Mar 6, 2017
@jjhursey
Member Author

jjhursey commented Mar 6, 2017

@markalle Did you ever follow up with the MPI Forum on the required semantics here?

@markalle
Contributor

I didn't open a forum topic, but I did find the part of the standard that made me question the original test:

Distinct access epochs for win at the same process must be disjoint. On the other hand, epochs pertaining to different win arguments may overlap.

My first test, multiple_locks.c, violates this statement, but it's easy to modify the test to comply with the above and still hit the deadlock this pull request is about:
https://gist.github.com/markalle/a9d461129f8bec665b9b203681a4bb0a
This version, multiple_locks2.c, has each epoch on a different window, so I can't see any standards argument against it.

Basically I just changed the test from

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, /* rank (peer) */ 1, /* assert */ 0, win);
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, /* rank (peer) */ 2, /* assert */ 0, win);

to

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, /* rank (peer) */ 1, /* assert */ 0, win[1]);
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, /* rank (peer) */ 2, /* assert */ 0, win[2]);

I didn't test MPICH, but I did try osc=rdma and it passes.

Philosophically, I don't think you can say "it's the user's responsibility to not deadlock" with the current code. The standard explicitly allows overlapping access epochs if they're on different windows, and it makes no statement about this not applying to passive target, so it should be legal to hold multiple locks. But there's no way to correctly code multiple locks to avoid deadlock when it's implemented this way.
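A minimal sketch of the multi-window variant being argued here (this is not multiple_locks2.c itself; the target ranks and window indexing are simplified assumptions): each exclusive lock is taken on its own window, so the access epochs at the requesting process involve different win arguments and no longer overlap on the same window.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, buf0 = 0, buf1 = 0;
        MPI_Win win[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Win_create(&buf0, sizeof(buf0), sizeof(buf0), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win[0]);
        MPI_Win_create(&buf1, sizeof(buf1), sizeof(buf1), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win[1]);

        if (rank >= 2) {   /* ranks 2..N-1 request the locks */
            /* One exclusive lock per window: each access epoch is on a
             * distinct win argument, which the standard permits to overlap. */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win[0]);
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win[1]);
            MPI_Win_unlock(1, win[1]);
            MPI_Win_unlock(0, win[0]);
        }

        MPI_Win_free(&win[0]);
        MPI_Win_free(&win[1]);
        MPI_Finalize();
        return 0;
    }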

@ibm-ompi

The IBM CI (PGI Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/d1fd2b3422fc00176a2b3840dd404050

@markalle
Contributor

@jjhursey can you make it retest the PGI part? I wasn't able to see the failure there.

Regarding the testcase addressing the MPI-standard question, I had to update the test again:
https://gist.github.com/markalle/5fb3f8de41d36645e05ba7492df3b4c3

Anyway, I still think this has to be legal code. The standard specifically says multiple access epochs can overlap (on different windows). But without the ordering guarantee that this commit adds, I think it's impossible for an app to correctly code a deadlock-free acquisition of two locks A and B.

@jjhursey
Member Author

bot:ibm:pgi:retest

@markalle
Contributor

"Pull Request Build Checker" is saying "failed". Where does the information hide about what it's saying failed?

@jsquyres
Member

@markalle Here's how you dig into the PRBC:

  • Click on the "Details" to the right
    • Remember that the PRBC fronts several different builds across many different OSs / platforms
    • It's basically manifested as a 2-level system
  • Scroll alllllll the way down to the bottom
  • You'll see several entries at the bottom; at least one will be red
  • Click on the "Build #xxx" for the red entry (do NOT click on the "console output" for it!)
  • Again, scroll allllll the way down to the bottom
  • There will be several more configs / platforms listed; at least one of them will be red
  • Click on the red one
  • Now click on "Console output"
    • This is the console output of that specific failed build

Repeat for any / all of the red / failed configs and/or individual builds.

@markalle
Contributor

Thanks, I never would have found those. Anyway, I don't think these failures are real, or at least they're not related to this check-in. Guess I'll hit retest again?

  • freebsd 11.0 : "opal_pmix_base_select failed .. It looks like orte_init failed for some reason; your parallel process is likely to abort .."
  • OS-X : "git fetch --tags --progress https://github.com/open-mpi/ompi/ +refs/heads/:refs/remotes/origin/ +refs/pull/:refs/remotes/origin/pr/
    FATAL: java.nio.channels.ClosedChannelException"

@markalle
Contributor

bot:ibm:pgi:retest

@markalle
Contributor

retest

@markalle
Contributor

bot:retest

@jjhursey
Member Author

@markalle Do we need this PR any more?

@jjhursey
Member Author

This PR has grown stale. We can reopen it if we need it later. @markalle If the problem is still present, let's make sure there is an issue to track progress.

@jjhursey jjhursey closed this Oct 10, 2017
@jjhursey jjhursey deleted the topic/osc-pt2pt-lock-ordering branch October 10, 2017 15:33