Skip to content

Update MCA mutexes to use the qthreads ULT backend #10832

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 18, 2022

Conversation

janciesko
Copy link
Contributor

  • This replaces the use of opal atomics with the use of qthread's atomics in the opal/thread MCA if ompi is configured with qthreads.
  • This resolves the deadlock reported originating from expecting recursive locks to work (List of issues with OMPI master+ULT+PartCom #10459).
  • Requires Qthreads version >=1.18 as we added support for recursive locks in that version. I will issue a PR for the configury changes to version check.

Signed-off-by: Jan Ciesko [email protected]

@ompiteam-bot
Copy link

Can one of the admins verify this patch?

@gpaulsen
Copy link
Member

ok to test

@gpaulsen
Copy link
Member

@janciesko Would you be able to rebase your branch on main somewhere after
7dbfbeea - build: Use open-mpi/oac for oac submodule commit? We're having an issue with the IBM CI when it tries to test a Pull Request that doesn't include that commit.

@jsquyres
Copy link
Member

@gpaulsen
Copy link
Member

bot:ibm:retest

@janciesko
Copy link
Contributor Author

@gpaulsen, rebased.

@jsquyres
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jsquyres
Copy link
Member

../../opal/mca/threads/pthreads/threads_pthreads_mutex.h: In function 'opal_thread_internal_mutex_lock':
../../opal/mca/threads/pthreads/threads_pthreads_mutex.h:113:9: error: implicit declaration of function 'opal_show_help' [-Werror=implicit-function-declaration]
         opal_show_help("help-opal-threads.txt", "mutex lock failed", true);

You need to #include "opal/util/show_help.h".

@janciesko
Copy link
Contributor Author

janciesko commented Sep 23, 2022

I think we do have that:

#include "opal/util/show_help.h"

Is the from the Mellanox CI?
Missing for pthreads. Updating.

@janciesko
Copy link
Contributor Author

Added missing include.

@janciesko janciesko requested a review from jsquyres September 26, 2022 20:38
@jsquyres
Copy link
Member

bot:aws:retest

1 similar comment
@bwbarrett
Copy link
Member

bot:aws:retest

@bwbarrett
Copy link
Member

The failure from the Pull Request Build Checker is real, but caused by a bad commit to the main branch. If you push a rebase on top of main, it should go away.

@janciesko
Copy link
Contributor Author

Rebased to today's main. Thanks @bwbarrett.

@awlauria
Copy link
Contributor

is this good to go or waiting on something?

@janciesko
Copy link
Contributor Author

LGTM

@awlauria awlauria merged commit b55e183 into open-mpi:main Oct 18, 2022
janciesko pushed a commit to janciesko/ompi that referenced this pull request Oct 18, 2022
Update MCA mutexes to use the qthreads ULT backend
(cherry picked from commit b55e183)
janciesko pushed a commit to janciesko/ompi that referenced this pull request Oct 18, 2022
Update MCA mutexes to use the qthreads ULT backend
(cherry picked from commit b55e183)

Signed-off-by: Jan Ciesko [email protected]
janciesko pushed a commit to janciesko/ompi that referenced this pull request Oct 18, 2022
Update MCA mutexes to use the qthreads ULT backend
(cherry picked from commit b55e183)

Signed-off-by: Jan Ciesko <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants