Skip to content

Move yield capability to opal thread component #8037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 23, 2021

Conversation

devreal
Copy link
Contributor

@devreal devreal commented Sep 7, 2020

This PR is based on a discussion at #6433 and aims to introduce alternative yielding mechanisms using usleep and nanosleep, to be chosen at runtime through MCA parameters. In doing so, it moves the yielding implementation into the opal thread components and thus provide the correct yielding capabilities for Argobots/Qthreads.

This adds two new mca parameters:
threads_pthreads_yield_strategy to choose the strategy (sched_yield and nanosleep, pthreads only), and
threads_pthreads_nanosleep_time (time passed to nanosleep, pthreads only)

A thread component may also signal that yield-when-idle should be the default. This is true for Argobots/Qhtreads because they are cooperatively scheduled and thus have to rely on MPI to yield the underlying thread and allow for all communication operations to be initiated eventually.

Marking this as WIP as I am unable to properly test the Argobots/Qthreads integration at this time, see #8036. I still wanted to put it out for people to comment and in case someone else was following the discussion linked above.

Fixes #7702

Signed-off-by: Joseph Schuchart [email protected]

@ibm-ompi
Copy link

ibm-ompi commented Sep 7, 2020

The IBM CI (GNU/Scale) build failed! Please review the log, linked below.

Gist: https://gist.github.com/a2564f8c8ba9bdb2332b6d1fe978e3b7

@ibm-ompi
Copy link

ibm-ompi commented Sep 7, 2020

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/949782597258de0c7ad48b8833f04858

@ibm-ompi
Copy link

ibm-ompi commented Sep 7, 2020

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/6fd4bdb517db3a0833ed3c590fa23024

@devreal devreal force-pushed the opal-thread-yield branch 2 times, most recently from 49fb30c to ae8b5f9 Compare September 8, 2020 07:46
@rhc54
Copy link
Contributor

rhc54 commented Sep 8, 2020

@devreal I suspect you are aware of this, but just as a reminder: calling something via an MCA interface costs about 5ns. So doing that to execute a 1ns nanosleep doesn't make a lot of sense. I realize that the point of the function call is to force a context change, but I'm just pointing out that the time cost in the opal_progress critical path is non-zero.

@devreal
Copy link
Contributor Author

devreal commented Sep 8, 2020

I think it boils down to a direct function call. Given that all pthread options either require a syscall or at least activate some low-level user-space thread scheduler the added function call should be negligible. I agree that a 1ns sleep is unrealistic but I wasn't sure what a good value actually is so this was the conservative choice. In part that is why it's configurable :) For Argobots and Qthreads the calls to opal_thread_yield can be inlined since they are directly dispatching to the ULT libraries.

@devreal devreal force-pushed the opal-thread-yield branch 3 times, most recently from c5f3e3d to 6610871 Compare September 9, 2020 11:52
@devreal
Copy link
Contributor Author

devreal commented Sep 9, 2020

I moved some around some of the pthread-specific code to reduce the number of non-static symbols and address the comments above. I also made the threads_pthreads_yield_strategy mca parameter an enum to catch invalid values and allow for text-based selection.

@bosilca
Copy link
Member

bosilca commented Sep 9, 2020

We end up doing a function call into a function call, for nothing. Why not having opal_thread_yield the function pointer that gets called only if non NULL, and then the threading layer set it to whatever it needs ?

@devreal
Copy link
Contributor Author

devreal commented Sep 9, 2020

I think we should avoid function pointers where possible because indirect calls are more costly and prevent some optimizations (sibling call optimization and LTO, for example). Right now, opal_thread_yield is marked static inline so I expect compilers to directly issue calls to the underlying yield function where possible (Argobots, Qthreads) and the function pointer for pthreads. We could make it a macro but I don't think it makes a difference.

@bosilca
Copy link
Member

bosilca commented Sep 9, 2020

OMPI embraced function pointers long ago, we evaluated the approach and it was well performing.

@bwbarrett
Copy link
Member

We end up doing a function call into a function call, for nothing. Why not having opal_thread_yield the function pointer that gets called only if non NULL, and then the threading layer set it to whatever it needs ?

It would be cheaper to have an empty function than an if and a call... function pointers are cheap(ish). Branches in front of function pointers, not so much.

@hjelmn
Copy link
Member

hjelmn commented Sep 10, 2020

Is there a good reason to have both usleep and nanosleep? From what I remember nanosleep is considered superior to usleep in most respects.

@devreal
Copy link
Contributor Author

devreal commented Sep 17, 2020

None in particular, other than giving people options to experiment with. After looking around the internet it looks like nanosleep is the way to go over usleep now. I guess we could remove usleep.

@devreal
Copy link
Contributor Author

devreal commented Sep 22, 2020

I removed the usleep option and will squash the commit before merge (kept it separate in case someone still wants it to be there)

@lanl-ompi
Copy link
Contributor

Can one of the admins verify this patch?

@devreal
Copy link
Contributor Author

devreal commented Nov 19, 2020

Ping :) Could someone please review and let me know if this needs more changes?

This adds two new mca parameters for the pthreads component:
threads_pthreads_yield_strategy to choose the strategy (valid values: sched_yield or nanosleep),
threads_pthreads_nanosleep_time (time passed to nanosleep)

A thread component may also signal that yield-when-idle should be the default (used for Argobots and Qthreads)

Signed-off-by: Joseph Schuchart <[email protected]>
@devreal
Copy link
Contributor Author

devreal commented Nov 20, 2020

I squashed and updated the commit messages. @rajachan We might want to have that in v4.1.x as this fixes issues with the combination of Argobots/Qthreads and the UCX integration (see #7702).

@devreal
Copy link
Contributor Author

devreal commented Nov 20, 2020

Oops, I realized that the Argobots integration is not part of 4.1.x. We don't need to port this one back then.

@jjhursey
Copy link
Member

Looks like some results from IBM CI failed to report back. Let's re-run.
bot:ibm:retest

@hjelmn
Copy link
Member

hjelmn commented Jan 25, 2021

👍 We can probably replace the release CPU function in opal_lifo (https://github.com/open-mpi/ompi/blob/master/opal/class/opal_lifo.h#L92) with this.

@hppritcha
Copy link
Member

Any objections to merging this PR?

@yasushi-saito
Copy link
Contributor

Can we get this merged? We really want this feature in

@devreal devreal merged commit 46a4bc3 into open-mpi:master Feb 23, 2021
@devreal devreal deleted the opal-thread-yield branch October 3, 2022 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Argobots integration vs UCX PML
10 participants