Document binding behavior (especially w.r.t. threads) #4845
Comments
??? how do you intend to do that?
@ggouaillardet you need to make sure that the main and progress threads are on the same core to get the best performance in the single-threaded case.
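(For illustration: a minimal sketch of that idea, assuming hwloc -- which Open MPI already uses for topology work -- rather than Open MPI's actual internals. The progress thread copies the spawning thread's current binding; note that if the main thread is only bound to a package, the recorded cpuset is the whole package, which is exactly the problem discussed below.)

```c
/* Sketch (assumes hwloc; not Open MPI's actual implementation):
 * a progress thread binds itself to wherever the main thread
 * was bound at spawn time. */
#include <hwloc.h>
#include <pthread.h>

static hwloc_topology_t topo;
static hwloc_cpuset_t   main_cpuset;

static void *progress_loop(void *arg)
{
    /* Bind this (progress) thread to the main thread's cpuset.
     * If the main thread was only bound to a package, this cpuset
     * is the whole package, so the two threads may still land on
     * different cores. */
    hwloc_set_cpubind(topo, main_cpuset, HWLOC_CPUBIND_THREAD);
    /* ... poll the network / drive progress here ... */
    return NULL;
}

int main(void)
{
    pthread_t tid;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* Record where the main thread is currently bound. */
    main_cpuset = hwloc_bitmap_alloc();
    hwloc_get_cpubind(topo, main_cpuset, HWLOC_CPUBIND_THREAD);

    pthread_create(&tid, NULL, progress_loop, NULL);
    pthread_join(tid, NULL);

    hwloc_bitmap_free(main_cpuset);
    hwloc_topology_destroy(topo);
    return 0;
}
```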
@artpol84 Do we really know that it has to be the same core? I'm wondering if it only needs to be (for example) the same L3 or L2 cache, or some other level above core. I believe your numbers indicated that the same NUMA domain wasn't sufficient - yes? But that leaves some room in between.
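(If cache-level proximity did turn out to be sufficient, a hypothetical variation on the sketch above -- assuming hwloc >= 2.0 object types -- could bind to the L3 domain enclosing a given core rather than to the core itself:)

```c
/* Hypothetical variation (assumes hwloc >= 2.0): bind the calling
 * thread to the cpuset of the L3 cache enclosing a given core,
 * leaving it free to float within that cache domain. */
hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
hwloc_obj_t l3   = hwloc_get_ancestor_obj_by_type(topo, HWLOC_OBJ_L3CACHE, core);
if (l3 != NULL) {
    hwloc_set_cpubind(topo, l3->cpuset, HWLOC_CPUBIND_THREAD);
}
```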
@rhc54 if an MPI task is bound to cores
@artpol84 well, in this case what I suggested has to be improved. As @jsquyres pointed out, there is "no right answer" here, and I am just suggesting an idea to improve the out-of-the-box performance.
@ggouaillardet I don't think it has to be more complicated. My concern was that you are implying that there is some global knowledge regarding progress threads, and I don't believe it exists. So I'm still a little puzzled as to how you know you are the nth progress thread, and therefore should go on a specific core.
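(To make the objection concrete, here is a hypothetical per-process placement scheme -- not anything Open MPI implements. The thread index k is purely local: nothing tells this process what other processes or libraries on the node have done with "their" k:)

```c
/* Hypothetical sketch: the k-th progress thread of this process calls
 * this to bind itself to the k-th core inside the process's allowed
 * cpuset. "k" is only meaningful within this process -- there is no
 * global registry of progress threads. */
static void bind_kth_progress_thread(hwloc_topology_t topo, int k)
{
    hwloc_cpuset_t allowed = hwloc_bitmap_alloc();

    /* Start from the cpuset the process as a whole is allowed to use
     * (e.g., the package it was bound to by mpirun). */
    hwloc_get_cpubind(topo, allowed, HWLOC_CPUBIND_PROCESS);

    int ncores = hwloc_get_nbobjs_inside_cpuset_by_type(topo, allowed,
                                                        HWLOC_OBJ_CORE);
    if (ncores > 0) {
        hwloc_obj_t core = hwloc_get_obj_inside_cpuset_by_type(
            topo, allowed, HWLOC_OBJ_CORE, k % ncores);
        /* Binds the calling thread only. */
        hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_THREAD);
    }
    hwloc_bitmap_free(allowed);
}
```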
This is precisely the problem, and one of the reasons why there is no good answer here: we're currently binding to package, so the "main" application thread(s) (MATs) can float anywhere in the package. So if binding the progress thread(s) in some kind of proximity to the MATs (smaller than a package) is necessary for performance, we have no way of knowing where the MATs will be. And even if we did, the MATs may move. And if we don't let the MATs move -- by binding them to something smaller than the package -- then we're going against the reason we expanded to bind-to-package (what used to be called "socket") in the first place: being friendly to MPI+OpenMP/THREAD_MULTIPLE applications.
One possible resolution might be through the PMIx OpenMP/MPI working group. We now have a method by which the MPI layer learns of the OpenMP layer becoming "active", and vice versa. So we will know that the app is multi-threaded, how many threads it intends to use, and what each side's desired binding looks like. When we get that info (which is when either side calls "init"), we could perhaps determine a binding pattern within the envelope given to us by the RM. There has even been discussion about making the worker thread pool "common" between the two sides, though that is strictly at the head-scratching phase.
Per discussion on the 2018-02-20 webex, and per #4799:
The general issue appears to be that since Open MPI binds to socket by default (for np>2), progress threads may not be located on the same core as the "main" thread(s). #4799 talks about this in the context of PMIx, but the issue actually exists for all progress threads in the MPI process.
The short version is that we agreed that the best way to move forward is to document the current behavior and provide information for people who want different behavior (e.g., enable binding to core). This probably entails:
Points made during the discussion: