hook/prot: Connectivity Map #2825
198137b to 40346fb
bot:mellanox:retest
Some notes on the tunable options and output.
Output lower than or equal to
Output above
    }

    hostidprotptr = getenv("MPI_PROT_BRIEF");
    if (hostidprotptr) { hostidprotbrief = atoi(hostidprotptr); }
@markalle It looks like the `MPI_PROT_BRIEF` value is read but never used. Was the intention to force the "brief" output for all -prot output?
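For reference while reviewing the snippet above: `atoi` cannot distinguish an unset or malformed value from a legitimate `0`. Below is a minimal sketch of a stricter way to read an integer tunable such as `MPI_PROT_BRIEF`, using `strtol`. The helper names `parse_int_or` and `env_int_or` are hypothetical and not part of this PR, and falling back to a default on bad input is an assumption about the desired behavior.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not in the PR): parse a base-10 integer from text.
 * Returns fallback when text is NULL, empty, has trailing garbage,
 * or does not fit in an int. */
int parse_int_or(const char *text, int fallback)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0') {
        return fallback;
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' ||
        value < INT_MIN || value > INT_MAX) {
        return fallback;
    }
    return (int)value;
}

/* Hypothetical wrapper: read an integer tunable from the environment,
 * e.g. env_int_or("MPI_PROT_BRIEF", 0). */
int env_int_or(const char *name, int fallback)
{
    assert(name != NULL);
    return parse_int_or(getenv(name), fallback);
}
```

With something like this, `hostidprotbrief = env_int_or("MPI_PROT_BRIEF", 0);` would replace the two lines above, though the question of where the value is actually consumed still stands.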
@markalle @gpaulsen We need to review PR #1974
Also, there are some more details from face to face at the bottom of the face to face minutes
For what it's worth, the part they really want is 9a908f5b9bddd6675621c784c832441533cb4f73, which changes "unic" to "usnic".
The extra change, 8f775b1f3a1ca856791164f5b971a2de274f4733, was just for fun, because I thought it was neat to be able to fit larger tables on the screen.
Mark
----- Original message -----
From: Geoff Paulsen <[email protected]>
To: open-mpi/ompi <[email protected]>
Cc: Mark Allen/Dallas/IBM@IBMUS, Mention <[email protected]>
Subject: Re: [open-mpi/ompi] hook/prot: Connectivity Map (#2825)
Date: Thu, Jan 26, 2017 1:10 PM
Also, there are some more details from face to face at the bottom of the face to face minutes
I think we might be able to use something like PR #1974 to get the string associated with the component. However, I think we might need some discussion about how the community feels about the section of this PR where we get the list of components actually being used between rank pairs. Currently, we added a call to |
bot:lanl:retest
The LANL dlopen-disable build seems to have found a legit problem with this PR:
@hppritcha Yeah, you are right. We need to do some work on this PR anyway before it's ready to review. We'll note that error for fixing during the next revision.
Reports are that `--disable-dlopen` builds fail with this PR, but upon code inspection, it's unclear how the two can be related.
bot:lanl:retest
This PR needs some significant work before it's ready to merge, so I wouldn't worry too much about the CI until that work has started.
bot:lanl:retest
Now that the hook framework has been committed to |
* `-mca hook_prot_verbose VALUE`
  * General component verbosity
* `-mca hook_prot_enable_mpi_init BOOL`
  * Enable map display at the bottom of `MPI_Init`
* `-mca hook_prot_enable_mpi_finalize BOOL`
  * Enable map display at the top of `MPI_Finalize`
* `-mca hook_prot_platform_prot VALUE`
  * Alias environment variable: `MPI_PROT`
  * `1 : Same as -mca hook_prot_enable_mpi_init t`
  * `2 : Same as -mca hook_prot_enable_mpi_finalize t`

Signed-off-by: Joshua Hursey <[email protected]>
Normally we print a -prot table up to 16 hosts that looks like this, where 16 can be changed via MPI_PROT_MAX:

```
 host | 0    1    2    3    4    5    6    7    8
======|==============================================
    0 : shm  ib   ib   ib   ib   ib   ib   ib   ib
    1 : ib   shm  ib   ib   ib   ib   ib   ib   ib
    2 : ib   ib   self ib   ib   ib   ib   ib   ib
    3 : ib   ib   ib   self ib   ib   ib   ib   ib
    4 : ib   ib   ib   ib   self ib   ib   ib   ib
    5 : ib   ib   ib   ib   ib   self ib   ib   ib
    6 : ib   ib   ib   ib   ib   ib   self ib   ib
    7 : ib   ib   ib   ib   ib   ib   ib   self ib
    8 : ib   ib   ib   ib   ib   ib   ib   ib   self
```

This checkin reduces MPI_PROT_MAX to 12 but adds a shorter table output that looks like this:

```
 host | 0 1 2 3 4       8
======|====================
    0 : A C C C C C C C C
    1 : C A C C C C C C C
    2 : C C B C C C C C C
    3 : C C C B C C C C C
    4 : C C C C B C C C C
    5 : C C C C C B C C C
    6 : C C C C C C B C C
    7 : C C C C C C C B C
    8 : C C C C C C C C B
key: A == shm
key: B == self
key: C == ib
```

That is used from 13 up to 36 ranks (or 3*MPI_PROT_MAX).

Signed-off-by: Joshua Hursey <[email protected]>
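The thresholds and the letter key described in this commit message can be sketched as follows. This is illustrative only: the names are hypothetical, the behavior above 3*MPI_PROT_MAX is not described here and is represented by a placeholder value, and letters are simply handed out in the order the caller first asks for each transport name. The PR's actual ordering rule is not shown.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names; only the 1x and 3x thresholds come from the text. */
enum prot_table_kind { PROT_TABLE_FULL, PROT_TABLE_BRIEF, PROT_TABLE_OTHER };

/* Pick an output style from the table size and the MPI_PROT_MAX value. */
enum prot_table_kind prot_table_kind_for(int nhosts, int prot_max)
{
    assert(nhosts > 0 && prot_max > 0);
    if (nhosts <= prot_max) {
        return PROT_TABLE_FULL;       /* full names: shm, self, ib, ... */
    }
    if (nhosts <= 3 * prot_max) {
        return PROT_TABLE_BRIEF;      /* one letter per cell plus a key */
    }
    return PROT_TABLE_OTHER;          /* not specified in this message */
}

#define PROT_MAX_KEYS 26

/* Maps transport names to the letters A..Z used in the brief table. */
struct prot_key {
    const char *names[PROT_MAX_KEYS];
    int count;
};

/* Return the letter for name, assigning the next free letter on first use.
 * Returns '?' once all 26 letters are taken (assumption). The key stores the
 * pointer, so name must outlive the key. */
char prot_key_letter(struct prot_key *key, const char *name)
{
    int i;

    assert(key != NULL && name != NULL);
    for (i = 0; i < key->count; i++) {
        if (strcmp(key->names[i], name) == 0) {
            return (char)('A' + i);
        }
    }
    if (key->count == PROT_MAX_KEYS) {
        return '?';
    }
    key->names[key->count] = name;
    return (char)('A' + key->count++);
}
```

Registering `shm`, `self`, and `ib` in that order yields the `A`, `B`, `C` key shown in the example table. With MPI_PROT_MAX at 12, sizes 1 to 12 select the full table and 13 to 36 select the brief one.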
685176a to a760be1
The branch has been rebased onto master. The two commits represent the There is still work to do on this feature - so I'm keeping the WIP label on it. |
:retest:
@jjhursey are you looking to add this to the 3.0 release?
It still needs some work. I don't think it'll make it for v3.0, but should be ready for the release that follows.
The community decided on a call 2 weeks ago NOT to take this for v3.0, but to aim for v3.1.
Thanks, @gpaulsen. Apparently, I have no short term memory, hence writing everything down...
The IBM CI (PGI Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/73ca9ce0ceed7b37f7d60da3250d7259
This functionality was later re-implemented for the v5.0 timeframe in #5507.
hook
framework PR.