v4.0.x: usnic fixes and optimizations #7043

jsquyres · 2019-10-05T01:22:23Z

See individual commit messages for more details.

Rename "get_nsec()" to "get_ticks()" to more accurately reflect that this function has no correlation to wall clock time at all. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit ce2910a)

New MCA parameter: btl_usnic_ack_iteration_delay. Set this to the number of times through the usNIC component progress function before sending a standalone ACK (vs. piggy-backing the ACK on any other send going to the target peer). Use "ticks" language to clarify that we're really counting the number of times through the usNIC component DATA_CHANNEL completion check (to check for incoming messages) -- it has no relation to wall clock time whatsoever. Also slightly change the channel-checking scheme in usNIC component progress: only check the PRIORITY channel once (vs. checking it once, not finding anything, and then falling through the progress_2() where we check PRIORITY again and then check the DATA channel). As before, if our "progress" libevent fires, increment the tick counter enough to guarantee that all endpoints that need an ACK will get triggered to send standalone ACKs the next time through progress, if necessary. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 968b1a5)

Significantly increase the default retrans timeout. If the retrans timeout is too soon, we can end up in a retransmission storm where the logic will continually re-transmit the same frames during a single run through the usNIC progress function (because the timer for a single frame expires before we have run through re-transmitting all the frames pending re-transmission). Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 3cc95d8)

New MCA param: btl_usnic_max_resends_per_iteration. This is the max number of resends we'll do in a single pass through usNIC component progress. This prevents progress from getting stuck in an endless loop of retransmissions (i.e., if more retransmissions are triggered during the sending of retransmissions). Specifically: we need to leave the resend loop to allow receives to happen (which may ACK messages we have sent previously, and therefore cause pending resends to be moot). Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 27e3040)

Move the prefix area from the head to the body in relevant size computations. This fixes a problem in high traffic situations where usNIC may have sent from unregistered memory. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit fe7f772)

gpaulsen · 2019-10-07T19:16:35Z

Sorry, too late for v4.0.2, moving to v4.0.3

It was previously accidentally set to 0. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 132e4ca)

Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 3080033)

nrescobar

The changes look good.

Note: I am a coworker of Jeff's at Cisco.

jsquyres · 2019-10-15T22:32:23Z

@gpaulsen @hppritcha Can you approve and merge? I have no one left in the OMPI community who can review usNIC stuff, so I got a co-worker to review it for me (but he doesn't have write perms to the OMPI repo, so his review "doesn't count" in terms of github's requirements).

Thanks.

bwbarrett

I'm a little curious about the long description of making the timeout longer, and then the lack of description on making it shorter, but I assume you have your reasons and it's not wrong.

jsquyres · 2019-10-16T10:11:31Z

Yes, I probably could/should have made that explanation better.

The real reason was that I merged the commits on master before getting the code review. In the code review, it came out that it really wasn't necessary to increase the timeout -- 5 miliseconds is long enough already. The real reason for (effectively) that timeout being too short was the lack of a cap on retransmitting during a single iteration of usNIC progress. I.e., when sending many large messages to a very large number of peers, we could get into a situation where the retransmit queue got so long that it effectively took longer to retransmit everything than the retrans timers on the sends. Hence, by the time we finished retransmitting everything, at least some of the timers had already expired, which would then lead to immediately retransmitting again, ...etc. Meaning: the retrans timer wasn't the issue. The amount of work we allowed to happen during a single iteration of usNIC progress was the problem.

gpaulsen · 2019-10-18T18:58:57Z

Thanks @bwbarrett for reviewing.
@hppritcha and I haven't yet opened the v4.0.x for merging, but once we open that, we'll get this in.

hppritcha · 2019-10-19T13:51:44Z

Ready to merge imho

jsquyres added 5 commits October 4, 2019 16:47

btl/usnic: s/get_nsec/get_nticks/g

0839a9c

Rename "get_nsec()" to "get_ticks()" to more accurately reflect that this function has no correlation to wall clock time at all. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit ce2910a)

jsquyres added bug Target: v4.0.x labels Oct 5, 2019

jsquyres added this to the v4.0.2 milestone Oct 5, 2019

gpaulsen modified the milestones: v4.0.2, v4.0.3 Oct 7, 2019

jsquyres added 2 commits October 15, 2019 07:54

btl/usnic: set ack_iteration_delay default to 4

1565239

It was previously accidentally set to 0. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 132e4ca)

btl/usnic: set retrans_timeout back down to 5ms

c659282

Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 3080033)

nrescobar approved these changes Oct 15, 2019

View reviewed changes

bwbarrett approved these changes Oct 16, 2019

View reviewed changes

hppritcha merged commit 106109a into open-mpi:v4.0.x Oct 22, 2019

jsquyres deleted the pr/v4.0.x/usnic-fixes-and-optimizations branch July 17, 2022 12:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

v4.0.x: usnic fixes and optimizations #7043

v4.0.x: usnic fixes and optimizations #7043

Uh oh!

jsquyres commented Oct 5, 2019

Uh oh!

gpaulsen commented Oct 7, 2019

Uh oh!

nrescobar left a comment

Uh oh!

jsquyres commented Oct 15, 2019

Uh oh!

bwbarrett left a comment

Uh oh!

jsquyres commented Oct 16, 2019

Uh oh!

gpaulsen commented Oct 18, 2019

Uh oh!

hppritcha commented Oct 19, 2019

Uh oh!

Uh oh!

v4.0.x: usnic fixes and optimizations #7043

v4.0.x: usnic fixes and optimizations #7043

Uh oh!

Conversation

jsquyres commented Oct 5, 2019

Uh oh!

gpaulsen commented Oct 7, 2019

Uh oh!

nrescobar left a comment

Choose a reason for hiding this comment

Uh oh!

jsquyres commented Oct 15, 2019

Uh oh!

bwbarrett left a comment

Choose a reason for hiding this comment

Uh oh!

jsquyres commented Oct 16, 2019

Uh oh!

gpaulsen commented Oct 18, 2019

Uh oh!

hppritcha commented Oct 19, 2019

Uh oh!

Uh oh!