-
Notifications
You must be signed in to change notification settings - Fork 902
coll/han: Fix issue in reduce leading to invalid rbuf pointer #12434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Alternative fix: diff --git a/ompi/mca/coll/han/coll_han_reduce.c b/ompi/mca/coll/han/coll_han_reduce.c
index a208b41d78..d4facdf26c 100644
--- a/ompi/mca/coll/han/coll_han_reduce.c
+++ b/ompi/mca/coll/han/coll_han_reduce.c
@@ -176,7 +176,7 @@ mca_coll_han_reduce_intra(const void *sbuf,
t->sbuf = (char *) t->sbuf + extent * t->seg_count;
}
- if (up_rank == root_up_rank) {
+ if (up_rank == root_up_rank && low_rank == root_low_rank) {
t->rbuf = (char *) t->rbuf + extent * t->seg_count;
}
t->cur_seg = t->cur_seg + 1;
|
ompi/mca/coll/han/coll_han_reduce.c
Outdated
@@ -176,7 +176,7 @@ mca_coll_han_reduce_intra(const void *sbuf, | |||
t->sbuf = (char *) t->sbuf + extent * t->seg_count; | |||
} | |||
|
|||
if (up_rank == root_up_rank) { | |||
if (up_rank == root_up_rank && NULL != t->rbuf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm should this be if (root_low_rank == low_rank)
instead?
Reading L142-L148, only the root and node leaders should have valid rbuf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say no, because the root has a fully sized rbuf, while non-root node leaders only have a temporary buffer that fits two segments (L122, L146). But what we could do is if (up_rank == root_up_rank && low_rank == root_low_rank)
. Or if (w_rank == root)
; should also be equivalent. These might indeed be more in tune with the rest of the code (opinions welcome).
Let me take another look at the code, perhaps one or two touchups in related spots would be a good idea, to make everything a bit clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up_rank == root_up_rank
is true only for the root of the original reduce call, so the only process that has a user provided output buffer. Thus, it makes sense to move to base pointer as we advance in the segments.
For all leaders, aka. processes that become root for the local reductions, with the exception of the original root we only allocated a temporary buffer with size 2*extent*seg_count
, so there is no room to move the pointer around (more than the double buffering in mca_coll_han_reduce_t1_task
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up_rank == root_up_rank
is true only for the root of the original reduce call
-> For the global root, and for the ranks in the same node as it, right?
Alright all done with the second look. I'm warming up to adjusting the PR in favor of this: @@ -176,7 +182,7 @@ mca_coll_han_reduce_intra(const void *sbuf,
t->sbuf = (char *) t->sbuf + extent * t->seg_count;
}
- if (up_rank == root_up_rank) {
+ if (low_rank == root_low_rank && up_rank == root_up_rank) {
t->rbuf = (char *) t->rbuf + extent * t->seg_count;
}
t->cur_seg = t->cur_seg + 1; Also, RE possible clearing up of related spots, here is one suggestion (I don't feel too strongly about it): @@ -139,12 +139,18 @@ mca_coll_han_reduce_intra(const void *sbuf,
"[%d]: root_low_rank %d root_up_rank %d\n", w_rank, root_low_rank,
root_up_rank));
- void *tmp_rbuf = rbuf;
+ void *tmp_rbuf = NULL;
void *tmp_rbuf_to_free = NULL;
- if (low_rank == root_low_rank && root_up_rank != up_rank) {
- /* allocate 2 segments on node leaders that are not the global root */
- tmp_rbuf = malloc(2*extent*seg_count);
- tmp_rbuf_to_free = tmp_rbuf;
+ /* node leaders require a buffer to store reduction results */
+ if(low_rank == root_low_rank) {
+ if(up_rank == root_up_rank) {
+ /* the global root already has one */
+ tmp_rbuf = rbuf;
+ } else {
+ /* allocate 2 temporary segments on node leaders that are not the global root */
+ tmp_rbuf = malloc(2*extent*seg_count);
+ tmp_rbuf_to_free = tmp_rbuf;
+ }
}
/* Create t0 tasks for the first segment */ This is supposed to:
Please let me know what you think of the above, and I'll adjust the PR (or leave it as is). |
Thanks for looking closely. Just 2 small suggestions:
|
Improve handling of the rbuf pointer. Previously, it was possible for the pointer to become non-NULL, even though it does not point to valid memory. This becomes a problem for the assurances that open-mpi#11552 tries to introduce. Signed-off-by: George Katevenis <[email protected]>
62f7a8c
to
0f0727e
Compare
I agree on w_rank being cleaner. I updated the PR to use this approach. I also adjusted tmp_rbuf's initialization to match this, and added error checking for the allocation. Please re-review. |
Running AWS CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change passed AWS CI.
This adjustment fixes the problem outlined in #11650. I looked through the other collectives in HAN and this seems to be the only spot where this particular discrepancy appears.
For non-node-leader ranks,
t->rbuf
will not point to valid memory (as far as we know). In the current code, however it's incremented (for non-leader ranks in the same node as the global root) for each task, as if valid.This has not been a problem until now because this pointer is not actually utilized anywhere, but it will be when #11552 is merged. The reason is that components downstream from HAN, like XHC, are allowed to rely on the passed rbuf pointer being either
NULL
or pointing to a fully valid buffer.Fixes #11650.
cc @bosilca.