Commit ac98b32

Fix alltoallv with inplace
Signed-off-by: Mikhail Brinskii <[email protected]>
1 parent 1233c38 commit ac98b32

1 file changed: +10 -4 lines changed

ompi/mca/coll/base/coll_base_alltoallv.c

Lines changed: 10 additions & 4 deletions
@@ -43,7 +43,7 @@
 /*
  * We want to minimize the amount of temporary memory needed while allowing as many ranks
  * to exchange data simultaneously. We use a variation of the ring algorithm, where in a
- * single step a process echange the data with both neighbors at distance k (on the left
+ * single step a process exchange the data with both neighbors at distance k (on the left
  * and the right on a logical ring topology). With this approach we need to pack the data
  * for a single of the two neighbors, as we can then use the original buffer (and datatype
  * and count) to send the data to the other.
@@ -58,16 +58,22 @@ mca_coll_base_alltoallv_intra_basic_inplace(const void *rbuf, const int *rcounts
     ptrdiff_t extent;
     ompi_request_t *req = MPI_REQUEST_NULL;
     char *tmp_buffer;
-    size_t packed_size = 0, max_size;
+    size_t packed_size = 0, max_size, type_size;
     opal_convertor_t convertor;
 
     /* Initialize. */
 
     size = ompi_comm_size(comm);
     rank = ompi_comm_rank(comm);
+    ompi_datatype_type_size(rdtype, &type_size);
 
-    ompi_datatype_type_size(rdtype, &max_size);
-    max_size *= rcounts[rank];
+    for (i = 0, max_size = 0 ; i < size ; ++i) {
+        if (i == rank) {
+            continue;
+        }
+        packed_size = rcounts[i] * type_size;
+        max_size = packed_size > max_size ? packed_size : max_size;
+    }
 
     /* Easy way out */
     if ((1 == size) || (0 == max_size) ) {
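
For readers following the second hunk, here is a minimal standalone sketch of the sizing change. It is not Open MPI code: `max_peer_packed_size` and `self_packed_size` are hypothetical helper names, and `type_size` is passed in directly rather than queried from a datatype. The first helper mirrors the new loop (largest block received from any other rank), the second mirrors the removed two lines (size of the rank's own block).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Open MPI: size in bytes of the largest
 * block this rank receives from any *other* rank. Mirrors the loop added
 * by the commit. */
static size_t max_peer_packed_size(const int *rcounts, int size, int rank,
                                   size_t type_size)
{
    size_t max_size = 0;

    assert(rcounts != NULL && rank >= 0 && rank < size);
    for (int i = 0; i < size; ++i) {
        if (i == rank) {
            continue;   /* the rank's own block is skipped, as in the diff */
        }
        size_t packed_size = (size_t)rcounts[i] * type_size;
        if (packed_size > max_size) {
            max_size = packed_size;
        }
    }
    return max_size;
}

/* Hypothetical helper mirroring the removed lines: the size of the rank's
 * own block, which says nothing about what the peers will send. */
static size_t self_packed_size(const int *rcounts, int rank, size_t type_size)
{
    assert(rcounts != NULL && rank >= 0);
    return (size_t)rcounts[rank] * type_size;
}
```

With `rcounts = {2, 0, 8, 3}`, a 4-byte type, and rank 1, the old sizing yields 0 while the new one yields 32. Under the old code the `0 == max_size` test in the "Easy way out" check shown in the diff would be true for that rank even though its peers have data to exchange, which appears to be the case this commit addresses.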
