Skip to content

bug in datatype handling #7019

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tjahns opened this issue Sep 27, 2019 · 11 comments
Closed

bug in datatype handling #7019

tjahns opened this issue Sep 27, 2019 · 11 comments

Comments

@tjahns
Copy link
Contributor

tjahns commented Sep 27, 2019

I've found and now finally generated a small reproducer for a bug in handling of mildly complex datatypes introduced in commit 639f4b1 by @bosilca

In 2.0.3 the problem can simply be fixed by copying over the prior version of opal/datatype/opal_datatype_add.c from 2.0.2 but for more recent versions the fix is not so straight-forward obviously

The issue affects all versions of OpenMPI from 2.0.3 onwards including 4.0.1.

I've tested 2.0.2, 2.0.3, 2.0.4 and 4.0.1 and a colleague tested 2.1. In all cases except 2.0.2 for which I had an operating system package (but also verified from source tarball) we built from released source tarballs.

  • Operating system/version:

I've tested this on multiple installations of OpenMPI on Linux (Debian 9 and RedHat 6)

  • Computer hardware:

x86_64

  • Network type:

Can be reproduced with IB, tcp, MPI_COMM_SELF and shared memory.


Details of the problem

The two attached example programs both build with a simple mpicc invocation. In case the C compiler is a little older, -std=gnu99 might be needed for the second example. Both programs are expected to silently report success but will fail with diagnostic output on affected OpenMPI versions.

$ mpicc -o dt openmpi_datatype.c
$ mpirun -np 1 ./dt
pack_buffer[1] = 1 != src_data[0] = 0
pack_buffer[2] = 2 != src_data[0] = 0
dst_data[1] = 1 != ref_dst_data[1] = 0
dst_data[2] = 2 != ref_dst_data[2] = 0
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[11460,1],0]
  Exit code:    1
--------------------------------------------------------------------------
$ mpicc -std=gnu99 -o dt2 openmpi_datatype2.c
$ mpirun -np 1 ./dt2
data mismatch at j=1, i=0
b = {
     9, 14, 13, 12, 11, 10,  9, 14,
    15,  9, 10, 11, 12, 13, 16,  9,
    22, 17, 18, 19, 20, 21, 22, 17,
    30, 25, 26, 27, 28, 29, 30, 25 }
b expected = {
     9, 14, 13, 12, 11, 10,  9, 14,
    14,  9, 10, 11, 12, 13, 14,  9,
    22, 17, 18, 19, 20, 21, 22, 17,
    30, 25, 26, 27, 28, 29, 30, 25 }
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[11281,1],0]
  Exit code:    1
--------------------------------------------------------------------------

openmpi_datatype.zip

@gpaulsen
Copy link
Member

@tjahns Thank you for reporting, and root causing.

Our active release branches are currently v3.0.x, v3.1.x, and v4.0.x, and so we are not currently planning any more releases on earlier branches. There is some precedent to put important fixes on earlier branches in case users wish to download and build themselves.

@bosilca do you have time to look at this?

@bosilca
Copy link
Member

bosilca commented Sep 27, 2019

The problem is that for a yet unknown reason the datatype engine thinks both data generated by your tests are empty, containing no predefined types.

[speedy.local:84981] Datatype 0x7fc2bce319e0[Dup ] id -1 size 0 align 1 opal_id 0 length 3 used 0
true_lb 9223372036854775807 true_ub -9223372036854775808 (true_extent 1) lb 36 ub 68 (extent 32)
nbElems 0 loops 0 flags 114 (committed contiguous )-cC----GD--[---][---]
   contain
No optimized description

@bosilca
Copy link
Member

bosilca commented Sep 27, 2019

I take that back, the test is using MPI_INTEGER (a fortran datatype) that translate to an empty datatype when Fortran support is turned off.

@ggouaillardet
Copy link
Contributor

Here is a trimmed version of the second program (the issue is really in the send type)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  MPI_Datatype types[6], ddt;
  MPI_Type_vector(4, 1, 0, MPI_INT, types+0);
  MPI_Type_dup(MPI_INT, types+1);
  MPI_Type_vector(4, 2, 1, MPI_INT, types+2);
  MPI_Type_vector(3, 1, 0, MPI_INT, types+3);
  MPI_Type_dup(MPI_INT, types+4);
  int blkls[5] = { 6,  1,  1,  6,  1 };
  int displs[5] = { 0,  5,  8,  8,  13 };
  MPI_Type_indexed(5, blkls, displs, MPI_INT, types+5);
  int blkls2[6] = { 1,  1,  1,  1,  1,  1 };
  MPI_Aint disps2[6] = { 36, 40, 40, 56, 68, 68};
  MPI_Datatype ddt_send_from_0_to_0_0_0;
  MPI_Type_create_struct(6, blkls2, disps2, types, &ddt);
  for (int i=0; i<6; i++) MPI_Type_free(types+i);
  MPI_Type_commit(&ddt);
  int A[32], B[32];
  for (int i=0; i<32; i++) A[i] = i;
  for (int i=0; i<32; i++) B[i] = -1;
  int ref[32] = {
    9, 9, 9, 9,
    10,
    10,11,11,12,12,13,13,14,
    14,14,14,
    17,
      17,18,19,20,21,22,
      22,
      25,
      25,26,27,28,29,30,
      30
  };
  MPI_Sendrecv(A, 1, ddt, 0, 1,
               B, 32, MPI_INT, 0, 1,
               MPI_COMM_SELF, MPI_STATUS_IGNORE);

  if (memcmp(B, ref, 32*sizeof(int))) {
    printf ("KO\n");
    printf ("EXPECTED: ");
    for (int i=0; i<32; i++) printf ("%2d ", ref[i]);
    printf ("\n");
    printf ("GOT     : ");
    for (int i=0; i<32; i++) printf ("%2d ", B[i]);
    printf ("\n");
    printf ("          ");
    for (int i=0; i<32; i++) printf ((ref[i] == B[i])?"   ":"^^ ");
    printf ("\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
  } else {
    printf ("OK\n");
  }

  MPI_Finalize();
  return 0;
}

@bosilca
Copy link
Member

bosilca commented Sep 27, 2019

The provided tests work find in master and the next release (branch 4.x). Thus the issue seems to have been solved by #6952 (and the corresponding PR in the 4.x branch). As decided in #6978 the fix cannot be back-ported to the 3.x branches, nor to anything older.

@hppritcha
Copy link
Member

hppritcha commented Sep 27, 2019

just checked and with 4.0.2rc2 Gilles version passes.

@bosilca
Copy link
Member

bosilca commented Sep 27, 2019

I understand how this issue is triggered. Basically, the optimization of the datatype in the commit is overzealous and merges 2 descriptions that should not be merged. It basically transform

------PGD--[---][---]      OPAL_INT4 count 3 disp 0x38 (56) blen 0 extent 0 (size 12)
--C---P-D--[---][---]      OPAL_INT4 count 1 disp 0x44 (68) blen 0 extent 4 (size 4)

into

-cC---P-DB-[---][---]      OPAL_INT4 count 4 disp 0x38 (56) blen 1 extent 4 (size 16)

The problem here was that the first of the two predefined types in the original description has an extent of 0 (it is a legal way to send multiple times the same data), but this 0 extent is lost during the optimization as it is transformed into an extent of 4 bytes (with an updated count).

This is exactly the problem that was fixed in #6952 but as I mentionned in #6978 it is too time consuming to backport the patch.

That being said you are not completely out of luck. I came up with a quick fix, dirty as it might cut others optimizations, but at least it seems to solve the issue highlighted here.

diff --git a/opal/datatype/opal_datatype_optimize.c b/opal/datatype/opal_datatype_optimize.c
index 882e3a8..2ab02db 100644
--- a/opal/datatype/opal_datatype_optimize.c
+++ b/opal/datatype/opal_datatype_optimize.c
@@ -210,7 +210,8 @@ opal_datatype_optimize_short( opal_datatype_t* pData,
             continuity = ((last_disp + (ptrdiff_t)last_length * (ptrdiff_t)opal_datatype_basicDatatypes[last_type]->size)
                           == (total_disp + pData->desc.desc[pos_desc].elem.disp));

-            if( (pData->desc.desc[pos_desc].elem.common.flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) && continuity &&
+            if( ((pData->desc.desc[pos_desc].elem.common.flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) && continuity) &&
+                ((0 == last_length) || (last_extent == (int32_t)opal_datatype_basicDatatypes[last_type]->size)) &&
                 (pData->desc.desc[pos_desc].elem.extent == (int32_t)opal_datatype_basicDatatypes[type]->size) ) {
                 if( type == last_type ) {
                     last_length += pData->desc.desc[pos_desc].elem.count;

@gpaulsen
Copy link
Member

@bosilca Thanks for investigating, and the clear write up as to how it might affect the release branches. We're tagging v4.0.2rc3 today, and this write-up helped us understand that we don't need to hold it for this issue.

@tjahns
Copy link
Contributor Author

tjahns commented Sep 30, 2019

Thanks for investigating, I should probably check master too next time. I'll see if I can get packagers like spack to adopt the patch for as many versions as feasible.

@tjahns tjahns closed this as completed Sep 30, 2019
@jsquyres
Copy link
Member

Hey @bosilca -- can that 2-line patch be applied to v3.0.x / v3.1.x?

@bosilca
Copy link
Member

bosilca commented Sep 30, 2019

It was designed for the 3.x series, it should apply smoothly to both branches. A quick check seems to indicate it could also be applied (shifted by few lines) to the 2.x if someone cares.

jsquyres added a commit to jsquyres/ompi that referenced this issue Sep 30, 2019
Do not remove 0-extent instances while optimizing the datatype.  This
is a distillation of a larger set of DDT bug fixes and optimizations
from master.  George tried to cherry-pick these fixes+optimizations
back to the v3.0.x/v3.1.x branches and it resulted in a giant list of
heavily-modified cherry-picked commits (see
open-mpi#6978).  This did not seem like a
good idea.

George therefore distilled the critical bug fix down to this small
commit, suitable for both the v3.0.x/v3.1.x branches.  See
open-mpi#7019 (comment)
for details.

Signed-off-by: Jeff Squyres <[email protected]>
jsquyres added a commit to jsquyres/ompi that referenced this issue Sep 30, 2019
Do not remove 0-extent instances while optimizing the datatype.  This
is a distillation of a larger set of DDT bug fixes and optimizations
from master.  George tried to cherry-pick these fixes+optimizations
back to the v3.0.x/v3.1.x branches and it resulted in a giant list of
heavily-modified cherry-picked commits (see
open-mpi#6978).  This did not seem like a
good idea.

George therefore distilled the critical bug fix down to this small
commit, suitable for both the v3.0.x/v3.1.x branches.  See
open-mpi#7019 (comment)
for details.

Signed-off-by: Jeff Squyres <[email protected]>
(cherry picked from commit b77ee56)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants