error handling large datatypes #6016

Closed
@ggouaillardet

Description

The issue was initially reported at https://www.mail-archive.com/[email protected]/msg20812.html

The inline program below can be used to reproduce the issue.
It has to be run with a single MPI task and requires ~64 GB of memory (!).

#include <mpi.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char * argv[]) {

  MPI_Init(&argc, &argv);

  int const per_process = 192;
  int const per_type = 20000000;
  size_t const bufsize = (size_t)per_type * (size_t)per_process * 4 * (size_t)sizeof(float);

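  float * const buffer = malloc(bufsize);
  if (buffer == NULL) {
    /* The reproducer is meaningless without the full allocation. */
    fprintf(stderr, "malloc of %zu bytes failed\n", bufsize);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }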

  int scounts[2] = {per_process, per_process};
  int sdispls[2] = {3*per_process, 0*per_process};
  int rcounts[2] = {per_process, per_process};
  int rdispls[2] = {1*per_process, 2*per_process};

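  /* %p expects void *, so cast the float * addresses explicitly. */
  printf("buffer %p-%p : %p-%p\n",
         (void *)(buffer + (size_t)per_type * (size_t)sdispls[0]),
         (void *)(buffer + (size_t)per_type * (size_t)(sdispls[0] + scounts[0])),
         (void *)(buffer + (size_t)per_type * (size_t)sdispls[1]),
         (void *)(buffer + (size_t)per_type * (size_t)(sdispls[1] + scounts[1])));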

  MPI_Datatype ddt, stype, rtype;
  MPI_Type_contiguous(per_type, MPI_FLOAT, &ddt);
  MPI_Type_indexed(2, scounts, sdispls, ddt, &stype);
  MPI_Type_commit(&stype);
  MPI_Type_indexed(2, rcounts, rdispls, ddt, &rtype);
  MPI_Type_commit(&rtype);

  MPI_Sendrecv(buffer, 1, stype, 0, 0,
               buffer, 1, rtype, 0, 0,
               MPI_COMM_SELF, MPI_STATUS_IGNORE);

  MPI_Type_free(&stype);
  MPI_Type_free(&rtype);
  MPI_Type_free(&ddt);
  free(buffer);

  MPI_Finalize();

  return 0;
}

Running this under a debugger, I found that Open MPI tries to pack more data than necessary.
At this stage, I could not determine why, nor could I find any obvious integer overflow.
