Crash or hang (depending on the OpenMPI version) when receiver and sender use different datatypes #3937


Closed
densamoilov opened this issue Jul 20, 2017 · 3 comments

Comments


densamoilov commented Jul 20, 2017

Hi all,

I'm a developer of the Intel Math Kernel Library, and for our cluster components we provide OpenMPI support to our customers. Recently we ran into an issue with sending and receiving when the sender and receiver use different datatypes.

I tried several OpenMPI versions: 1.6.1 (hang), 1.8.1 and 2.1.1 (crash). These versions were downloaded from the OpenMPI site and built from source.

Information about the system:
OS: RHEL 7.2
CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Network type: N/A (all processes run within a single node).

You can find a reproducer below. It works fine with other MPI implementations such as Intel MPI and MPICH. I run it on 2 processes, and as a result I observe the following error message:

[mkl:147075] *** An error occurred in MPI_Bcast
[mkl:147075] *** reported by process [3517710337,1]
[mkl:147075] *** on communicator MPI COMMUNICATOR 6 SPLIT FROM 3
[mkl:147075] *** MPI_ERR_TRUNCATE: message truncated
[mkl:147075] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mkl:147075] ***    and potentially your MPI job)

#include <mpi.h>
#include <stdlib.h>

/* Rank 1 describes the buffer as m*n individual MPI_FLOATs; every other rank
 * describes the same memory as a single vector type (n blocks of m floats
 * with stride lda). With lda == m both type maps cover the same m*n floats. */
MPI_Datatype get_type(int m, int n, int lda, MPI_Datatype type, int rank, int *count)
{
    if (rank == 1) {
        (*count) = m * n;
        return type;
    }

    MPI_Datatype new_type;

    MPI_Type_vector(n, m, lda, type, &new_type);
    MPI_Type_commit(&new_type);
    (*count) = 1;

    return new_type;
}


int main(int argc, char** argv)
{
    int m = 1000;
    int n = 1000;
    int lda = 1000;

    int size, rank;
    int count = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    float* buffer = (float*) malloc(sizeof(float) * m * n);

    MPI_Datatype my_type = get_type(m, n, lda, MPI_FLOAT, rank, &count);

    /* The root (rank 0) uses the derived vector type while rank 1 uses plain
     * MPI_FLOATs; the type maps match, so this broadcast is a valid MPI call. */
    MPI_Bcast((void*)buffer, count, my_type, 0, MPI_COMM_WORLD);

    free(buffer);
    /* Only the derived type may be freed; predefined datatypes such as
     * MPI_FLOAT must not be passed to MPI_Type_free. */
    if (rank != 1)
        MPI_Type_free(&my_type);

    MPI_Finalize();

    return 0;
}
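
For reference, the reproducer can be built and launched on two ranks roughly as follows (a sketch; the file name repro.c is only an example):

mpicc repro.c -o repro
mpirun -np 2 ./repro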

bosilca commented Jul 20, 2017

Denis, we have a long-standing bug related to collective communications using different datatypes when the tuned collective module is used and pipelining is enabled. Because of the different datatypes, the processes decide to pipeline the collective at different granularities, which in some cases leads to data truncation. The open issue tracking this topic is #1763.

There is no known quick solution. You can try not using the tuned module (--mca coll ^tuned), or you can enable dynamic collective decisions and then remove all pipelining for the collectives that use different datatypes (more info in our FAQ), but either approach will affect all usages of the particular collective.
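
For example, the reproducer above can be launched with the tuned module disabled roughly like this (a sketch; the binary name ./repro is assumed from the build step shown earlier):

mpirun -np 2 --mca coll ^tuned ./repro

The dynamic-decision alternative is driven by additional coll_tuned MCA parameters; see the FAQ mentioned above for details.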

densamoilov (Author) commented

Thanks for the detailed explanation!

hppritcha (Member) commented

Closing this issue as it is being tracked by #1763.
