error handling large datatypes #6016
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 4, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 4, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 5, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 19, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 19, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 19, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Nov 19, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
@ggouaillardet Can you add a derivative of this test program to the ibm test suite? Perhaps modify it to do the malloc only on local rank 0 (so that it can still be run in MTT with more than ppn=1), add a check for malloc failure, etc.
I added a test in #6029 based on the example presented here, but without any memory allocation.
ggouaillardet
added a commit
to ggouaillardet/ompi
that referenced
this issue
Dec 6, 2018
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
bosilca
pushed a commit
to bosilca/ompi
that referenced
this issue
Mar 6, 2019
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
hoopoepg
pushed a commit
to hoopoepg/ompi
that referenced
this issue
Mar 6, 2019
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit fbb5bb8) Conflicts: opal/datatype/opal_convertor_raw.c
bosilca
pushed a commit
to bosilca/ompi
that referenced
this issue
Sep 13, 2019
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
markalle
pushed a commit
to markalle/ompi
that referenced
this issue
Sep 12, 2020
Always use size_t (instead of converting to an uint32_t) in order to correctly support large datatypes. Thanks Ben Menadue for the initial bug report Refs open-mpi#6016 Signed-off-by: Gilles Gouaillardet <[email protected]>
The issue was initially reported at https://www.mail-archive.com/[email protected]/msg20812.html
The inline program above can be used to reproduce the issue.
It has to be run with one MPI task and requires ~64 GB of memory (!).
I ran this under the debugger and found Open MPI tries to pack more data than necessary.
At this stage, I have not found why, nor have I found any obvious integer overflow.