Closed
On master head, I see that the IBM tests for duplicating datatypes are consistently segv'ing:
$ cd ompi-tests/ibm/datatype
$ ./type_dup_fn_mpifh
...segv...
This happens in all 3 of the Fortran tests (type_dup_fn_[mpifh|usempi|usempif08]). It does not seem to happen in the C version of this same test (type_dup_fn), which is... weird.
Here's a stack trace from a resulting corefile:
#0 0x0000003fed232495 in raise () from /lib64/libc.so.6
#1 0x0000003fed233c75 in abort () from /lib64/libc.so.6
#2 0x0000003fed2703a7 in __libc_message () from /lib64/libc.so.6
#3 0x0000003fed275dee in malloc_printerr () from /lib64/libc.so.6
#4 0x0000003fed278c3d in _int_free () from /lib64/libc.so.6
#5 0x00002aaaac3e029a in opal_datatype_destruct (datatype=0x731d80)
    at opal_datatype_create.c:83
#6 0x00002aaaab1f0de1 in opal_obj_run_destructors (object=0x731d80)
    at ../../opal/class/opal_object.h:462
#7 0x00002aaaab1f1269 in ompi_datatype_destroy (type=0x7fffffffcf10)
    at ompi_datatype_create.c:90
#8 0x00002aaaab2c3d67 in PMPI_Type_free (type=0x7fffffffcf10) at ptype_free.c:60
#9 0x00002aaaaaf1fdff in ompi_type_free_f (type=0x7fffffffcf6c,
    ierr=0x7fffffffcf68) at type_free_f.c:76
#10 0x00002aaaaaf1fdcf in mpi_type_free_ (type=0x7fffffffcf6c,
    ierr=0x7fffffffcf68) at type_free_f.c:56
#11 0x0000000000400dde in mpi_type_dup_fn_mpifh () at type_dup_fn_mpifh.f90:22
#12 0x0000000000400e58 in main (argc=1, argv=0x7fffffffd5e5)
    at type_dup_fn_mpifh.f90:27
#13 0x0000003fed21ed1d in __libc_start_main () from /lib64/libc.so.6
#14 0x0000000000400c29 in _start ()
I notice that a free() is failing because it appears to be freeing datatype->ptypes, which appears to be a non-malloc'ed pointer somehow:
83 free(datatype->ptypes);
(gdb) p datatype->ptypes
$1 = (size_t *) 0x2aaaab516000 <__compound_literal.2>
@bosilca I tried to dig into this but couldn't figure out where ptypes came from...
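For anyone reading along, here is a minimal sketch of the failure mode the gdb output hints at. It is not a diagnosis of where ptypes actually comes from. A symbol named `__compound_literal.N` is what GCC typically emits for a C compound literal with static storage, and passing such a pointer to free() is undefined behavior (glibc usually aborts, as in the trace above). Every name below (`demo_type_t`, `demo_dup`, `demo_destruct`, `predefined_ptypes`, `NTYPES`, the `ptypes_on_heap` flag) is hypothetical and does not exist in Open MPI. The sketch assumes one plausible scenario: a dup that shares the source's pointer versus one that deep-copies it and records ownership.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NTYPES 4  /* hypothetical array length, for illustration only */

/* Hypothetical stand-in for a datatype descriptor; NOT the real opal_datatype_t. */
typedef struct {
    size_t *ptypes;          /* per-basic-type counts */
    int     ptypes_on_heap;  /* 1 only if ptypes came from malloc() */
} demo_type_t;

/* A compound literal at file scope has static storage duration.
 * The pointer below therefore must never be passed to free(). */
static size_t *const predefined_ptypes = (size_t[NTYPES]){0, 1, 0, 0};

/* Deep-copy dup: the copy gets its own heap array and remembers that it
 * owns it. A shallow "dst->ptypes = src->ptypes" would instead leave the
 * copy pointing at static storage. Returns 0 on success, -1 on failure. */
int demo_dup(const demo_type_t *src, demo_type_t *dst)
{
    assert(src != NULL && dst != NULL);
    dst->ptypes = malloc(NTYPES * sizeof *dst->ptypes);
    if (dst->ptypes == NULL) {
        dst->ptypes_on_heap = 0;
        return -1;
    }
    memcpy(dst->ptypes, src->ptypes, NTYPES * sizeof *dst->ptypes);
    dst->ptypes_on_heap = 1;
    return 0;
}

/* Destructor that frees only memory the object actually owns. */
void demo_destruct(demo_type_t *t)
{
    if (t->ptypes_on_heap) {
        free(t->ptypes);
    }
    t->ptypes = NULL;
    t->ptypes_on_heap = 0;
}
```

With this layout, a descriptor initialized as `{ predefined_ptypes, 0 }` can be destructed safely, while a destructor that calls free() unconditionally on that same pointer would fail the way frame #5 does.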