osc rdma hang with multiple windows #2530

Closed
@markalle

Using the test case mt_1sided.c (and its support files 1sided.c, mt_1sided_td1.c, and mt_1sided_td2.c) from this test harness pull request:
https://github.com/open-mpi/ompi-tests/pull/25
I get a hang from the osc rdma component:

% mpicc -o x mt_1sided.c mt_1sided_td1.c mt_1sided_td2.c
% mpirun -host hostA,hostB -mca osc rdma -mca pml ob1 -mca btl openib,self,vader ./x

This was from a vanilla build of openmpi-master-201612022109-0366f3a.

The single-threaded "1sided.c" passes. I think the key difference in the multi-threaded version is that two windows are active at once (one for each thread).

1sided.c is a fairly broad test, looping over several synchronization types as well as contiguous and non-contiguous datatypes. I could probably whittle the test down to target this particular hang more directly if needed.
