oshmem: add some useless but inoffensive communications and see how o… #4734

Closed. ggouaillardet wants to merge 2 commits.

Conversation

ggouaillardet
Contributor

…shmem+ucx handles that

Signed-off-by: Gilles Gouaillardet [email protected]

@ggouaillardet
Contributor Author

@yosefe there might be.
#4674 fails to pass the CI with UCX, and from my point of view, I do not think the PR itself is to blame. At a very early stage, irecv() receives a truncated message (which is not properly handled, but that is another issue). The goal of this PR is to get a better picture of what is going wrong.

@ggouaillardet ggouaillardet force-pushed the debug/ucx branch 2 times, most recently from 8955712 to 0252461 Compare January 21, 2018 03:52
@ibm-ompi

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/e9c85b22d9b273bc0d4b9759fb6418fb

@ibm-ompi

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/6b75fe520b36c958476b22b25dd29344

@ibm-ompi

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/2f9d6eecef186c3bc0f3d0b7c317e1a0

@ibm-ompi

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/f236fa19ddd296534d4dd068413b4840

@ibm-ompi

The IBM CI (PGI Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/6daefd4ce9d5b60a396f1175169af8b9

@ibm-ompi

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/cd479eee9d454e729dda217c396210cf

@ibm-ompi

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/7fcbddbc11d3bbc3422e12ad7b29d0fc

@ibm-ompi

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/1f3cd58761d8210f3d9079db09aec046

@ibm-ompi

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/ac8ee133e6dd2ca621ea5062cabf31cd

…shmem+ucx handles that

Signed-off-by: Gilles Gouaillardet <[email protected]>

instead of invoking ompi_request_test_all(), which would end up
calling opal_progress() recursively, manually check the status
of the requests.

the same method is used in ompi_comm_request_progress()

Refs open-mpi#3901

Signed-off-by: Gilles Gouaillardet <[email protected]>
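
The idea in the commit message above (checking request status by hand rather than calling a test routine that drives progress itself) can be sketched roughly as follows. This is only an illustration under stated assumptions: `sketch_request_t` and `sketch_all_complete()` are hypothetical names invented for this example and are not Open MPI's `ompi_request_t` or any real Open MPI function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request object (an assumption for this
 * sketch, not Open MPI's ompi_request_t). */
typedef struct {
    volatile bool complete; /* set elsewhere when the operation finishes */
} sketch_request_t;

/* Return true only if every request is already marked complete.
 * The function only reads flags and never drives progress itself,
 * so calling it from inside a progress callback cannot recurse. */
static bool sketch_all_complete(const sketch_request_t *reqs, size_t count)
{
    assert(reqs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!reqs[i].complete) {
            return false; /* at least one request is still pending */
        }
    }
    return true;
}
```

A progress callback written this way would simply return early when `sketch_all_complete()` is false and try again on its next invocation, instead of blocking or re-entering the progress engine.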
@ggouaillardet
Contributor Author

@yosefe there is no problem with UCX or SHMEM :-)
I screwed up (e.g. isend(MPI_INT) but irecv(MPI_BYTE)).
IIRC, UCX did crash, but for some reason MXM did not ...
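
To illustrate why a datatype mismatch like the one above can surface as a truncated message, here is a minimal sketch in plain C. It assumes the send and receive used the same element count, which the thread does not state. The helper names are hypothetical and no MPI calls are made; the type sizes are passed in as byte counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes covered by `count` elements of a type that is `type_size` bytes wide. */
static size_t sketch_message_bytes(size_t count, size_t type_size)
{
    assert(type_size > 0); /* a zero-size type would make the comparison meaningless */
    return count * type_size;
}

/* True when the incoming message is larger than the posted receive buffer,
 * which is the situation MPI reports as a truncation error. */
static bool sketch_would_truncate(size_t send_count, size_t send_type_size,
                                  size_t recv_count, size_t recv_type_size)
{
    return sketch_message_bytes(send_count, send_type_size) >
           sketch_message_bytes(recv_count, recv_type_size);
}
```

With equal counts, an `int` send against a one-byte receive type gives `sketch_would_truncate(4, sizeof(int), 4, 1) == true`, while matching the types on both sides, or posting `4 * sizeof(int)` bytes on the receive side, avoids the truncation.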
