mpi4py: Regressions in main: segv in MPI_Init_thread #11433
Comments
@rhc54 Maybe you can shed some light on this one?
I'm afraid I cannot extract the error from those outputs. Can you provide the specific error that is motivating this issue? I saw something about a segfault in the sm btl and then a server cannot be found, but that has nothing to do with me.
@rhc54 The issue is happening at MPI_Init_thread. @hppritcha Maybe your changes from #11305? This is the error from the logs.
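For context, mpi4py initializes MPI when the `MPI` module is imported, by default requesting `MPI_THREAD_MULTIPLE` via `MPI_Init_thread`, so a bare import is enough to reach the crashing call. A minimal sketch (not the actual failing test):

```python
# A bare import triggers MPI initialization: mpi4py calls MPI_Init_thread
# here, requesting MPI_THREAD_MULTIPLE by default.
from mpi4py import MPI

# Report the thread support level the MPI library actually provided.
print("thread level:", MPI.Query_thread())
```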
Hmm, I thought that last commit I pushed to #11305 addressed this behavior.
@hppritcha I can confirm the regression is related to your changes.
@dalcinl do you see the behavior with any of the non-spawn tests?
@hppritcha Well, I'm not sure what to say... It depends on the day 😞 This is the first time I got the failure from a scheduled build, yesterday (commit 478b6b2): This other failure is from a build I triggered manually today (same commit 478b6b2):
closed via #11445 |
https://github.com/mpi4py/mpi4py-testing/actions/runs/4238187139/jobs/7368190440
https://github.com/mpi4py/mpi4py-testing/actions/runs/4238187139/jobs/7368941976
Please note that the failure happens in a (heavily?) oversubscribed scenario. GitHub Actions runners have two virtual cores, and I'm running 5 MPI processes there, plus a few more spawned during the tests. Not sure if this is relevant, but the observation may help the experts figure out what could be going wrong.
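For illustration, a minimal sketch of an oversubscribed spawn run similar to the one described above; the file names (`parent.py`, `child.py`) and the exact process counts are assumptions, not taken from the mpi4py test suite:

```python
# parent.py -- run as: mpiexec --oversubscribe -n 5 python parent.py
import sys
from mpi4py import MPI

# All parent ranks collectively spawn 2 extra Python workers on the same node,
# pushing the 2-core runner further into oversubscription.
inter = MPI.COMM_WORLD.Spawn(sys.executable, args=["child.py"], maxprocs=2, root=0)
inter.Barrier()     # synchronize with the spawned children
inter.Disconnect()
```

```python
# child.py -- executed by the spawned workers
from mpi4py import MPI

parent = MPI.Comm.Get_parent()   # intercommunicator back to the parent job
parent.Barrier()
parent.Disconnect()
```

On a two-core runner the parent job is already oversubscribed before the spawn; `--oversubscribe` is the Open MPI launcher option that allows more ranks than available slots.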
PS: I'm now running these tests daily, so the regression should come from very recent changes pushed to main.