Significant performance regression under Linux with busy waiting enabled #10929
hjelmn
added a commit
to hjelmn/ompi
that referenced
this issue
Oct 14, 2022
Under normal circumstances epoll and poll produce similar performance on Linux. When busy polling is enabled they do not. Testing with a TCP-based system shows a significant performance degradation when using poll with busy waiting enabled. This performance regression is not seen when using epoll. This PR adjusts the default value of opal_event_include to epoll on Linux only to fix the regression. Fixes open-mpi#10929 Signed-off-by: Nathan Hjelm <[email protected]>
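For builds that predate this change, the same selection can presumably be made at run time, since opal_event_include is an MCA parameter. The sketch below is only an illustration, not part of the commit: it uses Open MPI's usual `--mca <name> <value>` option and the `OMPI_MCA_<name>` environment variable convention, and the helper names and the application path are hypothetical.

```python
import os


def with_epoll_backend(env=None):
    """Return a copy of the environment that asks Open MPI to use epoll.

    Relies on the OMPI_MCA_<param> environment variable convention.
    """
    env = dict(os.environ if env is None else env)
    env["OMPI_MCA_opal_event_include"] = "epoll"
    return env


def mpirun_command(app_args, nprocs):
    """Build an mpirun command line that selects epoll explicitly."""
    return [
        "mpirun",
        "--mca", "opal_event_include", "epoll",
        "-np", str(nprocs),
        *app_args,
    ]
```

Either form could then be passed to `subprocess.run`, for example `subprocess.run(mpirun_command(["./my_app"], 4), check=True)`, where `./my_app` stands in for the real application.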
hjelmn
added a commit
to hjelmn/ompi
that referenced
this issue
Oct 17, 2022
Under normal circumstances epoll and poll produce similar performance on Linux. When busy polling is enabled they do not. Testing with a TCP-based system shows a significant performance degradation when using poll with busy waiting enabled. This performance regression is not seen when using epoll. This PR adjusts the default value of opal_event_include to epoll on Linux only to fix the regression. Fixes open-mpi#10929 Signed-off-by: Nathan Hjelm <[email protected]> (cherry picked from commit 279f6b6)
yli137
pushed a commit
to yli137/ompi
that referenced
this issue
Jan 10, 2024
Under normal circumstances epoll and poll produce similar performance on Linux. When busy polling is enabled they do not. Testing with a TCP-based system shows a significant performance degradation when using poll with busy waiting enabled. This performance regression is not seen when using epoll. This PR adjusts the default value of opal_event_include to epoll on Linux only to fix the regression. Fixes open-mpi#10929 Signed-off-by: Nathan Hjelm <[email protected]>
Thank you for taking the time to submit an issue!
Background information
I am looking at a system which is showing very poor performance with some applications (notably OpenFOAM) when running Open MPI. The system in question has busy polling enabled to (in theory) improve message latency:
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v4.1.x, main, etc
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Both from release tarball and git checkout.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
Details of the problem
With these settings we see a 3-5x slowdown in the performance of all steps of OpenFOAM motorbike (can give run details if requested) vs with them both set to 0 (busy waiting disabled).
Without busy polling:
With busy polling:
The only difference between these runs is busy polling enabled vs disabled.
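The exact settings are not reproduced in this copy of the report. On Linux, socket busy polling is commonly controlled by the net.core.busy_poll and net.core.busy_read sysctls, so the sketch below assumes those are the two values meant by "both set to 0". The helper names are hypothetical; it only reads the values and changes nothing.

```python
from pathlib import Path

# Assumption: these are the two settings the report refers to.
BUSY_POLL_SYSCTLS = ("net.core.busy_poll", "net.core.busy_read")


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a sysctl, or None if it cannot be read."""
    path = Path(root, *name.split("."))
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def busy_polling_enabled(root="/proc/sys"):
    """Return (enabled, values); enabled is True if any value is non-zero."""
    values = {name: read_sysctl(name, root) for name in BUSY_POLL_SYSCTLS}
    return any(v for v in values.values()), values
```

Calling `busy_polling_enabled()` on each node before a run gives a quick way to confirm which of the two configurations is actually in effect.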