Failure to stop at MPIR_Breakpoint with OpenMPI v3.1.x and v4.0.x #7757
Comments
What version of automake are you using? It was found that older versions of automake are adding the -O3, and newer versions are doing the correct thing in not adding it. See this comment/discussion: |
it's also documented in the code of orted_submit.c, for reference: |
I think @awlauria is asking: are you running |
I was building out-of-the-box openmpi-3.1.6.tar.gz and openmpi-4.0.3.tar.gz when first testing. Using
By just running ./configure / make you get the following line in the orte Makefile:
Later I pulled from the most recent branch of
To ensure that |
Is there any update on this issue? This issue persists with OpenMPI 4.1.0 recently released and PGI 20.1. orted_submit.c is still being compiled with -O3. |
@louisespellacy-arm sorry, this slipped through the cracks. I'll take a look. |
Attempted to fix the above issue with the following changes, but I'm unsure whether it's the best approach:
diff --git a/orte/orted/Makefile.am b/orte/orted/Makefile.am
index 1235e51e69..7523cc5336 100644
--- a/orte/orted/Makefile.am
+++ b/orte/orted/Makefile.am
@@ -19,6 +19,8 @@
#
# $HEADER$
#
+CFLAGS = $(CFLAGS_WITHOUT_OPTFLAGS) $(DEBUGGER_CFLAGS)
+#
# This makefile.am does not stand on its own - it is included from orte/Makefile.am
@@ -38,7 +40,7 @@ lib@ORTE_LIB_PREFIX@open_rte_la_SOURCES += \
noinst_LTLIBRARIES += liborted_mpir.la
liborted_mpir_la_SOURCES = \
orted/orted_submit.c
-liborted_mpir_la_CFLAGS = $(CFLAGS_WITHOUT_OPTFLAGS) $(DEBUGGER_CFLAGS)
+#liborted_mpir_la_CFLAGS = $(CFLAGS_WITHOUT_OPTFLAGS) $(DEBUGGER_CFLAGS)
lib@ORTE_LIB_PREFIX@open_rte_la_LIBADD += liborted_mpir.la |
@louisespellacy-arm while that will work in a pinch, I don't think that is the best solution. Doing that will propagate the debugger CFLAGS for everything below the orte tree (losing -O3) for basically all of orte as far as I can tell. Unfortunately, it seems that libtool is blasting every file with One way that I can think of is to have something like an OPAL_C/CPPFLAGS variable that propagates down to every Makefile.am. It's a tedious change, but will give more control to every library component on what CFLAGS to build with. |
I'm no configure expert, so maybe there's a better way. I have a branch where this is partially implemented, and it seems to work as well as before. Basically I just replaced CFLAGS/CPPFLAGS with OPAL_CFLAGS/OPAL_CPPFLAGS, and set CFLAGS/CPPFLAGS to be empty, preventing libtool from tacking them on. Then I went across a subset of makefiles and set them for each library. They will also have to be passed down to the likes of hwloc/prrte and other 3rd-party packages, but that should be just as easy. This approach is in line with what GNU suggests in the automake docs:
https://www.gnu.org/software/automake/manual/html_node/Flag-Variables-Ordering.html I can continue this work and post a patch if the community is interested. |
The OPAL_{CFLAGS|CPPFLAGS} method looks like it would work, but is pretty time-intensive to implement. Before going too much further down that road, the community (at least those familiar with the build system - @jsquyres @bwbarrett @ggouaillardet, maybe others) should weigh in so Austen doesn't waste effort here if there is another way. The only other way I thought of was to inject a script at the end of configure that would modify the generated Makefile (or at the end of autogen.pl if we can find a way) to strip out the CFLAGS for the orted_submit.c compilation step. It's pretty hacky, but should do the trick. @gpaulsen if we don't reach a decision beforehand, can you add this to next week's agenda? |
It looks like we did this differently in https://github.com/open-mpi/ompi/blob/master/ompi/debuggers/Makefile.am -- we overrode
Notice how it still has We can override This is not the case for It may well be necessary to make |
That would work to limit the scope somewhat, but all files beneath orte/orted would also be affected (there's a pmix dir there), right? The global CFLAGS hammer is...burdensome to work around. If we were to remove it, the added flexibility it would give developers/users to mix and match flags between components would be a nice bonus as well. |
The end goal here is to compile the one file that is necessary without the regular This is unfortunately just how Automake rolls... |
Well, actually, it just occurs to me that there could be another method that could work, but it may be a bit crazy / not worth it. You could just override the rule for I say that this is a little crazy because, honestly, this scheme is a little dicey: if Libtool ever changes the rules that they generate, we'll be out of sync with them, and that could be problematic. |
True, this is only for one file. But the way we're doing it now is not recommended by the automake docs as I read them, i.e. using CFLAGS as a global entity 'that everyone shall have'. And while right now we are running into this issue for MPIR, which admittedly is going away, I'm sure we will eventually run into it again. When that happens, we'll have to reshuffle things or tinker with makefiles instead of adjusting one flag for that specific library.
I tried that approach without success. I gave it its own Makefile.am, but the resetting of CFLAGS still wound up in the generated orte/Makefile. I am probably missing something here and will tinker with it. I agree in theory that this will work as a fix for this file if I can get it working, but it feels like kicking the can down the road until we have to go through this again. |
In optimized builds, CFLAGS contains various optimizations such as -O3, and is propagated by automake to all files. To work around this, isolate MPIR_Breakpoint() and other MPIR_* symbols into its own library built with debugger-specific CFLAGS. To prevent CFLAGS from being polluted elsewhere in the make tree, build this in its own tiny stand-alone makefile. Fixes open-mpi#7757
In optimized builds, CFLAGS contains various optimizations such as -O3, and is propagated by automake to all files. To work around this, isolate MPIR_Breakpoint() and other MPIR_* symbols into its own library built with debugger-specific CFLAGS. To prevent CFLAGS from being polluted elsewhere in the make tree, build this in its own tiny stand-alone makefile. Fixes open-mpi#7757 Signed-off-by: Austen Lauria <[email protected]>
In optimized builds, CFLAGS contains various optimizations such as -O3, and is propagated by automake to all files. To work around this, isolate MPIR_Breakpoint() and other MPIR_* symbols into its own library built with debugger-specific CFLAGS. To prevent CFLAGS from being polluted elsewhere in the make tree, build this in its own tiny stand-alone makefile. Fixes open-mpi#7757 Signed-off-by: Austen Lauria <[email protected]> (cherry picked from commit 6d82003)
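For readers skimming the thread, the approach described in the commit message might look roughly like the fragment below. This is only a sketch under assumptions, not the merged change: the library and source names (libdemo_mpir.la, demo_mpir.c) are hypothetical, while CFLAGS_WITHOUT_OPTFLAGS and DEBUGGER_CFLAGS are the variables already used in the diff earlier in this thread.

```makefile
# Sketch only: a tiny stand-alone Makefile.am (hypothetical directory and
# file names) that builds just the debugger-visible MPIR_* symbols.

# Overriding CFLAGS affects every rule generated from this Makefile.am,
# which is why the file is kept separate instead of being included into a
# larger one. Automake generally treats CFLAGS as a user variable, so
# overriding it is a deliberate exception here.
CFLAGS = $(CFLAGS_WITHOUT_OPTFLAGS) $(DEBUGGER_CFLAGS)

# Hypothetical convenience library holding only the breakpoint code.
noinst_LTLIBRARIES = libdemo_mpir.la
libdemo_mpir_la_SOURCES = demo_mpir.c
```

In a layout like this, the parent Makefile.am would presumably descend into the directory via SUBDIRS (rather than include the fragment) and pull the result in through a LIBADD line, and configure would need to generate the extra Makefile. Whether the real fix uses a convenience library or an installed one is not shown here.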
@louisespellacy-arm See #8428 for the v4.1.x fix, if you have time to verify. Thanks for your patience! |
Built from the linked pull request with PGI 20.1. Breaking at MPIR_Breakpoint seems to give the desired output, at the location in the new file. |
This should be resolved in v4.0.6 by #8422. |
Excellent thanks for all the help! |
It's my understanding that MPIR is not on master - the standard is shifting to PMIx-based standards: https://github.com/openpmix/mpir-to-pmix-guide That said, there may be some adjustments needed in PMIx to make sure the same thing doesn't happen with the new implementation. |
Umm... right. Duh. Got it. |
No memory mapping is done with PMIx tools, so we shouldn't have to worry about this particular problem 😄 |
Are changes still being made to the v3.1.x branch? PGI/NVHPC is still shipping a pre-built openmpi-3.1.x and openmpi-4.0.x - would it be possible to apply the pull request to v3.1.x also? |
It's my understanding that v3.1 is closed, or very limited. @jsquyres @bwbarrett would you consider taking it, if only for the nightly builds? |
We talked about this today (i.e., merging to v3.1). @awlauria is going to make a v3.1.x PR. If it's on the same order of magnitude as the master / v4.0.x / v4.1.x PRs (i.e., self-contained and low risk), we're open to merging it on v3.1.x. We will almost certainly not do a new release, though -- but a snapshot build tarball from the v3.1.x branch will be available within 24 hours of merging (see https://www.open-mpi.org/nightly/v3.1.x/). |
In optimized builds, CFLAGS contains various optimizations such as -O3, and is propagated by automake to all files. To work around this, isolate MPIR_Breakpoint() and other MPIR_* symbols into its own library built with debugger-specific CFLAGS. To prevent CFLAGS from being polluted elsewhere in the make tree, build this in its own tiny stand-alone makefile. Fixes open-mpi#7757 Signed-off-by: Austen Lauria <[email protected]> (cherry picked from commit 6d82003)
Closing this, as all PRs have been merged. @louisespellacy-arm if you encounter any other issues, feel free to re-open or create a new issue. Thanks! |
Thanks! Useful so we can indicate to customers to pull latest if they encounter issues! |
Is there a reason why the new library was not versioned? |
@awlauria Thanks. I had written something similar but I'll pick this patch instead for the openmpi4 4.1.1 package on SUSE |
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v3.1.6 and v4.0.3
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
from source tarball with PGI 19.5 and 20.1
Please describe the system on which you are running
Details of the problem
GDB unable to stop at MPIR_Breakpoint when debugging mpirun process with the MPIR interface with out-of-the-box OpenMPI installations. MPIR_Breakpoint is being optimized out.
#5501 relates to this issue and the reproducer can be re-used. This issue is not occurring with PGI 18.7.
However, providing CFLAGS=-O1, FCFLAGS=-O1 and CXXFLAGS=-O1 when building OpenMPI allows GDB to stop at MPIR_Breakpoint. It does not work if you use -O2.
Upon looking at the building of orted/orted_submit.c, it is being built with CFLAGS when it shouldn't be. The compile and link lines being generated by automake in orte/Makefile are:
In orte/orted/Makefile.am, the CFLAGS are defined to remove the optimizations, but $(CFLAGS) is being added to the compile line, adding the optimizations anyway.
By manually removing CFLAGS from the compile and link line for orted_submit.c, there would be no optimization when building orted/orted_submit.c, as described in the comments.
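To illustrate the behavior described above, here is a minimal sketch assuming standard Automake flag ordering. The target and source names (libdemo_mpir.la, demo_submit.c) are hypothetical and the rule in the comment is simplified; CFLAGS_WITHOUT_OPTFLAGS and DEBUGGER_CFLAGS are the variables from orte/orted/Makefile.am discussed in this thread.

```makefile
# Sketch only, hypothetical names.
noinst_LTLIBRARIES = libdemo_mpir.la
libdemo_mpir_la_SOURCES = demo_submit.c

# Per-target flags take the place of AM_CFLAGS for this one target...
libdemo_mpir_la_CFLAGS = $(CFLAGS_WITHOUT_OPTFLAGS) $(DEBUGGER_CFLAGS)

# ...but the compile rule Automake generates still appends the user-level
# CFLAGS after them, roughly:
#
#   $(CC) ... $(libdemo_mpir_la_CFLAGS) $(CFLAGS) -c demo_submit.c
#
# If CFLAGS contains -O3, it lands last on the command line, and with
# most compilers the last optimization level given is the one used.
```

This is why setting only the per-target variable is not enough to build one file unoptimized: the optimization flags have to be kept out of CFLAGS itself for that Makefile, or the file has to be built from a Makefile where CFLAGS is set differently.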