Skip to content

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile #86603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jasonmolenda
Copy link
Collaborator

Fixing a crash in lldb when symbols.auto-download setting is enabled. When doing a backtrace, this feature has lldb search for a SymbolFile for stack frames when we are backtracing, and add them either synchoronously or asynchronously, depending on the specific setting used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once we find a new SymbolFile. We may be adding a source of unwind information that we did not have when lldb was working only with the executable binary.

What happens in practice is that we're using a reference to the Module's UnwindTable, and then the other thread getting the SymbolFile clears it and now the first thread is referring to freed memory and we can crash. When built with address sanitizer, it crashes much more reliably.

Given that unwind information used for exception handling -- eh_frame, compact unwind -- is present in executable binaries, the only thing we're likely to add would be DWARF's debug_frame if that was also available. The actual value of re-creating the UnwindTable when we have added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr to the UnwindTable, so we could have two different UnwindTable's in use simultaneously for a brief period. This would be fine TODAY, but it introduces a very subtle bug that someone will have a heck of a time figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible marginal gain of reconstructing the UnwindTable once a SymbolFile has been added, to sidestep this whole problem area.

Also, in Module::GetUnwindTable(), call DownloadSymbolFileAsync before we create the UnwindTable for the first time, in case the symbol file is fetched synchronously, we will have it for that possible marginal gain.

Fixing a crash in lldb when `symbols.auto-download` setting is
enabled.  When doing a backtrace, this feature has lldb search
for a SymbolFile for stack frames when we are backtracing, and
add them either synchoronously or asynchronously, depending
on the specific setting used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once
we find a new SymbolFile.  We may be adding a source of unwind
information that we did not have when lldb was working only with
the executable binary.

What happens in practice is that we're using a reference to the
Module's UnwindTable, and then the other thread getting the SymbolFile
clears it and now the first thread is referring to freed memory
and we can crash.  When built with address sanitizer, it crashes
much more reliably.

Given that unwind information used for exception handling -- eh_frame,
compact unwind -- is present in executable binaries, the only thing
we're likely to *add* would be DWARF's `debug_frame` if that was
also available.  The actual value of re-creating the UnwindTable
when we have added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr
to the UnwindTable, so we could have two different UnwindTable's
in use simultaneously for a brief period.  This would be fine TODAY,
but it introduces a very subtle bug that someone will have a heck
of a time figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible
marginal gain of reconstructing the UnwindTable once a SymbolFile has
been added, to sidestep this whole problem area.
@llvmbot
Copy link
Member

llvmbot commented Mar 25, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

Fixing a crash in lldb when symbols.auto-download setting is enabled. When doing a backtrace, this feature has lldb search for a SymbolFile for stack frames when we are backtracing, and add them either synchoronously or asynchronously, depending on the specific setting used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once we find a new SymbolFile. We may be adding a source of unwind information that we did not have when lldb was working only with the executable binary.

What happens in practice is that we're using a reference to the Module's UnwindTable, and then the other thread getting the SymbolFile clears it and now the first thread is referring to freed memory and we can crash. When built with address sanitizer, it crashes much more reliably.

Given that unwind information used for exception handling -- eh_frame, compact unwind -- is present in executable binaries, the only thing we're likely to add would be DWARF's debug_frame if that was also available. The actual value of re-creating the UnwindTable when we have added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr to the UnwindTable, so we could have two different UnwindTable's in use simultaneously for a brief period. This would be fine TODAY, but it introduces a very subtle bug that someone will have a heck of a time figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible marginal gain of reconstructing the UnwindTable once a SymbolFile has been added, to sidestep this whole problem area.

Also, in Module::GetUnwindTable(), call DownloadSymbolFileAsync before we create the UnwindTable for the first time, in case the symbol file is fetched synchronously, we will have it for that possible marginal gain.


Full diff: https://github.com/llvm/llvm-project/pull/86603.diff

1 Files Affected:

  • (modified) lldb/source/Core/Module.cpp (+1-6)
diff --git a/lldb/source/Core/Module.cpp b/lldb/source/Core/Module.cpp
index 8ffa35518b3c36..a520523a96521a 100644
--- a/lldb/source/Core/Module.cpp
+++ b/lldb/source/Core/Module.cpp
@@ -1239,9 +1239,9 @@ void Module::SectionFileAddressesChanged() {
 
 UnwindTable &Module::GetUnwindTable() {
   if (!m_unwind_table) {
-    m_unwind_table.emplace(*this);
     if (!m_symfile_spec)
       SymbolLocator::DownloadSymbolFileAsync(GetUUID());
+    m_unwind_table.emplace(*this);
   }
   return *m_unwind_table;
 }
@@ -1359,15 +1359,10 @@ void Module::SetSymbolFileFileSpec(const FileSpec &file) {
         // one
         obj_file->ClearSymtab();
 
-        // Clear the unwind table too, as that may also be affected by the
-        // symbol file information.
-        m_unwind_table.reset();
-
         // The symbol file might be a directory bundle ("/tmp/a.out.dSYM")
         // instead of a full path to the symbol file within the bundle
         // ("/tmp/a.out.dSYM/Contents/Resources/DWARF/a.out"). So we need to
         // check this
-
         if (FileSystem::Instance().IsDirectory(file)) {
           std::string new_path(file.GetPath());
           std::string old_path(obj_file->GetFileSpec().GetPath());

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

@jasonmolenda jasonmolenda merged commit 2f63718 into llvm:main Mar 26, 2024
@jasonmolenda jasonmolenda deleted the dont-reset-unwindtable-on-adding-symbol-file branch March 26, 2024 16:07
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 26, 2024
…lvm#86603)

Fixing a crash in lldb when `symbols.auto-download` setting is enabled.
When doing a backtrace, this feature has lldb search for a SymbolFile
for stack frames when we are backtracing, and add them either
synchoronously or asynchronously, depending on the specific setting
used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once we
find a new SymbolFile. We may be adding a source of unwind information
that we did not have when lldb was working only with the executable
binary.

What happens in practice is that we're using a reference to the Module's
UnwindTable, and then the other thread getting the SymbolFile clears it
and now the first thread is referring to freed memory and we can crash.
When built with address sanitizer, it crashes much more reliably.

Given that unwind information used for exception handling -- eh_frame,
compact unwind -- is present in executable binaries, the only thing
we're likely to *add* would be DWARF's `debug_frame` if that was also
available. The actual value of re-creating the UnwindTable when we have
added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr to
the UnwindTable, so we could have two different UnwindTable's in use
simultaneously for a brief period. This would be fine TODAY, but it
introduces a very subtle bug that someone will have a heck of a time
figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible
marginal gain of reconstructing the UnwindTable once a SymbolFile has
been added, to sidestep this whole problem area.

Also, in `Module::GetUnwindTable()`, call `DownloadSymbolFileAsync`
before we create the UnwindTable for the first time, in case the symbol
file is fetched synchronously, we will have it for that possible
marginal gain.

(cherry picked from commit 2f63718)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 26, 2024
…lvm#86603)

Fixing a crash in lldb when `symbols.auto-download` setting is enabled.
When doing a backtrace, this feature has lldb search for a SymbolFile
for stack frames when we are backtracing, and add them either
synchoronously or asynchronously, depending on the specific setting
used.

Module::SetSymbolFileFileSpec clears the Module's UnwindTable, once we
find a new SymbolFile. We may be adding a source of unwind information
that we did not have when lldb was working only with the executable
binary.

What happens in practice is that we're using a reference to the Module's
UnwindTable, and then the other thread getting the SymbolFile clears it
and now the first thread is referring to freed memory and we can crash.
When built with address sanitizer, it crashes much more reliably.

Given that unwind information used for exception handling -- eh_frame,
compact unwind -- is present in executable binaries, the only thing
we're likely to *add* would be DWARF's `debug_frame` if that was also
available. The actual value of re-creating the UnwindTable when we have
added a SymbolFile is not large.

I also tried fixing this by changing the Module to have a shared_ptr to
the UnwindTable, so we could have two different UnwindTable's in use
simultaneously for a brief period. This would be fine TODAY, but it
introduces a very subtle bug that someone will have a heck of a time
figuring out in the future.

In the end, I believe the safest approach is to sacrifice the possible
marginal gain of reconstructing the UnwindTable once a SymbolFile has
been added, to sidestep this whole problem area.

Also, in `Module::GetUnwindTable()`, call `DownloadSymbolFileAsync`
before we create the UnwindTable for the first time, in case the symbol
file is fetched synchronously, we will have it for that possible
marginal gain.

(cherry picked from commit 2f63718)
jasonmolenda added a commit that referenced this pull request Mar 26, 2024
In

commit 2f63718
Author: Jason Molenda <[email protected]>
Date:   Tue Mar 26 09:07:15 2024 -0700

    [lldb] Don't clear a Module's UnwindTable when adding a SymbolFile (#86603)

I changed lldb to not clear a Module's UnwindTable when we add a
SymbolFile to a binary, because the added benefit is marginal, and
handling this reconstruction correctly is difficult.  This test was
written to explicitly create a test without unwind info in the
binary, then add a symbol file with the unwind info, and check that
it is present.  I've intentionally broken this, so I'm removing the
test.
jasonmolenda added a commit to swiftlang/llvm-project that referenced this pull request Mar 27, 2024
…when-adding-symbol-file-20230725

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile (llvm#86603)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 27, 2024
In
  commit 2f63718
  Author: Jason Molenda <[email protected]>
  Date:   Tue Mar 26 09:07:15 2024 -0700

    [lldb] Don't clear a Module's UnwindTable when adding a SymbolFile (llvm#86603)

I stopped clearing a Module's UnwindTable when we add a SymbolFile
to avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread.  This broke the target-symbols-add-unwind.test shell test
on Linux which removes the DWARF debub_frame section from a binary,
loads it, then loads the unstripped binary with the DWARF debug_frame
section and checks that the UnwindPlans for a function include
debug_frame.

I originally decided that I was willing to sacrifice the possiblity
of additional unwind sources from a symbol file because we rely on
assembly emulation so heavily, they're rarely critical.  But there
are targets where we we don't have emluation and rely on things
like DWARF debug_frame a lot more, so this probably wasn't a good
choice.

This patch adds a new UnwindTable::Update method which looks for any
new sources of unwind information and adds it to the UnwindTable,
and calls that after a new SymbolFile has been added to a Module.
jasonmolenda added a commit that referenced this pull request Mar 27, 2024
In
     commit 2f63718
     Author: Jason Molenda <[email protected]>
     Date:   Tue Mar 26 09:07:15 2024 -0700

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile
(#86603)

I stopped clearing a Module's UnwindTable when we add a SymbolFile to
avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread. This broke the target-symbols-add-unwind.test shell test on
Linux which removes the DWARF debub_frame section from a binary, loads
it, then loads the unstripped binary with the DWARF debug_frame section
and checks that the UnwindPlans for a function include debug_frame.

I originally decided that I was willing to sacrifice the possiblity of
additional unwind sources from a symbol file because we rely on assembly
emulation so heavily, they're rarely critical. But there are targets
where we we don't have emluation and rely on things like DWARF
debug_frame a lot more, so this probably wasn't a good choice.

This patch adds a new UnwindTable::Update method which looks for any new
sources of unwind information and adds it to the UnwindTable, and calls
that after a new SymbolFile has been added to a Module.
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 27, 2024
In
     commit 2f63718
     Author: Jason Molenda <[email protected]>
     Date:   Tue Mar 26 09:07:15 2024 -0700

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile
(llvm#86603)

I stopped clearing a Module's UnwindTable when we add a SymbolFile to
avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread. This broke the target-symbols-add-unwind.test shell test on
Linux which removes the DWARF debub_frame section from a binary, loads
it, then loads the unstripped binary with the DWARF debug_frame section
and checks that the UnwindPlans for a function include debug_frame.

I originally decided that I was willing to sacrifice the possiblity of
additional unwind sources from a symbol file because we rely on assembly
emulation so heavily, they're rarely critical. But there are targets
where we we don't have emluation and rely on things like DWARF
debug_frame a lot more, so this probably wasn't a good choice.

This patch adds a new UnwindTable::Update method which looks for any new
sources of unwind information and adds it to the UnwindTable, and calls
that after a new SymbolFile has been added to a Module.

(cherry picked from commit 6a0ec8e)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 27, 2024
In
     commit 2f63718
     Author: Jason Molenda <[email protected]>
     Date:   Tue Mar 26 09:07:15 2024 -0700

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile
(llvm#86603)

I stopped clearing a Module's UnwindTable when we add a SymbolFile to
avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread. This broke the target-symbols-add-unwind.test shell test on
Linux which removes the DWARF debub_frame section from a binary, loads
it, then loads the unstripped binary with the DWARF debug_frame section
and checks that the UnwindPlans for a function include debug_frame.

I originally decided that I was willing to sacrifice the possiblity of
additional unwind sources from a symbol file because we rely on assembly
emulation so heavily, they're rarely critical. But there are targets
where we we don't have emluation and rely on things like DWARF
debug_frame a lot more, so this probably wasn't a good choice.

This patch adds a new UnwindTable::Update method which looks for any new
sources of unwind information and adds it to the UnwindTable, and calls
that after a new SymbolFile has been added to a Module.

(cherry picked from commit 6a0ec8e)
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Mar 28, 2024
…when-adding-symbol-file-6.0

[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile (llvm#86603)
labath added a commit to labath/llvm-project that referenced this pull request Feb 5, 2025
PR llvm#86603 broke unwinding in for unwind info added via "target symbols
add". llvm#86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb percieves as function
boundaries.

A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
  boundaries can change, we cannot trust anything we've computed
  previously.
- Add a flag to "image show-unwind" to display the cached unwind
  information (mainly for the use in the test, but I think it's also
  generally useful).
- Rewrite the test to better and more reliably simulate the real-world
  scenario: I've swapped the running process for a core (minidump) file
  so it can run anywhere; used the caching version of the show-unwind
  command; and swapped C for assembly to better control the placement of
  symbols
labath added a commit that referenced this pull request Feb 7, 2025
PR #86603 broke unwinding in for unwind info added via "target symbols
add". #86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb percieves as function
boundaries.

A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
boundaries can change, we cannot trust anything we've computed
previously.
- Add a flag to "image show-unwind" to display the cached unwind
information (mainly for the use in the test, but I think it's also
generally useful).
- Rewrite the test to better and more reliably simulate the real-world
scenario: I've swapped the running process for a core (minidump) file so
it can run anywhere; used the caching version of the show-unwind
command; and swapped C for assembly to better control the placement of
symbols
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
PR llvm#86603 broke unwinding in for unwind info added via "target symbols
add". llvm#86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb percieves as function
boundaries.

A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.

target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).

The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
boundaries can change, we cannot trust anything we've computed
previously.
- Add a flag to "image show-unwind" to display the cached unwind
information (mainly for the use in the test, but I think it's also
generally useful).
- Rewrite the test to better and more reliably simulate the real-world
scenario: I've swapped the running process for a core (minidump) file so
it can run anywhere; used the caching version of the show-unwind
command; and swapped C for assembly to better control the placement of
symbols
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants