Skip to content

Conversation

bulbazord
Copy link
Member

Instead of reporting the error directly through the DWARFContext passed in as an argument, it would be more flexible to have extract return the error and allow the caller to react appropriately.

This will be useful for using llvm's DWARFHeaderUnit from lldb which may report header extraction errors through a different mechanism.

@llvmbot
Copy link
Member

llvmbot commented Oct 4, 2023

@llvm/pr-subscribers-debuginfo

Changes

Instead of reporting the error directly through the DWARFContext passed in as an argument, it would be more flexible to have extract return the error and allow the caller to react appropriately.

This will be useful for using llvm's DWARFHeaderUnit from lldb which may report header extraction errors through a different mechanism.


Full diff: https://github.com/llvm/llvm-project/pull/68242.diff

4 Files Affected:

  • (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h (+2-2)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFContext.cpp (+10-4)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp (+34-45)
  • (modified) llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp (+5-1)
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
index 3c0770787463e6c..7084081ce61a43a 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
@@ -79,8 +79,8 @@ class DWARFUnitHeader {
   /// Note that \p SectionKind is used as a hint to guess the unit type
   /// for DWARF formats prior to DWARFv5. In DWARFv5 the unit type is
   /// explicitly defined in the header and the hint is ignored.
-  bool extract(DWARFContext &Context, const DWARFDataExtractor &debug_info,
-               uint64_t *offset_ptr, DWARFSectionKind SectionKind);
+  Error extract(DWARFContext &Context, const DWARFDataExtractor &debug_info,
+                uint64_t *offset_ptr, DWARFSectionKind SectionKind);
   // For units in DWARF Package File, remember the index entry and update
   // the abbreviation offset read by extract().
   bool applyIndexEntry(const DWARFUnitIndex::Entry *Entry);
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
index 1e1ab814673f423..b6bca561c66da5c 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
@@ -89,9 +89,12 @@ void fixupIndexV4(DWARFContext &C, DWARFUnitIndex &Index) {
     DWARFDataExtractor Data(DObj, S, C.isLittleEndian(), 0);
     while (Data.isValidOffset(Offset)) {
       DWARFUnitHeader Header;
-      if (!Header.extract(C, Data, &Offset, DWARFSectionKind::DW_SECT_INFO)) {
+      if (Error ExtractionErr = Header.extract(
+              C, Data, &Offset, DWARFSectionKind::DW_SECT_INFO)) {
         logAllUnhandledErrors(
-            createError("Failed to parse CU header in DWP file"), errs());
+            joinErrors(createError("Failed to parse CU header in DWP file"),
+                       std::move(ExtractionErr)),
+            errs());
         Map.clear();
         break;
       }
@@ -149,9 +152,12 @@ void fixupIndexV5(DWARFContext &C, DWARFUnitIndex &Index) {
     uint64_t Offset = 0;
     while (Data.isValidOffset(Offset)) {
       DWARFUnitHeader Header;
-      if (!Header.extract(C, Data, &Offset, DWARFSectionKind::DW_SECT_INFO)) {
+      if (Error ExtractionErr = Header.extract(
+              C, Data, &Offset, DWARFSectionKind::DW_SECT_INFO)) {
         logAllUnhandledErrors(
-            createError("Failed to parse unit header in DWP file"), errs());
+            joinErrors(createError("Failed to parse CU header in DWP file"),
+                       std::move(ExtractionErr)),
+            errs());
         break;
       }
       bool CU = Header.getUnitType() == DW_UT_split_compile;
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
index 387345a4ac2d601..a45e069e63f9a83 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
@@ -81,8 +81,11 @@ void DWARFUnitVector::addUnitsImpl(
       if (!Data.isValidOffset(Offset))
         return nullptr;
       DWARFUnitHeader Header;
-      if (!Header.extract(Context, Data, &Offset, SectionKind))
+      if (Error ExtractErr =
+              Header.extract(Context, Data, &Offset, SectionKind)) {
+        Context.getWarningHandler()(std::move(ExtractErr));
         return nullptr;
+      }
       if (!IndexEntry && IsDWO) {
         const DWARFUnitIndex &Index = getDWARFUnitIndex(
             Context, Header.isTypeUnit() ? DW_SECT_EXT_TYPES : DW_SECT_INFO);
@@ -244,10 +247,10 @@ Expected<uint64_t> DWARFUnit::getStringOffsetSectionItem(uint32_t Index) const {
   return DA.getRelocatedValue(ItemSize, &Offset);
 }
 
-bool DWARFUnitHeader::extract(DWARFContext &Context,
-                              const DWARFDataExtractor &debug_info,
-                              uint64_t *offset_ptr,
-                              DWARFSectionKind SectionKind) {
+Error DWARFUnitHeader::extract(DWARFContext &Context,
+                               const DWARFDataExtractor &debug_info,
+                               uint64_t *offset_ptr,
+                               DWARFSectionKind SectionKind) {
   Offset = *offset_ptr;
   Error Err = Error::success();
   IndexEntry = nullptr;
@@ -277,72 +280,58 @@ bool DWARFUnitHeader::extract(DWARFContext &Context,
   } else if (UnitType == DW_UT_split_compile || UnitType == DW_UT_skeleton)
     DWOId = debug_info.getU64(offset_ptr, &Err);
 
-  if (Err) {
-    Context.getWarningHandler()(joinErrors(
+  if (Err)
+    return joinErrors(
         createStringError(
             errc::invalid_argument,
             "DWARF unit at 0x%8.8" PRIx64 " cannot be parsed:", Offset),
-        std::move(Err)));
-    return false;
-  }
+        std::move(Err));
 
   // Header fields all parsed, capture the size of this unit header.
   assert(*offset_ptr - Offset <= 255 && "unexpected header size");
   Size = uint8_t(*offset_ptr - Offset);
   uint64_t NextCUOffset = Offset + getUnitLengthFieldByteSize() + getLength();
 
-  if (!debug_info.isValidOffset(getNextUnitOffset() - 1)) {
-    Context.getWarningHandler()(
-        createStringError(errc::invalid_argument,
-                          "DWARF unit from offset 0x%8.8" PRIx64 " incl. "
-                          "to offset  0x%8.8" PRIx64 " excl. "
-                          "extends past section size 0x%8.8zx",
-                          Offset, NextCUOffset, debug_info.size()));
-    return false;
-  }
+  if (!debug_info.isValidOffset(getNextUnitOffset() - 1))
+    return createStringError(errc::invalid_argument,
+                             "DWARF unit from offset 0x%8.8" PRIx64 " incl. "
+                             "to offset  0x%8.8" PRIx64 " excl. "
+                             "extends past section size 0x%8.8zx",
+                             Offset, NextCUOffset, debug_info.size());
 
-  if (!DWARFContext::isSupportedVersion(getVersion())) {
-    Context.getWarningHandler()(createStringError(
+  if (!DWARFContext::isSupportedVersion(getVersion()))
+    return createStringError(
         errc::invalid_argument,
         "DWARF unit at offset 0x%8.8" PRIx64 " "
         "has unsupported version %" PRIu16 ", supported are 2-%u",
-        Offset, getVersion(), DWARFContext::getMaxSupportedVersion()));
-    return false;
-  }
+        Offset, getVersion(), DWARFContext::getMaxSupportedVersion());
 
   // Type offset is unit-relative; should be after the header and before
   // the end of the current unit.
-  if (isTypeUnit() && TypeOffset < Size) {
-    Context.getWarningHandler()(
-        createStringError(errc::invalid_argument,
-                          "DWARF type unit at offset "
-                          "0x%8.8" PRIx64 " "
-                          "has its relocated type_offset 0x%8.8" PRIx64 " "
-                          "pointing inside the header",
-                          Offset, Offset + TypeOffset));
-    return false;
-  }
-  if (isTypeUnit() &&
-      TypeOffset >= getUnitLengthFieldByteSize() + getLength()) {
-    Context.getWarningHandler()(createStringError(
+  if (isTypeUnit() && TypeOffset < Size)
+    return createStringError(errc::invalid_argument,
+                             "DWARF type unit at offset "
+                             "0x%8.8" PRIx64 " "
+                             "has its relocated type_offset 0x%8.8" PRIx64 " "
+                             "pointing inside the header",
+                             Offset, Offset + TypeOffset);
+
+  if (isTypeUnit() && TypeOffset >= getUnitLengthFieldByteSize() + getLength())
+    return createStringError(
         errc::invalid_argument,
         "DWARF type unit from offset 0x%8.8" PRIx64 " incl. "
         "to offset 0x%8.8" PRIx64 " excl. has its "
         "relocated type_offset 0x%8.8" PRIx64 " pointing past the unit end",
-        Offset, NextCUOffset, Offset + TypeOffset));
-    return false;
-  }
+        Offset, NextCUOffset, Offset + TypeOffset);
 
   if (Error SizeErr = DWARFContext::checkAddressSizeSupported(
           getAddressByteSize(), errc::invalid_argument,
-          "DWARF unit at offset 0x%8.8" PRIx64, Offset)) {
-    Context.getWarningHandler()(std::move(SizeErr));
-    return false;
-  }
+          "DWARF unit at offset 0x%8.8" PRIx64, Offset))
+    return SizeErr;
 
   // Keep track of the highest DWARF version we encounter across all units.
   Context.setMaxVersionIfGreater(getVersion());
-  return true;
+  return Error::success();
 }
 
 bool DWARFUnitHeader::applyIndexEntry(const DWARFUnitIndex::Entry *Entry) {
diff --git a/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp b/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
index 2adc2403eaca9e6..9f4fe9c54a928fd 100644
--- a/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
+++ b/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
@@ -2170,7 +2170,11 @@ TEST(DWARFDebugInfo, TestDWARF64UnitLength) {
     DWARFDataExtractor Data(Obj, Sec, /* IsLittleEndian = */ true,
                             /* AddressSize = */ 4);
     uint64_t Offset = 0;
-    EXPECT_FALSE(Header.extract(*Context, Data, &Offset, DW_SECT_INFO));
+    ASSERT_THAT_ERROR(
+        Header.extract(*Context, Data, &Offset, DW_SECT_INFO),
+        FailedWithMessage(
+            "DWARF unit from offset 0x00000000 incl. to offset  "
+            "0x1122334455667794 excl. extends past section size 0x00000018"));
     // Header.extract() returns false because there is not enough space
     // in the section for the declared length. Anyway, we can check that
     // the properties are read correctly.

Comment on lines 92 to 97
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this improves the error handling by producing a single error with the outer/contextual message, then the underlying root cause? (rather than printing two separate errors - inner then outer) - could that behavior/improvement be tested?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the errors will be grouped together in this case. As you said, we would previously report the inner error through the DWARFContext and then the caller would report an error "Failed to parse CU header in DWP file". This does slightly change behavior such that "Failed to parse CU header in DWP file" will get reported first with the inner error message following afterwards.

I'll see if there are any existing tests that are suitable to extend or copy. If not, I may need help producing one. 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dwblaikie I hacked up DWARFContext::fixupIndex by adding an early return before any work is done and ran the llvm test suite. It doesn't look like any tests failed after that though. I'm not sure how this codepath is triggered. Do you have any suggestions here? I'd like to continue making progress improving the error handling in the LLVM DWARF Parser and refactoring LLDB to use it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, fixupIndex should be reachable (simplest way to find it would be to assert(false); in there and you'd find which tests reach the code) via tests using -manaully-generate-unit-index - though they might not be testing invalid headers, and such - and could be improved to do so, that'd involve some manually crafted DWARF (probably written in assembly and assembled with llvm-mc in the test) that has a corrupted header.

It's a bit of a chore to create a hand-crafted dwp file, though...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip! Added a test. 🙂

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't want to use joinErrors here. This will result in two separate errors that are reported individually, rather than one error with an outer and an inner context. A better way would be to create a new Error that concatenates the necessary messages. We use this trick in llvm-readobj, for example. See

return createError("cannot read content of " + describe(Sec) + ": " +
.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer! I'll update this to use that trick.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice that now we can tell from the signature the meaning of the returned value (previously we couldn't and there was no documentation about it)

…it::extract

Instead of reporting the error directly through the DWARFContext passed
in as an argument, it would be more flexible to have extract return the
error and allow the caller to react appropriately.

This will be useful for using llvm's DWARFHeaderUnit from lldb which may
report header extraction errors through a different mechanism.
@bulbazord bulbazord force-pushed the header-separate-error-generation-from-reporting branch from 0484077 to 9ec5638 Compare October 17, 2023 00:02
# CHECK-NOT: .debug_info.dwo contents:

# CHECK-DAG: .debug_cu_index contents:
# CHECK: Failed to parse CU header in DWP file: DWARF unit at offset 0x00000000 has unsupported version 6, supported are 2-5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably still include the error: prefix here to make it a bit more explicit how this is being printed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're not emitted with error: unfortunately. The raw output of llvm-dwarfdump is:

$PROJECTS/llvm-project/build/test/tools/llvm-dwp/X86/Output/cu_tu_units_manual_v5_invalid.s.tmp.dwp:     file format elf64-x86-64
warning: DWARF unit at offset 0x00000000 has unsupported version 6, supported are 2-5

.debug_cu_index contents:
Failed to parse CU header in DWP file: DWARF unit at offset 0x00000000 has unsupported version 6, supported are 2-5
version = 5, units = 1, slots = 2

Index Signature          INFO                                     ABBREV
----- ------------------ ---------------------------------------- ------------------------
    1 0x10001450c58e1dbe [0x0000000000000036, 0x000000000000004b) [0x00000000, 0x00000010)

.debug_tu_index contents:
Failed to parse CU header in DWP file: DWARF unit at offset 0x00000000 has unsupported version 6, supported are 2-5
version = 5, units = 2, slots = 4

Index Signature          INFO                                     ABBREV
----- ------------------ ---------------------------------------- ------------------------
    1 0x4e834ea939695c24 [0x0000000000000000, 0x000000000000001b) [0x00000000, 0x00000010)
    4 0x89a49a5d44b29ee7 [0x000000000000001b, 0x0000000000000036) [0x00000000, 0x00000010)

If you consider the output of the test I added but without my changes, you end up with:

.debug_cu_index contents:
warning: DWARF unit at offset 0x00000000 has unsupported version 6, supported are 2-5
Failed to parse unit header in DWP file

I think the lack of a warning in my test is definitely problematic, but that's because fixupIndexV{4,5} isn't reporting the error correctly. I'd like to address that in a follow-up.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is what you meant, but the "Failed to parse..." message probably should have an error: prefix, but that should be added by the error handler (using WithColor::error etc).

@bulbazord bulbazord merged commit 814a79a into llvm:main Oct 18, 2023
@bulbazord bulbazord deleted the header-separate-error-generation-from-reporting branch October 18, 2023 16:06
rastogishubham added a commit to rastogishubham/llvm-project that referenced this pull request Oct 25, 2023
PR llvm#68242 changed
DWARFUnitHeader::extract to return an Error, which had to be correctly
handled in MCCAS. This change fixes that issue.
rastogishubham added a commit to rastogishubham/llvm-project that referenced this pull request Nov 9, 2023
PR llvm#68242 changed
DWARFUnitHeader::extract to return an Error, which had to be correctly
handled in MCCAS. This change fixes that issue.

(cherry picked from commit 7773cc5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants