Skip to content

[DebugInfo] Place local ODR-uniqued types in decl DISubprograms #119001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Dec 6, 2024

This is a reapplication and fix for the code in #75385 , support for function-local types in debug-info. The first commit is a reapplication of that code (plus some rebasing-maintenance), the second commit shifts a bit of debug-info metadata in the type hierarchy to make it safer when being unique'd. The root problem in #75385 seems to be that multiple copies of DICompositeTypes in LTO contexts, where multiple definitions of functions appear in debug-info, get unique'd and break various links between types and subprograms. Second commit message follows:

~

There are two flavours of DISubprogram: declarations, which are unique'd
and abstractly describe the function in question. There are also definition
DISubprograms which correspond to real instances, some of which are
inlined, duplicated across translation units in LTO, or otherwise can have
multiple instances.

Given that LLVM sometimes force-uniques types by their ODR-name, see the
enableDebugTypeODRUniquing feature, we shouldn't place types that might be
unique'd into duplicated contexts like definition DISubprograms. Instead,
place them into the declaration.

This slightly bends the existing approach where only functions that have a
separate declaratrion to their definition get a declaration-DISubprogram. A
single function in a translation unit might now get a declaration where it
didn't before, if it contains an ODR-unique'd type declaration. This seems
reasonable given that the LLVM idea of a declaration doesn't have to
exactly match source-language ideas.

The added cpp test checks that such ORD-unique'd types are detected and
placed in the declaration DISubprogram, creating one if necessary. The IR
test ensures that nothing changes after a round-trip with
enableDebugTypeODRUniquing turned on. Changes to the codeview
test are because the order of metadata changes a little.

…lock scopes (4/7)

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and provided a support of function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Additionally, if DISubprogram is not cloned in CloneFunctionInto(),
cloning of its DILocalVariables is avoided.
Otherwise we get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

Reviewed By: jmmartinez
Authored-by: Kristina Bessonova <[email protected]>
Differential Revision: https://reviews.llvm.org/D144006

(cherry picked from commit 2953d4a)
Signed-off-by: Jeremy Morse <[email protected]>

Conflicts:
	llvm/lib/IR/DIBuilder.cpp
	llvm/lib/IR/DebugInfo.cpp
        llvm/lib/Transforms/Utils/CloneFunction.cpp
There are two flavours of DISubprogram: declarations, which are unique'd
and abstractly describe the function in question. There are also definition
DISubprograms which correspond to real instances, some of which are
inlined, duplicated across translation units in LTO, or otherwise can have
multiple instances.

Given that LLVM sometimes force-uniques types by their ODR-name, see the
enableDebugTypeODRUniquing feature, we shouldn't place types that might be
unique'd into duplicated contexts like definition DISubprograms. Instead,
place them into the declaration.

This slightly bends the existing approach where only functions that have a
separate declaratrion to their definition get a declaration-DISubprogram. A
single function in a translation unit might now get a declaration where it
didn't before, if it contains an ODR-unique'd type declaration. This seems
reasonable given that the LLVM idea of a declaration doesn't have to
exactly match source-language ideas.

The added cpp test checks that such ORD-unique'd types are detected and
placed in the declaration DISubprogram, creating one if necessary. The IR
test ensures that nothing changes after a round-trip with
enableDebugTypeODRUniquing turned on.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo llvm:ir llvm:transforms labels Dec 6, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 6, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Jeremy Morse (jmorse)

Changes

This is a reapplication and fix for the code in #75385 , support for function-local types in debug-info. The first commit is a reapplication of that code (plus some rebasing-maintenance), the second commit shifts a bit of debug-info metadata in the type hierarchy to make it safer when being unique'd. The root problem in #75385 seems to be that multiple copies of DICompositeTypes in LTO contexts, where multiple definitions of functions appear in debug-info, get unique'd and break various links between types and subprograms. Second commit message follows:

~

There are two flavours of DISubprogram: declarations, which are unique'd
and abstractly describe the function in question. There are also definition
DISubprograms which correspond to real instances, some of which are
inlined, duplicated across translation units in LTO, or otherwise can have
multiple instances.

Given that LLVM sometimes force-uniques types by their ODR-name, see the
enableDebugTypeODRUniquing feature, we shouldn't place types that might be
unique'd into duplicated contexts like definition DISubprograms. Instead,
place them into the declaration.

This slightly bends the existing approach where only functions that have a
separate declaratrion to their definition get a declaration-DISubprogram. A
single function in a translation unit might now get a declaration where it
didn't before, if it contains an ODR-unique'd type declaration. This seems
reasonable given that the LLVM idea of a declaration doesn't have to
exactly match source-language ideas.

The added cpp test checks that such ORD-unique'd types are detected and
placed in the declaration DISubprogram, creating one if necessary. The IR
test ensures that nothing changes after a round-trip with
enableDebugTypeODRUniquing turned on. Changes to the codeview
test are because the order of metadata changes a little.


Patch is 111.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119001.diff

35 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+55)
  • (modified) clang/lib/CodeGen/CGDebugInfo.h (+2)
  • (modified) clang/test/CodeGen/debug-info-codeview-unnamed.c (+8-8)
  • (modified) clang/test/CodeGen/debug-info-unused-types.c (+9-7)
  • (modified) clang/test/CodeGen/debug-info-unused-types.cpp (+8-6)
  • (modified) clang/test/CodeGenCXX/debug-info-access.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp (+6-6)
  • (modified) clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp (+36-27)
  • (modified) clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp (+2-2)
  • (added) clang/test/CodeGenCXX/debug-info-local-types.cpp (+79)
  • (modified) clang/test/CodeGenCXX/debug-lambda-this.cpp (+2-2)
  • (modified) llvm/include/llvm/IR/DIBuilder.h (+3-3)
  • (modified) llvm/include/llvm/IR/DebugInfo.h (+10-1)
  • (modified) llvm/lib/Bitcode/Reader/MetadataLoader.cpp (+78-33)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+40-20)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h (+8-8)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+10-3)
  • (modified) llvm/lib/IR/DIBuilder.cpp (+25-8)
  • (modified) llvm/lib/IR/DebugInfo.cpp (+10-4)
  • (modified) llvm/lib/IR/Verifier.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Utils/CloneFunction.cpp (+14-1)
  • (added) llvm/test/Bitcode/clone-local-types.ll (+46)
  • (modified) llvm/test/Bitcode/upgrade-cu-locals.ll (+41-27)
  • (modified) llvm/test/Bitcode/upgrade-cu-locals.ll.bc ()
  • (added) llvm/test/DebugInfo/Generic/inlined-local-type.ll (+128)
  • (added) llvm/test/DebugInfo/Generic/lexical-block-retained-types.ll (+55)
  • (added) llvm/test/DebugInfo/Generic/lexical-block-types.ll (+425)
  • (modified) llvm/test/DebugInfo/Generic/verifier-invalid-disubprogram.ll (+1-1)
  • (added) llvm/test/DebugInfo/X86/local-type-as-template-parameter.ll (+161)
  • (modified) llvm/test/DebugInfo/X86/set.ll (+2-2)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import.ll (-1)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import2.ll (-1)
  • (renamed) llvm/test/DebugInfo/X86/split-dwarf-local-import3.ll ()
  • (added) llvm/test/DebugInfo/local-odr-types-hiearchy.ll (+124)
  • (modified) llvm/unittests/Transforms/Utils/CloningTest.cpp (+105)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 60f32f76109e9a..e56aef662a571d 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1235,6 +1235,7 @@ CGDebugInfo::getOrCreateRecordFwdDecl(const RecordType *Ty,
   // Don't include a linkage name in line tables only.
   if (CGM.getCodeGenOpts().hasReducedDebugInfo())
     Identifier = getTypeIdentifier(Ty, CGM, TheCU);
+  Ctx = PickCompositeTypeScope(Ctx, Identifier);
   llvm::DICompositeType *RetTy = DBuilder.createReplaceableCompositeType(
       getTagForRecord(RD), RDName, Ctx, DefUnit, Line, 0, Size, Align, Flags,
       Identifier);
@@ -3534,6 +3535,7 @@ llvm::DIType *CGDebugInfo::CreateEnumType(const EnumType *Ty) {
     // FwdDecl with the second and then replace the second with
     // complete type.
     llvm::DIScope *EDContext = getDeclContextDescriptor(ED);
+    EDContext = PickCompositeTypeScope(EDContext, Identifier);
     llvm::DIFile *DefUnit = getOrCreateFile(ED->getLocation());
     llvm::TempDIScope TmpContext(DBuilder.createReplaceableCompositeType(
         llvm::dwarf::DW_TAG_enumeration_type, "", TheCU, DefUnit, 0));
@@ -3901,6 +3903,58 @@ CGDebugInfo::getOrCreateLimitedType(const RecordType *Ty) {
   return Res;
 }
 
+llvm::DIScope *
+CGDebugInfo::PickCompositeTypeScope(llvm::DIScope *S, StringRef Identifier) {
+  using llvm::DISubprogram;
+
+  // Only adjust the scope for composite types placed into functions.
+  if (!isa<DISubprogram>(S))
+    return S;
+
+  // We must adjust the scope if the ODR-name of the type is set.
+  if (Identifier.empty())
+    return S;
+
+  // This type has an ODR-name, and might be de-duplicated during LTO. It needs
+  // to be placed in the unique declaration of the function, not a (potentially
+  // duplicated) definition.
+  DISubprogram *SP = cast<DISubprogram>(S);
+  if (DISubprogram *Decl = SP->getDeclaration())
+    return Decl;
+
+  // There is no declaration -- we must produce one and retrofit it to the
+  // existing definition. Assume that we can just harvest the existing
+  // information, clear the definition flag and set as decl.
+  DISubprogram::DISPFlags SPFlags = SP->getSPFlags();
+  SPFlags &= ~DISubprogram::SPFlagDefinition;
+
+  llvm::DINode::DIFlags Flags = SP->getFlags();
+  Flags &= ~llvm::DINode::FlagAllCallsDescribed;
+
+#ifdef EXPENSIVE_CHECKS
+  // If we're looking to be really rigorous and avoid a hard-to-debug mishap,
+  // make sure that there aren't any function definitions in the scope chain.
+  llvm::DIScope *ToCheck = SP->getScope();
+  do {
+    // We should terminate at a DIFile rather than a DICompileUnit -- we're
+    // not fully unique across LTO otherwise.
+    assert(!isa<llvm::DICompileUnit>(ToCheck));
+    if (auto *DISP = dyn_cast<DISubprogram>(ToCheck))
+      assert(!(DISP->getSPFlags() & DISubprogram::SPFlagDefinition));
+    ToCheck = ToCheck->getScope();
+  } while (ToCheck);
+#endif
+
+  DISubprogram *DeclSP = DBuilder.createFunction(
+      SP->getScope(), SP->getName(), SP->getLinkageName(), SP->getFile(),
+      SP->getLine(), SP->getType(), SP->getScopeLine(),
+      Flags, SPFlags, SP->getTemplateParams(), nullptr, nullptr,
+      SP->getAnnotations());
+
+  SP->replaceDeclaration(DeclSP);
+  return DeclSP;
+}
+
 // TODO: Currently used for context chains when limiting debug info.
 llvm::DICompositeType *CGDebugInfo::CreateLimitedType(const RecordType *Ty) {
   RecordDecl *RD = Ty->getDecl();
@@ -3938,6 +3992,7 @@ llvm::DICompositeType *CGDebugInfo::CreateLimitedType(const RecordType *Ty) {
   auto Align = getTypeAlignIfRequired(Ty, CGM.getContext());
 
   SmallString<256> Identifier = getTypeIdentifier(Ty, CGM, TheCU);
+  RDContext = PickCompositeTypeScope(RDContext, Identifier);
 
   // Explicitly record the calling convention and export symbols for C++
   // records.
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 3fd0237a1c61dd..71c84c45986c78 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -486,6 +486,8 @@ class CGDebugInfo {
   void EmitFunctionDecl(GlobalDecl GD, SourceLocation Loc,
                         QualType FnType, llvm::Function *Fn = nullptr);
 
+  llvm::DIScope *PickCompositeTypeScope(llvm::DIScope *S, StringRef Identifier);
+
   /// Emit debug info for an extern function being called.
   /// This is needed for call site debug info.
   void EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
diff --git a/clang/test/CodeGen/debug-info-codeview-unnamed.c b/clang/test/CodeGen/debug-info-codeview-unnamed.c
index 0df6e1a0419bb3..7b88f53a92e421 100644
--- a/clang/test/CodeGen/debug-info-codeview-unnamed.c
+++ b/clang/test/CodeGen/debug-info-codeview-unnamed.c
@@ -8,23 +8,23 @@ int main(int argc, char* argv[], char* arge[]) {
   //
   struct { int bar; } one = {42};
   //
-  // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // LINUX-SAME:     type: [[TYPE_OF_ONE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_ONE]] = distinct !DICompositeType(
+  // LINUX:      [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType(
   // LINUX-SAME:     tag: DW_TAG_structure_type
   // LINUX-NOT:      name:
   // LINUX-NOT:      identifier:
   // LINUX-SAME:     )
+  // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
+  // LINUX-SAME:     type: [[TYPE_OF_ONE]]
+  // LINUX-SAME:     )
   //
-  // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // MSVC-SAME:      type: [[TYPE_OF_ONE:![0-9]+]]
-  // MSVC-SAME:      )
-  // MSVC:       [[TYPE_OF_ONE]] = distinct !DICompositeType
+  // MSVC:       [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_structure_type
   // MSVC-NOT:       name:
   // MSVC-NOT:       identifier:
   // MSVC-SAME:      )
+  // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
+  // MSVC-SAME:      type: [[TYPE_OF_ONE]]
+  // MSVC-SAME:      )
 
   return 0;
 }
diff --git a/clang/test/CodeGen/debug-info-unused-types.c b/clang/test/CodeGen/debug-info-unused-types.c
index 3e9f7b07658e36..31d608d92a06b4 100644
--- a/clang/test/CodeGen/debug-info-unused-types.c
+++ b/clang/test/CodeGen/debug-info-unused-types.c
@@ -18,13 +18,15 @@ void quux(void) {
 // CHECK: !DICompileUnit{{.+}}retainedTypes: [[RETTYPES:![0-9]+]]
 // CHECK: [[TYPE0:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "bar"
 // CHECK: [[TYPE1:![0-9]+]] = !DIEnumerator(name: "BAR"
-// CHECK: [[TYPE2:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
-// CHECK: [[TYPE3:![0-9]+]] = !DIEnumerator(name: "Z"
-// CHECK: [[RETTYPES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]], [[TYPE0]], [[TYPE6:![0-9]+]], {{![0-9]+}}, [[TYPE7:![0-9]+]], [[TYPE2]], [[TYPE8:![0-9]+]]}
-// CHECK: [[TYPE4]] = !DIDerivedType(tag: DW_TAG_typedef, name: "my_int"
-// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK: [[TYPE6]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "baz"
-// CHECK: [[TYPE7]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "y"
+// CHECK: [[RETTYPES]] = !{[[TYPE2:![0-9]+]], [[TYPE3:![0-9]+]], [[TYPE0]], [[TYPE4:![0-9]+]], {{![0-9]+}}}
+// CHECK: [[TYPE2]] = !DIDerivedType(tag: DW_TAG_typedef, name: "my_int"
+// CHECK: [[TYPE3]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK: [[TYPE4]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "baz"
+// CHECK: [[SP:![0-9]+]] = distinct !DISubprogram(name: "quux", {{.*}}, retainedNodes: [[SPRETNODES:![0-9]+]]
+// CHECK: [[SPRETNODES]] = !{[[TYPE5:![0-9]+]], [[TYPE6:![0-9]+]], [[TYPE8:![0-9]+]]}
+// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "y"
+// CHECK: [[TYPE6]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
+// CHECK: [[TYPE7:![0-9]+]] = !DIEnumerator(name: "Z"
 // CHECK: [[TYPE8]] = distinct !DICompositeType(tag: DW_TAG_union_type, name: "w"
 
 // Check that debug info is not emitted for the typedef, struct, enum, and
diff --git a/clang/test/CodeGen/debug-info-unused-types.cpp b/clang/test/CodeGen/debug-info-unused-types.cpp
index 023cac159faa4b..5b01c6dbb39414 100644
--- a/clang/test/CodeGen/debug-info-unused-types.cpp
+++ b/clang/test/CodeGen/debug-info-unused-types.cpp
@@ -13,12 +13,14 @@ void quux() {
 // CHECK: !DICompileUnit{{.+}}retainedTypes: [[RETTYPES:![0-9]+]]
 // CHECK: [[TYPE0:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "baz"
 // CHECK: [[TYPE1:![0-9]+]] = !DIEnumerator(name: "BAZ"
-// CHECK: [[TYPE2:![0-9]+]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z"
-// CHECK: [[TYPE3:![0-9]+]] = !DIEnumerator(name: "Z"
-// CHECK: [[RETTYPES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]], [[TYPE0]], {{![0-9]+}}, [[TYPE6:![0-9]+]], [[TYPE2]]}
-// CHECK: [[TYPE4]] = !DIDerivedType(tag: DW_TAG_typedef, name: "foo"
-// CHECK: [[TYPE5]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "bar"
-// CHECK: [[TYPE6]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "y"
+// CHECK: [[RETTYPES]] = !{[[TYPE2:![0-9]+]], [[TYPE3:![0-9]+]], [[TYPE0]], {{![0-9]+}}}
+// CHECK: [[TYPE2]] = !DIDerivedType(tag: DW_TAG_typedef, name: "foo"
+// CHECK: [[TYPE3]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "bar"
+// CHECK: [[SP:![0-9]+]] = distinct !DISubprogram(name: "quux", {{.*}}, retainedNodes: [[SPRETNODES:![0-9]+]]
+// CHECK: [[SPRETNODES]] = !{[[TYPE4:![0-9]+]], [[TYPE5:![0-9]+]]}
+// CHECK: [[TYPE4]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "y", scope: [[SP]]
+// CHECK: [[TYPE5]] = !DICompositeType(tag: DW_TAG_enumeration_type, name: "z", scope: [[SP]]
+// CHECK: [[TYPE6:![0-9]+]] = !DIEnumerator(name: "Z"
 
 // NODBG-NOT: !DI{{CompositeType|Enumerator|DerivedType}}
 
diff --git a/clang/test/CodeGenCXX/debug-info-access.cpp b/clang/test/CodeGenCXX/debug-info-access.cpp
index 9f2c044843d0f0..7c0bf8ccb03842 100644
--- a/clang/test/CodeGenCXX/debug-info-access.cpp
+++ b/clang/test/CodeGenCXX/debug-info-access.cpp
@@ -18,9 +18,9 @@ class B : public A {
   static int public_static;
 
 protected:
+  // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_using",{{.*}} line: [[@LINE+3]],{{.*}} flags: DIFlagProtected)
   // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_typedef",{{.*}} line: [[@LINE+1]],{{.*}} flags: DIFlagProtected)
   typedef int prot_typedef;
-  // CHECK-DAG: !DIDerivedType(tag: DW_TAG_typedef, name: "prot_using",{{.*}} line: [[@LINE+1]],{{.*}} flags: DIFlagProtected)
   using prot_using = prot_typedef;
   prot_using prot_member;
 
diff --git a/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp b/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
index 61b3c7c0526c8e..c91cf83c0405fe 100644
--- a/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
+++ b/clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp
@@ -51,13 +51,13 @@ void instantiate(int x) {
 // CHECK: !DIGlobalVariable(name: "b",{{.*}} file: [[FILE]], line: 6,{{.*}} isLocal: true, isDefinition: true
 // CHECK: !DIGlobalVariable(name: "result", {{.*}} isLocal: false, isDefinition: true
 // CHECK: !DIGlobalVariable(name: "value", {{.*}} isLocal: false, isDefinition: true
-// CHECK: !DILocalVariable(name: "i", {{.*}}, flags: DIFlagArtificial
-// CHECK: !DILocalVariable(name: "c", {{.*}}, flags: DIFlagArtificial
-// CHECK: !DILocalVariable(
-// CHECK-NOT: name:
-// CHECK: type: ![[UNION:[0-9]+]]
-// CHECK: ![[UNION]] = distinct !DICompositeType(tag: DW_TAG_union_type,
+// CHECK: ![[UNION:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_union_type,
 // CHECK-NOT: name:
 // CHECK: elements
 // CHECK: !DIDerivedType(tag: DW_TAG_member, name: "i", scope: ![[UNION]],
 // CHECK: !DIDerivedType(tag: DW_TAG_member, name: "c", scope: ![[UNION]],
+// CHECK: !DILocalVariable(name: "i", {{.*}}, flags: DIFlagArtificial
+// CHECK: !DILocalVariable(name: "c", {{.*}}, flags: DIFlagArtificial
+// CHECK: !DILocalVariable(
+// CHECK-NOT: name:
+// CHECK: type: ![[UNION]]
diff --git a/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp b/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
index 9fcb1c68d7efa7..06767370483790 100644
--- a/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
+++ b/clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp
@@ -3,6 +3,34 @@
 
 int main(int argc, char* argv[], char* arge[]) {
   //
+  // LINUX:      [[TYPE_OF_ONE:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+
+  //
+  // LINUX:      [[TYPE_OF_TWO:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+  //
+
+  //
+  // LINUX:      [[TYPE_OF_THREE:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_structure_type
+  // LINUX-SAME:     name: "named"
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+
+  //
+  // LINUX:      [[TYPE_OF_FOUR:![0-9]+]] = distinct !DICompositeType(
+  // LINUX-SAME:     tag: DW_TAG_class_type
+  // LINUX-NOT:      name:
+  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     )
+
   // In CodeView, the LF_MFUNCTION entry for "bar()" refers to the forward
   // reference of the unnamed struct. Visual Studio requires a unique
   // identifier to match the LF_STRUCTURE forward reference to the definition.
@@ -10,24 +38,19 @@ int main(int argc, char* argv[], char* arge[]) {
   struct { void bar() {} } one;
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "one"
-  // LINUX-SAME:     type: [[TYPE_OF_ONE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_ONE]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_ONE]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "one"
   // MSVC-SAME:      type: [[TYPE_OF_ONE:![0-9]+]]
   // MSVC-SAME:      )
+  //
   // MSVC:       [[TYPE_OF_ONE]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_structure_type
   // MSVC-SAME:      name: "<unnamed-type-one>"
   // MSVC-SAME:      identifier: ".?AU<unnamed-type-one>@?1??main@@9@"
   // MSVC-SAME:      )
 
-
   // In CodeView, the LF_POINTER entry for "ptr2unnamed" refers to the forward
   // reference of the unnamed struct. Visual Studio requires a unique
   // identifier to match the LF_STRUCTURE forward reference to the definition.
@@ -36,24 +59,19 @@ int main(int argc, char* argv[], char* arge[]) {
   int decltype(two)::*ptr2unnamed = &decltype(two)::bar;
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "two"
-  // LINUX-SAME:     type: [[TYPE_OF_TWO:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_TWO]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_TWO]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "two"
   // MSVC-SAME:      type: [[TYPE_OF_TWO:![0-9]+]]
   // MSVC-SAME:      )
+  //
   // MSVC:       [[TYPE_OF_TWO]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_structure_type
   // MSVC-SAME:      name: "<unnamed-type-two>"
   // MSVC-SAME:      identifier: ".?AU<unnamed-type-two>@?2??main@@9@"
   // MSVC-SAME:      )
 
-
   // In DWARF, named structures which are not externally visibile do not
   // require an identifier.  In CodeView, named structures are given an
   // identifier.
@@ -61,24 +79,19 @@ int main(int argc, char* argv[], char* arge[]) {
   struct named { int bar; int named::* p2mem; } three = { 42, &named::bar };
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "three"
-  // LINUX-SAME:     type: [[TYPE_OF_THREE:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_THREE]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_structure_type
-  // LINUX-SAME:     name: "named"
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_THREE]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "three"
   // MSVC-SAME:      type: [[TYPE_OF_THREE:![0-9]+]]
   // MSVC-SAME:      )
+  //
   // MSVC:       [[TYPE_OF_THREE]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_structure_type
   // MSVC-SAME:      name: "named"
   // MSVC-SAME:      identifier: ".?AUnamed@?1??main@@9@"
   // MSVC-SAME:      )
 
-
   // In CodeView, the LF_MFUNCTION entry for the lambda "operator()" routine
   // refers to the forward reference of the unnamed LF_CLASS for the lambda.
   // Visual Studio requires a unique identifier to match the forward reference
@@ -87,17 +100,13 @@ int main(int argc, char* argv[], char* arge[]) {
   auto four = [argc](int i) -> int { return argc == i ? 1 : 0; };
   //
   // LINUX:      !{{[0-9]+}} = !DILocalVariable(name: "four"
-  // LINUX-SAME:     type: [[TYPE_OF_FOUR:![0-9]+]]
-  // LINUX-SAME:     )
-  // LINUX:      [[TYPE_OF_FOUR]] = distinct !DICompositeType(
-  // LINUX-SAME:     tag: DW_TAG_class_type
-  // LINUX-NOT:      name:
-  // LINUX-NOT:      identifier:
+  // LINUX-SAME:     type: [[TYPE_OF_FOUR]]
   // LINUX-SAME:     )
   //
   // MSVC:       !{{[0-9]+}} = !DILocalVariable(name: "four"
   // MSVC-SAME:      type: [[TYPE_OF_FOUR:![0-9]+]]
   // MSVC-SAME:      )
+  //
   // MSVC:       [[TYPE_OF_FOUR]] = distinct !DICompositeType
   // MSVC-SAME:      tag: DW_TAG_class_type
   // MSVC-SAME:      name: "<lambda_0>"
diff --git a/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp b/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
index 6b9c9a243decd1..122e4aa62ea7df 100644
--- a/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
+++ b/clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp
@@ -51,9 +51,9 @@ void test() {
   // CHECK-SAME:                              name: "<lambda_2_1>",
   c.lambda_params();
 
-  // CHECK: !DISubprogram(name: "operator()", scope: ![[LAMBDA1:[0-9]+]],
-  // CHECK: ![[LAMBDA1]] = !DICompositeType(tag: DW_TAG_class_type,
+  // CHECK: ![[LAMBDA1:[0-9]+]] = !DICompositeType(tag: DW_TAG_class_type,
   // CHECK-SAME:                            name: "<lambda_1>",
   // CHECK-SAME:                            flags: DIFlagFwdDecl
+  // CHECK: !DISubprogram(name: "operator()", scope: ![[LAMBDA1]],
   c.lambda2();
 }
diff --git a/clang/test/CodeGenCXX/debug-info-local-types.cpp b/clang/test/CodeGenCXX/debug-info-local-types.cpp
new file mode 100644
index 00000000000000..9a249fbf388312
--- /dev/null
+++ b/clang/test/CodeGenCXX/debug-info-local-types.cpp
@@ -0,0 +1,79 @@
+// RUN: %clang_cc1 -triple %itanium_abi_triple %s -o - -O0 -emit-llvm \
+// RUN:   -disable-llvm-passes -debug-info-kind=limited | FileCheck %s
+//
+// Test that types declared inside functions, that receive an "identifier"
+// field used for ODR-uniquing, are placed inside the declaration DISubprogram
+// for the function rather than the definition DISubprogram. This avoids
+// later problems with distinct types in distinct DISubprograms being
+// inadvertantly unique'd; see github PR 75385.
+//
+// NB: The types below are marked distinct, but other portions of LLVM
+// force-unique them at a later date, see the enableDebugTypeODRUniquing
+// feature. Clang doesn't enable that itself; instead this test ensures a safe
+// representation of the types is produced.
+//
+// The check-lines below are not strictly in order of hierachy, so here's a
+// diagram of what's desired:
+//
+//                  DIFile
+//                    |
+//          Decl-DISubprogram "foo"
+//          /                      \
+//         /                        \
+// Def-DISubprogram "foo"    DICompositeType "bar"
+//                                   |
+//                                   |
+//                          Decl-DISubprogram "get_a"
+//                         /         |
+//                        /          |
+// Def-DISubprogram "get_a"    DICompositeType "baz"
+//                                   |
+//                                   |
+//                        {Def,Decl}-DISubprogram "get_b"
+
+// CHECK: ![[FILENUM:[0-9]+]] = !DIFile(filename: "{{.*}}debug-info-local-types.cpp",
+
+// CHECK: ![[BARSTRUCT:[0-9]+]] = distinct !DICo...
[truncated]

Copy link

github-actions bot commented Dec 6, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff fdb1bf9b5949b2a97041922405a812a060fce5f4 3610d282bb493e26ee044cac7854a09f9accdce0 --extensions c,h,cpp -- clang/test/CodeGenCXX/debug-info-local-types.cpp clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h clang/test/CodeGen/debug-info-codeview-unnamed.c clang/test/CodeGen/debug-info-unused-types.c clang/test/CodeGen/debug-info-unused-types.cpp clang/test/CodeGenCXX/debug-info-access.cpp clang/test/CodeGenCXX/debug-info-anon-union-vars.cpp clang/test/CodeGenCXX/debug-info-codeview-unnamed.cpp clang/test/CodeGenCXX/debug-info-gline-tables-only-codeview.cpp clang/test/CodeGenCXX/debug-lambda-this.cpp llvm/include/llvm/IR/DIBuilder.h llvm/include/llvm/IR/DebugInfo.h llvm/lib/Bitcode/Reader/MetadataLoader.cpp llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/lib/IR/DIBuilder.cpp llvm/lib/IR/DebugInfo.cpp llvm/lib/IR/Verifier.cpp llvm/lib/Transforms/Utils/CloneFunction.cpp llvm/unittests/Transforms/Utils/CloningTest.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
index 194e3c7eba..cb5860c31d 100644
--- a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
+++ b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
@@ -132,7 +132,7 @@ public:
   Metadata *back() const { return MetadataPtrs.back(); }
   void pop_back() { MetadataPtrs.pop_back(); }
   bool empty() const { return MetadataPtrs.empty(); }
-  const_iterator begin() {  return MetadataPtrs.begin(); }
+  const_iterator begin() { return MetadataPtrs.begin(); }
   const_iterator end() { return MetadataPtrs.end(); }
 
   Metadata *operator[](unsigned i) const {
diff --git a/llvm/lib/IR/DIBuilder.cpp b/llvm/lib/IR/DIBuilder.cpp
index c6215951e7..705a1312e1 100644
--- a/llvm/lib/IR/DIBuilder.cpp
+++ b/llvm/lib/IR/DIBuilder.cpp
@@ -361,10 +361,10 @@ DIBuilder::createTemplateAlias(DIType *Ty, StringRef Name, DIFile *File,
                                unsigned LineNo, DIScope *Context,
                                DINodeArray TParams, uint32_t AlignInBits,
                                DINode::DIFlags Flags, DINodeArray Annotations) {
-  auto *T = DIDerivedType::get(VMContext, dwarf::DW_TAG_template_alias, Name, File,
-                            LineNo, getNonCompileUnitScope(Context), Ty, 0,
-                            AlignInBits, 0, std::nullopt, std::nullopt, Flags,
-                            TParams.get(), Annotations);
+  auto *T = DIDerivedType::get(VMContext, dwarf::DW_TAG_template_alias, Name,
+                               File, LineNo, getNonCompileUnitScope(Context),
+                               Ty, 0, AlignInBits, 0, std::nullopt,
+                               std::nullopt, Flags, TParams.get(), Annotations);
   if (isa_and_nonnull<DILocalScope>(Context))
     getSubprogramNodesTrackingVector(Context).emplace_back(T);
   return T;

@@ -486,6 +486,8 @@ class CGDebugInfo {
void EmitFunctionDecl(GlobalDecl GD, SourceLocation Loc,
QualType FnType, llvm::Function *Fn = nullptr);

llvm::DIScope *PickCompositeTypeScope(llvm::DIScope *S, StringRef Identifier);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a doxygen comment that explains the purpose of this function?

@@ -546,55 +546,75 @@ class MetadataLoader::MetadataLoaderImpl {

/// Move local imports from DICompileUnit's 'imports' field to
/// DISubprogram's retainedNodes.
/// Move fucntion-local enums from DICompileUnit's enums
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Move fucntion-local enums from DICompileUnit's enums
/// Move function-local enums from DICompileUnit's enums

@@ -718,6 +738,29 @@ class MetadataLoader::MetadataLoaderImpl {
upgradeCULocals();
}

void cloneLocalTypes() {
for (unsigned I = 0; I < MetadataList.size(); ++I) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range-based for?

@dwblaikie
Copy link
Collaborator

No idea if this sufficiently overlaps with, or is orthogonal to the following, but:

Currently we don't scope function-local types into the appropriate scope inside a function. It seems like associating types with their declaration may make this harder to address - how would we handle that? (like, we'd have to create scopes for function declarations? somehow associate those with the scopes in concrete definitions of a function?)

Maybe it's too big/complex a problem to try to think about, and incremental development should overrule here - but may be worth thinking about?

@jmorse
Copy link
Member Author

jmorse commented Dec 10, 2024

Maybe it's too big/complex a problem to try to think about, and incremental development should overrule here - but may be worth thinking about?

Urgh. I'll admit that I hadn't thought that far ahead, I just wanted to get the patch-series going again now that the root problem is identified.

The DWARF in the existing patch series [0] seems to take a similar direction to this patch, of putting information into declarations where possible. Have a look at the check-lines for llvm/test/DebugInfo/Generic/lexical-block-static-var.ll: the static-local variables are defined and given a location inside DW_TAG_lexical_blocks within the abstract definitions of functions, and then there are some other DW_TAG_lexical_blocks inside the inlined instances. It's not clear to me how the inlined-lexical-blocks refer back to the abstract ones.

I don't have a complete understanding of how the entire patch series operates, but types-and-variables-in-declarations appears to be where we're heading. It makes sense for our metadata to reflect this. Given that the ODR-uniquing issue was only found experimentally in-the-field, I think we have to accept the incremental approach and discover problems by experience.

Stepping back slightly: I think the ODR-uniquing of these types is directly opposed to scope-precision: reducing the number of type definitions down to one necessarily means discarding the extra locality information about where it's scoped in different function instances. Given that the ODR-uniquing facility already exists, I assume it's not something that can be turned off due to efficiency reasons. Thus we're reduced to workarounds while pushing towards the end-solution.

[0] final significant patch https://reviews.llvm.org/D144008 , you have to manually delete the phab-error overlay

@dwblaikie
Copy link
Collaborator

Maybe it's too big/complex a problem to try to think about, and incremental development should overrule here - but may be worth thinking about?

Urgh. I'll admit that I hadn't thought that far ahead, I just wanted to get the patch-series going again now that the root problem is identified.

The DWARF in the existing patch series [0] seems to take a similar direction to this patch, of putting information into declarations where possible. Have a look at the check-lines for llvm/test/DebugInfo/Generic/lexical-block-static-var.ll:

the static-local variables are defined and given a location inside DW_TAG_lexical_blocks within the abstract definitions of functions,

Ah, fair enough, because the subprogram will contian a list of these local types, then the local type's scope will be a (potentially series of) lexical scope - so the lexical scopes are preserved? That's OK-ish, at least not as problematic as I was imagining. (rest of the discussion below here is just discussion - this point ^ is enough to satisfy my concerns for this patch direction, at least)

and then there are some other DW_TAG_lexical_blocks inside the inlined instances. It's not clear to me how the inlined-lexical-blocks refer back to the abstract ones.

Yeah, that's probably missing - the ideal would be the inlined/concrete non-inlined instance's lexical blocks should have abstract_origins that refer to the abstract lexical blocks.

Given that the ODR-uniquing issue was only found experimentally in-the-field, I think we have to accept the incremental approach and discover problems by experience.

Not sure I follow this.

Stepping back slightly: I think the ODR-uniquing of these types is directly opposed to scope-precision: reducing the number of type definitions down to one necessarily means discarding the extra locality information about where it's scoped in different function instances.

Not sure I follow this either - I think that if each concrete lexical scope references the abstract ones, we could emit correct abstract origins to get the local static variables and types to be scoped appropriately in the resulting DWARF/for the consumer.

GCC does this: https://godbolt.org/z/Khrzdq1Wx

@jmorse
Copy link
Member Author

jmorse commented Dec 13, 2024

Ah, fair enough, because the subprogram will contian a list of these local types, then the local type's scope will be a (potentially series of) lexical scope - so the lexical scopes are preserved?

That's certainly the outcome that the original patch series is aiming for -- I'm not completely certain that we can still achieve it. Part of the patch series clones types contained within distinct DISubprograms retainedNodes when they're inlined, which suggests that there's a connection between the types and the duplicated/distinct scopes that must be maintained, something that this patch would damage.

I suppose I should hold off landing this patch until I fully understand that mechanism, and check we can still reach the desired outcome.

Given that the ODR-uniquing issue was only found experimentally in-the-field, I think we have to accept the incremental approach and discover problems by experience.

Not sure I follow this.

I mean: at some point we just have to commit code and see what breaks, to discover obscure requirements.

Stepping back slightly: I think the ODR-uniquing of these types is directly opposed to scope-precision: reducing the number of type definitions down to one necessarily means discarding the extra locality information about where it's scoped in different function instances.

Not sure I follow this either - I think that if each concrete lexical scope references the abstract ones, we could emit correct abstract origins to get the local static variables and types to be scoped appropriately in the resulting DWARF/for the consumer.

Indeed, if that connection is present then we should be fine. I'm worried that some of the extra plumbing (i.e. the cloning-of-types-during-inlining) indicates that this property requires maintenance during optimisation, and that that maintenance could be impossible in the presence of the ODR-uniquing feature.

@dwblaikie
Copy link
Collaborator

yeah, I worry this feels all a bit vague (that some things "suggest" other things, that root problems "seem to be" (& some fairly broad descriptions of what they seem to be) - not to nitpick your language, and I appreciate the clarity of the uncertainty, to be sure - but the uncertainty is a bit concerning)

Might help to have some specific/narrow example(s) to walk through & discuss how we think they should work?

@jmorse
Copy link
Member Author

jmorse commented Dec 15, 2024

Indeed @ the language, it's difficult to know that this is definitely going to work when I don't have a full understanding of the patch series, thus the caution,

I've studied a bit further and even with the full patch series we seem to be missing an important link. Here's a gist: https://gist.github.com/jmorse/091ae6ed42e6a36199b8593a7b81bd49 which contains the LLVM DWARF output for the aforementioned lexical-block-static-var.ll . While it passes the FileCheck lines in the test, it lacks the DW_AT_abstract_origin that gcc produces -- observe that nothing references addresses 8a or 5b. The consumer is then presumably left with a problem: it might be able to discover the static variables A and B, and that they're in a lexical block, but not which lexical block. There's also a risk that I misapplied D144008, which I had to copy-and-paste out of Phab, @dzhidzhoev would you have a public-github branch that contains that patch?.

In the IR metadata, each DIGlobalVariable refers into a tree of distinct DILexicalScopes and DISubprograms that's part of the definition subprogram for the relevant function. I suspect that in scenarios where multiple instances of an inline function with a static variable are LTO'd, we might see a repeat of this type uniquing problem: LLVM will pick a single DIGlobalVariable as the location for such an inlined-function static variable, which will then refer into the DILexicalScope/DISubprogram tree for one instance of the inlined function, potentially causing confusion or hard-to-interpret DWARF. I think the natural solution to this is putting DIGlobalVariable scopes into a "declaration" tree of DILexicalScopes/DISubprograms too.

I'll try to cook up a precise example of this happening so that it's clear, but have to dash for now.

@jmorse
Copy link
Member Author

jmorse commented Dec 15, 2024

(" it lacks the DW_AT_abstract_origin that gcc produces..." for the DW_TAG_lexical_scopes referring back to the abstract/declaration subprogram).

@dzhidzhoev
Copy link
Member

dzhidzhoev commented Feb 26, 2025

Apologies for the delay in reviewing that. I was stuck on the original issue, so I appreciate this contribution. Now it's clear what was the root cause of the problem.

There's also a risk that I misapplied D144008, which I had to copy-and-paste out of Phab, @dzhidzhoev would you have a public-github branch that contains that patch?.

Yes, it is here https://github.com/dzhidzhoev/llvm-project/tree/rfc-krisb. I've rebased it on top of your changes. I can try merging the rest of the patches from the set after this one, though I'm not sure if I should open new PRs here on GitHub for them, or how else we're going to collect feedback after a commit gets merged.

dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Apr 17, 2025
…_TAG_lexical_blocks

During the discussion under
llvm#119001, it was noticed, that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks to avoid ambiguity. This behavior is implemented
in GCC (https://godbolt.org/z/Khrzdq1Wx), but not in LLVM so far.

Fixes llvm#49297.
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Apr 17, 2025
…_TAG_lexical_blocks

During the discussion under
llvm#119001, it was noticed, that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks to avoid ambiguity. This behavior is implemented
in GCC (https://godbolt.org/z/Khrzdq1Wx), but not in LLVM so far.

Fixes llvm#49297.
dzhidzhoev added a commit that referenced this pull request Apr 24, 2025
…_TAG_lexical_blocks (#136205)

During the discussion under
#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes #49297.
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request May 5, 2025
…ns are not cloned (NFC)

This change was separated from llvm#119001.

When cloning functions, use IdentityMDPredicate to ensure that
if DISubprogram is not cloned, then its DILocalVariables are not cloned
either.

This is expected to be an NFC currently, as DILocalVariables only reference
their subprograms (via DILocalScopes) and types.
Since inlined DISubprograms and DITypes are not cloned during the process,
DILocalVariables are mapped to self in ValueMapper (in mapTopLevelUniquedNode).

However, it will be needed for the original PR llvm#119001,
where it is possible that the type of a DILocalVariable is cloned
if it lies in the scope of a DISubprogram that is being cloned,
whereas the DILocalVariable itself belongs to a different DISubprogram.
I'm making this change into a separate PR to make the original PR a bit smaller,
and since this has more to do with variables than with types.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 6, 2025
…/inlined DW_TAG_lexical_blocks (#136205)

During the discussion under
llvm/llvm-project#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes llvm/llvm-project#49297.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…_TAG_lexical_blocks (llvm#136205)

During the discussion under
llvm#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes llvm#49297.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…_TAG_lexical_blocks (llvm#136205)

During the discussion under
llvm#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes llvm#49297.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…_TAG_lexical_blocks (llvm#136205)

During the discussion under
llvm#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes llvm#49297.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
…_TAG_lexical_blocks (llvm#136205)

During the discussion under
llvm#119001, it was noticed that
concrete DW_TAG_lexical_blocks should refer to corresponding abstract
DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid
ambiguity. This behavior is implemented in GCC
(https://godbolt.org/z/Khrzdq1Wx), but not in LLVM.

Fixes llvm#49297.
Comment on lines +3948 to +3951
DISubprogram *DeclSP = DBuilder.createFunction(
SP->getScope(), SP->getName(), SP->getLinkageName(), SP->getFile(),
SP->getLine(), SP->getType(), SP->getScopeLine(), Flags, SPFlags,
SP->getTemplateParams(), nullptr, nullptr, SP->getAnnotations());
Copy link
Member

@dzhidzhoev dzhidzhoev May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed that in the LLVM IR output of clang/test/CodeGenCXX/debug-info-local-types.cpp, !DICompositeType(tag: DW_TAG_class_type, name: "bar" is not attached to the retainedNodes of its scope (declaration DISubprogram for "foo").

Is it the desired behavior for us?

The reason is that DIBuilder::finalizeSubprogram() is never called for DeclSP. We could track it and finalize it in CGDebugInfo::finalize(), or we could try something like this #139914.

dzhidzhoev added a commit that referenced this pull request May 14, 2025
…ns are not cloned (NFC) (#138590)

This change was separated from
#119001.

When cloning functions, use IdentityMDPredicate to ensure that if
DISubprogram is not cloned, then its DILocalVariables are not cloned
either.

This is currently expected to be an NFC, as DILocalVariables only
reference their subprograms (via DILocalScopes) and types, and inlined
DISubprograms and DITypes are not cloned. Thus, DILocalVariables are
mapped to self in ValueMapper (in mapTopLevelUniquedNode).

However, it will be needed for the original PR
#119001, where a
DILocalVariable may refer to a local type of a DISubprogram that is
being cloned. In that case, ValueMapper will clone DILocalVariable even
if it belongs to an inlined DISubprogram that is not cloned, which
should be avoided.
I'm making this change into a separate PR to make the original PR a bit
smaller, and because this has more to do with variables than with types.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 14, 2025
…ned functions are not cloned (NFC) (#138590)

This change was separated from
llvm/llvm-project#119001.

When cloning functions, use IdentityMDPredicate to ensure that if
DISubprogram is not cloned, then its DILocalVariables are not cloned
either.

This is currently expected to be an NFC, as DILocalVariables only
reference their subprograms (via DILocalScopes) and types, and inlined
DISubprograms and DITypes are not cloned. Thus, DILocalVariables are
mapped to self in ValueMapper (in mapTopLevelUniquedNode).

However, it will be needed for the original PR
llvm/llvm-project#119001, where a
DILocalVariable may refer to a local type of a DISubprogram that is
being cloned. In that case, ValueMapper will clone DILocalVariable even
if it belongs to an inlined DISubprogram that is not cloned, which
should be avoided.
I'm making this change into a separate PR to make the original PR a bit
smaller, and because this has more to do with variables than with types.
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request May 16, 2025
Since https://reviews.llvm.org/D144004, DISubprogram's retainedNodes
field is used to track DIImportedEntities, in addition to local
variables and labels.

However, the corresponding update for DebugInfoFinder, to make it visit
DISubprogram's retainedNodes, was missing.

This is the fix for it.

This change is separated from
llvm#119001 to simplify it.
dzhidzhoev added a commit that referenced this pull request May 20, 2025
…#140285)

Since https://reviews.llvm.org/D144004, DISubprogram's retainedNodes
field is used to track DIImportedEntities, in addition to local
variables and labels.

However, the corresponding update for DebugInfoFinder, to make it visit
DISubprogram's retainedNodes, was missing.
This is the fix for it.

This change is separated from
#119001 to simplify it.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 20, 2025
…nto account (#140285)

Since https://reviews.llvm.org/D144004, DISubprogram's retainedNodes
field is used to track DIImportedEntities, in addition to local
variables and labels.

However, the corresponding update for DebugInfoFinder, to make it visit
DISubprogram's retainedNodes, was missing.
This is the fix for it.

This change is separated from
llvm/llvm-project#119001 to simplify it.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…llvm#140285)

Since https://reviews.llvm.org/D144004, DISubprogram's retainedNodes
field is used to track DIImportedEntities, in addition to local
variables and labels.

However, the corresponding update for DebugInfoFinder, to make it visit
DISubprogram's retainedNodes, was missing.
This is the fix for it.

This change is separated from
llvm#119001 to simplify it.
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
…llvm#140285)

Since https://reviews.llvm.org/D144004, DISubprogram's retainedNodes
field is used to track DIImportedEntities, in addition to local
variables and labels.

However, the corresponding update for DebugInfoFinder, to make it visit
DISubprogram's retainedNodes, was missing.
This is the fix for it.

This change is separated from
llvm#119001 to simplify it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo llvm:ir llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants