Skip to content

Conversation

huangjd
Copy link
Contributor

@huangjd huangjd commented Jun 12, 2024

This is a different implementation to #94100, which has been reverted.

When -fdebug-info-for-profiling is specified, for any Load expression if the pointer operand is not a declared variable, clang will emit debug info describing the type of the pointer operand (which can be an intermediate expr)

…eferencing pointer to pointers

This is a different implementation to llvm#94100, which has been reverted.

When -fdebug-info-for-profiling is specified, for any Load expression if
the pointer operand is not a declared variable, clang will emit debug
info describing the type of the pointer operand (which can be an
intermediate expr)
@huangjd huangjd added clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo labels Jun 12, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jun 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-clang-codegen

Author: William Junda Huang (huangjd)

Changes

This is a different implementation to #94100, which has been reverted.

When -fdebug-info-for-profiling is specified, for any Load expression if the pointer operand is not a declared variable, clang will emit debug info describing the type of the pointer operand (which can be an intermediate expr)


Full diff: https://github.com/llvm/llvm-project/pull/95298.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+51)
  • (modified) clang/lib/CodeGen/CGDebugInfo.h (+6)
  • (modified) clang/lib/CodeGen/CGExprScalar.cpp (+20-1)
  • (added) clang/test/CodeGenCXX/debug-info-ptr-to-ptr.cpp (+115)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 99e12da0081af..d3b754d14b19e 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -5746,6 +5746,57 @@ void CGDebugInfo::EmitExternalVariable(llvm::GlobalVariable *Var,
   Var->addDebugInfo(GVE);
 }
 
+void CGDebugInfo::EmitPseudoVariable(CGBuilderTy &Builder,
+                                     llvm::Instruction *Value, QualType Ty) {
+  // Only when -g2 or above is specified, debug info for variables will be
+  // generated.
+  if (CGM.getCodeGenOpts().getDebugInfo() <=
+      llvm::codegenoptions::DebugLineTablesOnly)
+    return;
+
+  llvm::DILocation *DIL = Value->getDebugLoc().get();
+  if (!DIL)
+    return;
+
+  llvm::DIFile *Unit = DIL->getFile();
+  llvm::DIType *Type = getOrCreateType(Ty, Unit);
+
+  // Check if Value is already a declared variable and has debug info, in this
+  // case we have nothing to do. Clang emits declared variable as alloca, and
+  // it is loaded upon use, so we identify such pattern here.
+  if (llvm::LoadInst *Load = dyn_cast<llvm::LoadInst>(Value)) {
+    llvm::Value *Var = Load->getPointerOperand();
+    // There can be implicit type cast applied on a variable if it is an opaque
+    // ptr, in this case its debug info may not match the actual type of object
+    // being used as in the next instruction, so we will need to emit a pseudo
+    // variable for type-casted value.
+    auto DeclareTypeMatches = [&](auto *DbgDeclare) {
+      return DbgDeclare->getVariable()->getType() == Type;
+    };
+    if (any_of(llvm::findDbgDeclares(Var), DeclareTypeMatches) ||
+        any_of(llvm::findDVRDeclares(Var), DeclareTypeMatches))
+      return;
+  }
+
+  // Find the correct location to insert the debug value.
+  llvm::BasicBlock *InsertBB = Value->getParent();
+  llvm::Instruction *InsertBefore = Value->getIterator()->getNextNode();
+  if (llvm::InvokeInst *Invoke = dyn_cast<llvm::InvokeInst>(Value)) {
+    InsertBB = Invoke->getNormalDest();
+    InsertBefore = InsertBB->size() > 0 ? &(InsertBB->front()) : nullptr;
+  }
+
+  llvm::DILocalVariable *D = DBuilder.createAutoVariable(
+      LexicalBlockStack.back(), "", nullptr, 0, Type, false,
+      llvm::DINode::FlagArtificial);
+  if (InsertBefore)
+    DBuilder.insertDbgValueIntrinsic(Value, D, DBuilder.createExpression(), DIL,
+                                   InsertBefore);
+  else
+    DBuilder.insertDbgValueIntrinsic(Value, D, DBuilder.createExpression(), DIL,
+                                     InsertBB);
+}
+
 void CGDebugInfo::EmitGlobalAlias(const llvm::GlobalValue *GV,
                                   const GlobalDecl GD) {
 
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 8fe738be21568..da466837aa3c3 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -530,6 +530,12 @@ class CGDebugInfo {
   /// Emit information about an external variable.
   void EmitExternalVariable(llvm::GlobalVariable *GV, const VarDecl *Decl);
 
+  /// Emit a pseudo variable and debug info for an intermediate value if it does
+  /// not correspond to a variable in the source code, so that a profiler can
+  /// track more accurate usage of certain instructions of interest.
+  void EmitPseudoVariable(CGBuilderTy &Builder, llvm::Instruction *Value,
+                          QualType Ty);
+
   /// Emit information about global variable alias.
   void EmitGlobalAlias(const llvm::GlobalValue *GV, const GlobalDecl Decl);
 
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1b144c178ce96..58f0a3113b4f8 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1937,7 +1937,26 @@ Value *ScalarExprEmitter::VisitMemberExpr(MemberExpr *E) {
     }
   }
 
-  return EmitLoadOfLValue(E);
+  llvm::Value *Result = EmitLoadOfLValue(E);
+
+  // If -fdebug-info-for-profiling is specified, emit a pseudo variable and its
+  // debug info for the pointer, even if there is no variable associated with
+  // the pointer's expression.
+  if (CGF.CGM.getCodeGenOpts().DebugInfoForProfiling && CGF.getDebugInfo()) {
+    if (llvm::LoadInst *Load = dyn_cast<llvm::LoadInst>(Result)) {
+      if (llvm::GetElementPtrInst *GEP =
+              dyn_cast<llvm::GetElementPtrInst>(Load->getPointerOperand())) {
+        if (llvm::Instruction *Pointer =
+                dyn_cast<llvm::Instruction>(GEP->getPointerOperand())) {
+          QualType Ty = E->getBase()->getType();
+          if (!E->isArrow())
+            Ty = CGF.getContext().getPointerType(Ty);
+          CGF.getDebugInfo()->EmitPseudoVariable(Builder, Pointer, Ty);
+        }
+      }
+    }
+  }
+  return Result;
 }
 
 Value *ScalarExprEmitter::VisitArraySubscriptExpr(ArraySubscriptExpr *E) {
diff --git a/clang/test/CodeGenCXX/debug-info-ptr-to-ptr.cpp b/clang/test/CodeGenCXX/debug-info-ptr-to-ptr.cpp
new file mode 100644
index 0000000000000..51ce0f107b5e3
--- /dev/null
+++ b/clang/test/CodeGenCXX/debug-info-ptr-to-ptr.cpp
@@ -0,0 +1,115 @@
+// Test debug info for intermediate value of a chained pointer deferencing
+// expression when the flag -fdebug-info-for-pointer-type is enabled.
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu %s -fdebug-info-for-profiling -debug-info-kind=constructor -o - | FileCheck %s
+
+class A {
+public:
+  int i;
+  char c;
+  void *p;
+  int arr[3];
+};
+
+class B {
+public:
+  A* a;
+};
+
+class C {
+public:
+  B* b;
+  A* a;
+  A arr[10];
+};
+
+// CHECK-LABEL: define dso_local noundef i32 @{{.*}}func1{{.*}}(
+// CHECK:         [[A_ADDR:%.*]] = getelementptr inbounds %class.B, ptr {{%.*}}, i32 0, i32 0, !dbg [[DBG1:![0-9]+]]
+// CHECK-NEXT:    [[A:%.*]] = load ptr, ptr [[A_ADDR]], align {{.*}}, !dbg [[DBG1]]
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[A]], metadata [[META1:![0-9]+]], metadata !DIExpression()), !dbg [[DBG1]]
+// CHECK-NEXT:    {{%.*}} = getelementptr inbounds %class.A, ptr [[A]], i32 0, i32 0,
+int func1(B *b) {
+  return b->a->i;
+}
+
+// Should generate a pseudo variable when pointer is type-casted.
+// CHECK-LABEL: define dso_local noundef ptr @{{.*}}func2{{.*}}(
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[B_ADDR:%.*]], metadata [[META2:![0-9]+]], metadata !DIExpression())
+// CHECK-NEXT:    [[B:%.*]] = load ptr, ptr [[B_ADDR]],
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[B]], metadata [[META3:![0-9]+]], metadata !DIExpression())
+// CHECK-NEXT:    {{%.*}} = getelementptr inbounds %class.B, ptr [[B]], i32 0,
+A* func2(void *b) {
+  return ((B*)b)->a;
+}
+
+// Should not generate pseudo variable in this case.
+// CHECK-LABEL: define dso_local noundef i32 @{{.*}}func3{{.*}}(
+// CHECK:    call void @llvm.dbg.declare(metadata ptr [[B_ADDR:%.*]], metadata [[META4:![0-9]+]], metadata !DIExpression())
+// CHECK:    call void @llvm.dbg.declare(metadata ptr [[LOCAL1:%.*]], metadata [[META5:![0-9]+]], metadata !DIExpression())
+// CHECK-NOT: call void @llvm.dbg.value(metadata ptr
+int func3(B *b) {
+  A *local1 = b->a;
+  return local1->i;
+}
+
+// CHECK-LABEL: define dso_local noundef signext i8 @{{.*}}func4{{.*}}(
+// CHECK:         [[A_ADDR:%.*]] = getelementptr inbounds %class.C, ptr {{%.*}}, i32 0, i32 1
+// CHECK-NEXT:    [[A:%.*]] = load ptr, ptr [[A_ADDR]],
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[A]], metadata [[META6:![0-9]+]], metadata !DIExpression())
+// CHECK-NEXT:    {{%.*}} = getelementptr inbounds %class.A, ptr [[A]], i32 0, i32 0,
+// CHECK:         [[CALL:%.*]] = call noundef ptr @{{.*}}foo{{.*}}(
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[CALL]], metadata [[META6]], metadata !DIExpression())
+// CHECK-NEXT:    [[I1:%.*]] = getelementptr inbounds %class.A, ptr [[CALL]], i32 0, i32 1
+char func4(C *c) {
+  extern A* foo(int x);
+  return foo(c->a->i)->c;
+}
+
+// CHECK-LABEL: define dso_local noundef signext i8 @{{.*}}func5{{.*}}(
+// CHECK:         call void @llvm.dbg.declare(metadata ptr {{%.*}}, metadata [[META7:![0-9]+]], metadata !DIExpression())
+// CHECK:         call void @llvm.dbg.declare(metadata ptr {{%.*}}, metadata [[META8:![0-9]+]], metadata !DIExpression())
+// CHECK:         [[A_ADDR:%.*]] = getelementptr inbounds %class.A, ptr {{%.*}}, i64 {{%.*}},
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[A_ADDR]], metadata [[META9:![0-9]+]], metadata !DIExpression())
+// CHECK-NEXT:    {{%.*}} = getelementptr inbounds %class.A, ptr [[A_ADDR]], i32 0, i32 1,
+char func5(void *arr, int n) {
+  return ((A*)arr)[n].c;
+}
+
+// CHECK-LABEL: define dso_local noundef i32 @{{.*}}func6{{.*}}(
+// CHECK:         call void @llvm.dbg.declare(metadata ptr {{%.*}}, metadata [[META10:![0-9]+]], metadata !DIExpression())
+// CHECK:         call void @llvm.dbg.value(metadata ptr {{%.*}}, metadata [[META11:![0-9]+]], metadata !DIExpression())
+int func6(B &b) {
+  return reinterpret_cast<A&>(b).i;
+}
+
+// CHECK-LABEL: define dso_local noundef i32 @{{.*}}global{{.*}}(
+// CHECK:         [[GA:%.*]] = load ptr, ptr @ga
+// CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[GA]], metadata [[META12:![0-9]+]], metadata !DIExpression())
+A *ga;
+int global() {
+  return ga->i;
+}
+
+
+// CHECK-DAG: [[META_A:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "A",
+// CHECK-DAG: [[META_AP:![0-9]+]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: [[META_A]],
+// CHECK-DAG: [[META_B:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "B",
+// CHECK-DAG: [[META_BP:![0-9]+]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: [[META_B]],
+// CHECK-DAG: [[META_C:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_class_type, name: "C",
+// CHECK-DAG: [[META_CP:![0-9]+]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: [[META_C]],
+// CHECK-DAG: [[META_VP:![0-9]+]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null,
+// CHECK-DAG: [[META_I32:![0-9]+]] = !DIBasicType(name: "int", size: 32,
+// CHECK-DAG: [[META_BR:![0-9]+]] = !DIDerivedType(tag: DW_TAG_reference_type, baseType: [[META_B]],
+
+// CHECK-DAG: [[DBG1]] = !DILocation(line: 31, column: 13,
+// CHECK-DAG: [[META1]] = !DILocalVariable(scope: {{.*}}, type: [[META_AP]], flags: DIFlagArtificial)
+// CHECK-DAG: [[META2]] = !DILocalVariable(name: "b", arg: 1, scope: {{.*}}, file: {{.*}}, line: 40, type: [[META_VP]])
+// CHECK-DAG: [[META3]] = !DILocalVariable(scope: {{.*}}, type: [[META_BP]], flags: DIFlagArtificial)
+// CHECK-DAG: [[META4]] = !DILocalVariable(name: "b", arg: 1, scope: {{.*}}, file: {{.*}}, line: 49, type: [[META_BP]])
+// CHECK-DAG: [[META5]] = !DILocalVariable(name: "local1", scope: {{.*}}, file: {{.*}}, line: 50, type: [[META_AP]])
+// CHECK-DAG: [[META6]] = !DILocalVariable(scope: {{.*}}, type: [[META_AP]], flags: DIFlagArtificial)
+// CHECK-DAG: [[META7]] = !DILocalVariable(name: "arr", arg: 1, scope: {{.*}}, file: {{.*}}, line: 73, type: [[META_VP]])
+// CHECK-DAG: [[META8]] = !DILocalVariable(name: "n", arg: 2, scope: {{.*}}, file: {{.*}}, line: 73, type: [[META_I32]])
+// CHECK-DAG: [[META9]] = !DILocalVariable(scope: {{.*}}, type: [[META_AP]], flags: DIFlagArtificial)
+// CHECK-DAG: [[META10]] = !DILocalVariable(name: "b", arg: 1, scope: {{.*}}, file: {{.*}}, line: 80, type: [[META_BR]])
+// CHECK-DAG: [[META11]] = !DILocalVariable(scope: {{.*}}, type: [[META_AP]], flags: DIFlagArtificial)
+// CHECK-DAG: [[META12]] = !DILocalVariable(scope: {{.*}}, type: [[META_AP]], flags: DIFlagArtificial)

@huangjd
Copy link
Contributor Author

huangjd commented Jun 12, 2024

It seems the restriction for dbg.value appearing in the front end was lifted recently, so now I can just emit dbg.value directly

Copy link

github-actions bot commented Jun 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@pogo59
Copy link
Collaborator

pogo59 commented Jun 13, 2024

You should have a look at this imminent patch which will affect your test.

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in principle, with a couple of nits.

Comment on lines 5781 to 5787
// Find the correct location to insert the debug value.
llvm::BasicBlock *InsertBB = Value->getParent();
llvm::Instruction *InsertBefore = Value->getIterator()->getNextNode();
if (llvm::InvokeInst *Invoke = dyn_cast<llvm::InvokeInst>(Value)) {
InsertBB = Invoke->getNormalDest();
InsertBefore = InsertBB->size() > 0 ? &(InsertBB->front()) : nullptr;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the move towards debug-info away from intrinsics involves not inserting with raw instruction pointers, but instead iterators [0]. In practice I believe all that needs to be done is using InsertBB->begin() to fetch an iterator, you'll likely need to take end() and check it below for the empty-block case.

I understand #94224 provides a more graceful solution by allowing insertion at end(), however that won't land for a short while.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Copy link
Contributor

@SLTozer SLTozer Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add a little extra to this, it looks like what this code is trying to do is the same as llvm::Instruction::getInsertionPointAfterDef - would that be a suitable substitution?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

llvm::DIType *Type = getOrCreateType(Ty, Unit);

// Check if Value is already a declared variable and has debug info, in this
// case we have nothing to do. Clang emits declared variable as alloca, and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// case we have nothing to do. Clang emits declared variable as alloca, and
// case we have nothing to do. Clang emits declared variable as alloca, and
Suggested change
// case we have nothing to do. Clang emits declared variable as alloca, and
// case we have nothing to do. Clang emits declared variables as alloca, and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@huangjd huangjd requested a review from jmorse June 14, 2024 02:13
Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for rapidly rewriting this. I think dereferencing the insertion iterator with &** will eventually need addressing with the move away from intrinsics and insertion, but that can be a matter for when the DIBuilder APIs get deprecated / retired.

@SLTozer
Copy link
Contributor

SLTozer commented Jun 14, 2024

Small recommendation that the patch @pogo59 mentioned above has now landed, so before merging this you should rebase and update the test accordingly.

@huangjd huangjd merged commit 675d8d6 into llvm:main Jun 15, 2024
@dyung
Copy link
Collaborator

dyung commented Jun 15, 2024

@huangjd, I'm not sure the buildbot emails on failure are working, but the test you added seems to be failing on many bots. Can you take a look and revert if you need time to investigate?

huangjd added a commit to huangjd/llvm-project that referenced this pull request Jun 15, 2024
Fix test case in llvm#95298 because another recent submitted patch removed
llvm.dbg intrinsics, updated test case accordingly
@huangjd
Copy link
Contributor Author

huangjd commented Jun 15, 2024

@huangjd, I'm not sure the buildbot emails on failure are working, but the test you added seems to be failing on many bots. Can you take a look and revert if you need time to investigate?

Test updated #95637

jayfoad pushed a commit that referenced this pull request Jun 15, 2024
Fix test case in #95298 because another recent submitted patch removed
llvm.dbg intrinsics, updated test case accordingly
@huangjd huangjd deleted the debuginfoforpointertype branch June 17, 2024 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants