Skip to content

[llvm-objdump][BPF] --symbolize-operands: infer local labels for BPF #100550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 31, 2024

Conversation

eddyz87
Copy link
Contributor

@eddyz87 eddyz87 commented Jul 25, 2024

Enable local labels computation for BPF disassembly when --symbolize-operands option is specified.
This relies on MCInstrAnalysis::evaluateBranch() method, which is already defined in BPFMCInstrAnalysis::evaluateBranch.

After this change the assembly code below:

    if r1 > 42 goto +1
    r1 -= 10
    ...

Would be printed as:

    if r1 > 42 goto +1 <L0>
    r1 -= 10
  <L0>:
    ...

(when --symbolize-operands option is set).

See https://reviews.llvm.org/D84191 for the main part of the --symbolize-operands implementation logic.

@llvmbot
Copy link
Member

llvmbot commented Jul 25, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (eddyz87)

Changes

Enable local labels computation for BPF disassembly when --symbolize-address option is specified.
This relies on MCInstrAnalysis::evaluateBranch() method, which is already defined in BPFMCInstrAnalysis::evaluateBranch.

After this change the assembly code below:

        if r1 &gt; 42 goto +1
        r1 -= 10
        ...

Would be printed as:

        if r1 &gt; 42 goto +1 &lt;L0&gt;
        r1 -= 10
      &lt;L0&gt;:
        ...

(when --symbolize-address option is set).


Full diff: https://github.com/llvm/llvm-project/pull/100550.diff

2 Files Affected:

  • (added) llvm/test/tools/llvm-objdump/BPF/disassemble-symbolize-operands.s (+24)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+3-1)
diff --git a/llvm/test/tools/llvm-objdump/BPF/disassemble-symbolize-operands.s b/llvm/test/tools/llvm-objdump/BPF/disassemble-symbolize-operands.s
new file mode 100644
index 0000000000000..fd4c3d5b2d26e
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/BPF/disassemble-symbolize-operands.s
@@ -0,0 +1,24 @@
+# REQUIRES: bpf-registered-target
+
+## Verify generation of 'Lxx' labels for local jump targets,
+## when --symbolize-operands option is specified.
+
+# RUN: llvm-mc -triple=bpfel %s -filetype=obj -o - | \
+# RUN:   llvm-objdump -d --symbolize-operands --no-show-raw-insn --no-leading-addr - | \
+# RUN:   FileCheck %s
+        .text
+main:
+        if r1 > 42 goto +2
+        r1 -= 10
+        goto -3
+        r0 = 0
+        exit
+
+# CHECK:      <main>:
+# CHECK-NEXT: <L1>:
+# CHECK-NEXT: 	if r1 > 0x2a goto +0x2 <L0>
+# CHECK-NEXT: 	r1 -= 0xa
+# CHECK-NEXT: 	goto -0x3 <main>
+# CHECK-NEXT: <L0>:
+# CHECK-NEXT: 	r0 = 0x0
+# CHECK-NEXT: 	exit
diff --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index d1240025625ca..bd8a6b0c91c63 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -1486,7 +1486,9 @@ collectLocalBranchTargets(ArrayRef<uint8_t> Bytes, MCInstrAnalysis *MIA,
                           std::unordered_map<uint64_t, std::string> &Labels) {
   // So far only supports PowerPC and X86.
   const bool isPPC = STI->getTargetTriple().isPPC();
-  if (!isPPC && !STI->getTargetTriple().isX86())
+  const bool isX86 = STI->getTargetTriple().isX86();
+  const bool isBPF = STI->getTargetTriple().isBPF();
+  if (!isPPC && !isX86 && !isBPF)
     return;
 
   if (MIA)

@eddyz87 eddyz87 requested review from yonghong-song and MaskRay July 25, 2024 10:17
@eddyz87
Copy link
Contributor Author

eddyz87 commented Jul 25, 2024

The reason for the change is that with current clang master BPF selftests are disassembled w/o local labels at all:

$ llvm-objdump --no-show-raw-insn --no-addresses --symbolize-operands -Sdr cpuv4/pyperf600.bpf.o
...
<on_event>:
        ...
        if r0 == 0x0 goto +0x113 <on_event+0x900>
        ...

Which is quite inconvenient.

Enable local labels computation for BPF disassembly when
--symbolize-operands option is specified.
This relies on MCInstrAnalysis::evaluateBranch() method,
which is already defined in BPFMCInstrAnalysis::evaluateBranch.

After this change the assembly code below:

        if r1 > 42 goto +1
        r1 -= 10
        ...

Would be printed as:

        if r1 > 42 goto +1 <L0>
        r1 -= 10
      <L0>:
        ...

(when --symbolize-operands option is set).
@eddyz87 eddyz87 force-pushed the objdump-symbolize-operands-for-bpf branch from 40f7711 to 4ba4a00 Compare July 25, 2024 10:28
## Verify generation of 'Lxx' labels for local jump targets,
## when --symbolize-operands option is specified.

# RUN: llvm-mc -triple=bpfel %s -filetype=obj -o - | \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We prefer that the object file is written to the disk (e.g. %t) to ease debugging.

@MaskRay
Copy link
Member

MaskRay commented Jul 25, 2024

(when --symbolize-operands option is set).

The option conveys important context. How about [llvm-objdump] --symbolize-operands: infer local labels for BPF?

The description can mention https://reviews.llvm.org/D84191

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please give some time to @jh7370 who usually wants to be aware of these changes.

@@ -1486,7 +1486,9 @@ collectLocalBranchTargets(ArrayRef<uint8_t> Bytes, MCInstrAnalysis *MIA,
std::unordered_map<uint64_t, std::string> &Labels) {
// So far only supports PowerPC and X86.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is now stale. I suggest that we replace it with // Supported by certain targets. so that it will not become stale whenever we add a new target.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my bad, will update.

@eddyz87
Copy link
Contributor Author

eddyz87 commented Jul 25, 2024

Hi @MaskRay , thank you for the review, I agree with raised points.
Will update and force push, if you don't mind force pushing.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from one minor nit, I've got no issues with this, and am happy with @MaskRay's review.

Will update and force push, if you don't mind force pushing.

You shouldn't need to force push, and I would actively discourage it, as it makes follow-up reviews harder (it's tricky to see what's changed since the previous review). Instead, you can use merge commits from main, if rebasing is needed, and fixup commits on your branch with the changes requested (these will also collapse into a single regular commit when you squash and merge).

Also worth noting that the final commit title and description is taken from the PR's title and description, so there's no need to amend the commit message for the already-committed commit in this PR. Instead, make sure to update the PR itself for @MaskRay's suggestion.

Comment on lines 3 to 4
## Verify generation of 'Lxx' labels for local jump targets,
## when --symbolize-operands option is specified.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor nit: if we follow the same coding standards for line length as in code files, the "when" should be on the previous line, I believe.

Copy link
Contributor

@yonghong-song yonghong-song left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch looks good to me. But it looks like we still need --symbolize-operands in order to emit labels. So with this patch, there are two ways to emit labels:
(1). llvm-objdump --symbolize-operands .... The variant of (1) will be to enable --symbolize-operands by default for BPF target.
(2). add '-Wa,-L' to compilation flags so labels in symbol table. The variant of (2) will be that enable labels in symbol table by BPF backend.

It would be great if user does not need to learn '--symbolize-operands' or '-Wa,-L' which is not needed before llvm19.

But in any case, this patch looks good.

@yonghong-song
Copy link
Contributor

Since we do not want L* labels to appear in symbol table (as they are private, generated by compiler), it would be great if we can enable --symbolize-operands by default for BPF target in llvm-objdump.

@MaskRay
Copy link
Member

MaskRay commented Jul 27, 2024

Since we do not want L* labels to appear in symbol table (as they are private, generated by compiler), it would be great if we can enable --symbolize-operands by default for BPF target in llvm-objdump.

I really hope that we avoid target-specific defaults, which haven't been used so far to the best of my knowledge....

@eddyz87 eddyz87 changed the title [llvm-objdump][BPF] infer local label names in BPF disassembly [llvm-objdump][BPF] --symbolize-operands: infer local labels for BPF Jul 28, 2024
eddyz87 added 2 commits July 28, 2024 01:19
- Use temporary file for llvm-mc results in the test;
- Moved 'when' in the test comment one line up to correctly fill 80
  chars per line;
- Corrected a comment in the collectLocalBranchTargets
@eddyz87
Copy link
Contributor Author

eddyz87 commented Jul 28, 2024

Hi @MaskRay , @jh7370 , thank you for the review, I've pushed requested changes.
@jh7370 , thank you for clarification regarding force pushes and final commit description.

I will wait till Monday, if there would be no objections, I'll merge this change.

@yonghong-song , @4ast , do we want to make --symbolize-operands default for bpf and if so, should this be a part of this commit or a #95103? I understand why @MaskRay does not like it, but usability benefits seem significant.

@yonghong-song
Copy link
Contributor

@eddyz87 To fix the llvm-objdump regression for bpf target, I indeed would like to have --symbolize-operands default for bpf, which will roughly restore what we have for llvm18 and earlier. I guess @MaskRay can comment further on this.

If we indeed agree to have --symbolize-operands for bpf target, the change should be in this commit.
#95103 is to change L* label to .L* label which is not related to --symbolize-operands at all.

@MaskRay
Copy link
Member

MaskRay commented Jul 29, 2024

@yonghong-song , @4ast , do we want to make --symbolize-operands default for bpf and if so, should this be a part of this commit or a #95103? I understand why @MaskRay does not like it, but usability benefits seem significant.

The patch can land as is. The default --symbolize-operands does not need to be combined. It would be a pretty significant policy change as we do not allow targets to customize the defaults. That would require more debates.

@4ast
Copy link
Member

4ast commented Jul 29, 2024

@yonghong-song , @4ast , do we want to make --symbolize-operands default for bpf and if so, should this be a part of this commit or a #95103? I understand why @MaskRay does not like it, but usability benefits seem significant.

The patch can land as is. The default --symbolize-operands does not need to be combined. It would be a pretty significant policy change as we do not allow targets to customize the defaults. That would require more debates.

No. User experience matters a lot more than some "policy".
We're going to keep L in symbol table then.

@4ast 4ast closed this Jul 29, 2024
@jh7370
Copy link
Collaborator

jh7370 commented Jul 29, 2024

@yonghong-song , @4ast , do we want to make --symbolize-operands default for bpf and if so, should this be a part of this commit or a #95103? I understand why @MaskRay does not like it, but usability benefits seem significant.

The patch can land as is. The default --symbolize-operands does not need to be combined. It would be a pretty significant policy change as we do not allow targets to customize the defaults. That would require more debates.

No. User experience matters a lot more than some "policy". We're going to keep L in symbol table then.

This feels like an excessively strong reaction to what @MaskRay has been saying, especially as there wasn't even room for discussion.

This is an open source project, with a wide range of stakeholders. These "policies" are what are needed to keep the code in a maintainable state for all of these stakeholders. That being said, there is always an opportunity to discuss changing a policy or even working together to find an alternative solution that will satisfy everyone.

I'm not familiar at all with the BPF target. Ignoring the prior situation of a bug that used the wrong name for temporary symbols (i.e. the "L"/".L" symbols), is BPF significantly different to other targets on the need for a different default for the --symbolize-operands option?

@4ast
Copy link
Member

4ast commented Jul 31, 2024

keep the code in a maintainable state

a default state of a support flag being different for one architecture doesn't change anything from maintainability pov.

is BPF significantly different to other targets

It clearly is. BPF users read assembly code pretty much daily and good readability is paramount.
Compare how bpf assembly itself looks vs all other asms. The difference should be obvious.

This particular diff #100550 is imo good to land. I misunderstood it first.

@4ast 4ast reopened this Jul 31, 2024
@eddyz87
Copy link
Contributor Author

eddyz87 commented Jul 31, 2024

Merging this change (w/o update to default behavior).

@eddyz87 eddyz87 merged commit ee1040b into llvm:main Jul 31, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants