Missed optimization: _ => 0 generates worse code than 0 => 0, _ => unreachable!() #118306

Description

@mcy
Contributor

Consider the following functions (https://godbolt.org/z/a8r3Tc7TE):

pub fn faster(input: u64) -> u64 {
  match input % 4 {
    0 => 0,
    1 | 2 => 1,
    3 => 2,
    _ => unreachable!(),
  }
}

pub fn branchy(input: u64) -> u64 {
  match input % 4 {
    1 | 2 => 1,
    3 => 2,
    _ => 0,
  }
}

These functions have identical behavior: both map input to input % 4 - (input % 4 / 2). For faster, LLVM generates a nice lookup table for us, but for branchy it emits an extra branch. The only difference is that in branchy I've used _ => ... to avoid writing an explicit arm that the optimizer then has to prove unreachable.
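As a quick sanity check of the equivalence claim, here is a minimal sketch that compares both functions against the closed form quoted above. The helper names closed_form and all_agree are hypothetical and only exist for this illustration; since the result depends only on input % 4, checking a small range covers every case.

```rust
/// Closed form from the issue text: r - r / 2, where r = input % 4.
/// (Hypothetical helper, not part of the original report.)
pub fn closed_form(input: u64) -> u64 {
    let r = input % 4;
    r - r / 2
}

/// Variant with an explicit `0` arm and an unreachable default.
pub fn faster(input: u64) -> u64 {
    match input % 4 {
        0 => 0,
        1 | 2 => 1,
        3 => 2,
        _ => unreachable!(),
    }
}

/// Variant that folds the `0` case into the wildcard arm.
pub fn branchy(input: u64) -> u64 {
    match input % 4 {
        1 | 2 => 1,
        3 => 2,
        _ => 0,
    }
}

/// Returns true when all three functions agree on every input in 0..limit.
/// (Hypothetical helper, not part of the original report.)
pub fn all_agree(limit: u64) -> bool {
    (0..limit).all(|i| faster(i) == branchy(i) && branchy(i) == closed_form(i))
}
```

Calling all_agree(1024) exercises each remainder many times. This only checks behavior; it says nothing about the code the compiler generates.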

If we look at the generated IR (after -Cpasses=strip,mem2reg,simplifycfg):

define i64 @faster(i64 %0) unnamed_addr #0 {
  %2 = urem i64 %0, 4
  switch i64 %2, label %.unreachabledefault [
    i64 0, label %5
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

.unreachabledefault:                              ; preds = %1
  unreachable

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

define i64 @branchy(i64 %0) unnamed_addr #0 {
  %2 = urem i64 %0, 4
  switch i64 %2, label %5 [
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

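The problem is clear: LLVM does not seem to realize that it can trivially transform branchy into faster here. %2 is the result of a urem by 4, so it is always in the range 0 to 3, and cases 1, 2, and 3 are handled explicitly; the default in the switch is therefore only taken when %2 == 0.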
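The reasoning can be sketched in Rust: enumerate the possible remainders, drop the ones the switch handles explicitly, and see what is left for the default. The second function is a hand-written table lookup, roughly the shape of code one would hope to get for both variants. TABLE, table_lookup, and default_arm_remainders are hypothetical names used only for this illustration, not anything LLVM or rustc emits.

```rust
/// Results for remainders 0, 1, 2, 3 (hypothetical hand-written table).
const TABLE: [u64; 4] = [0, 1, 1, 2];

/// Table-based version of the mapping. The index is always in 0..4
/// because it is a remainder modulo 4.
pub fn table_lookup(input: u64) -> u64 {
    TABLE[(input % 4) as usize]
}

/// Remainders of `input % 4` that are not covered by the explicit
/// cases 1, 2, 3 and therefore fall through to the default.
pub fn default_arm_remainders() -> Vec<u64> {
    (0..4u64).filter(|&r| !matches!(r, 1 | 2 | 3)).collect()
}
```

default_arm_remainders() yields only 0, which is why the wildcard arm in branchy could be treated as an ordinary 0 case with an unreachable default.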

I suspect this is more of an LLVM bug than a Rust bug, but it feels fixable by a MIR peephole optimization? Unclear. The _ => 0 code I wrote is an attractive nuisance that I imagine other people will write, too, so perhaps there is value in seeing whether this optimization can be made before LLVM.

This bug is also present in Clang, in case someone wants to file an LLVM bug: https://godbolt.org/z/x7rec97E7. It's unclear to me if this is the sort of optimization Clang would do in the frontend instead of in LLVM; could go either way here, tbh.

Activity

added needs-triage on Nov 26, 2023
added I-slow, T-compiler, A-LLVM, A-codegen, A-mir-opt and removed needs-triage on Nov 26, 2023

dianqk (Member) commented on Nov 26, 2023

Upstream issue: llvm/llvm-project#73446.

@rustbot claim

dianqk (Member) commented on Jan 3, 2024

@rustbot label llvm-fixed-upstream

added llvm-fixed-upstream on Jan 3, 2024
added E-needs-test and removed llvm-fixed-upstream on Feb 14, 2024

nikic (Contributor) commented on Feb 14, 2024

It looks like this is fixed since 1.75, but I don't know what fixed it: https://godbolt.org/z/eGnWbxbG4

10 remaining items

dianqk (Member) commented on May 25, 2024

> Yep, I thought you meant that you implemented the optimization, is it not implemented? 😅

Yes, I have implemented it, but due to the compilation time issue mentioned in llvm/llvm-project#78578, I had to revert the commit. Now I have relanded it: llvm/llvm-project#73446 (comment).

@rustbot label +llvm-fixed-upstream

added llvm-fixed-upstream on May 25, 2024

nikic (Contributor) commented on Aug 1, 2024

Confirmed fixed by #127513, needs codegen test.

added E-needs-test and removed llvm-fixed-upstream on Aug 1, 2024
added 5 commits that reference this issue on Aug 4, 2024
df62a42
193a169
47fa085
44530a0
76c93d1
added a commit that references this issue on Aug 7, 2024
1987f15
added 2 commits that reference this issue on Aug 9, 2024
c80d992
69b380d