Description
Consider the following functions (https://godbolt.org/z/a8r3Tc7TE):
pub fn faster(input: u64) -> u64 {
    match input % 4 {
        0 => 0,
        1 | 2 => 1,
        3 => 2,
        _ => unreachable!(),
    }
}

pub fn branchy(input: u64) -> u64 {
    match input % 4 {
        1 | 2 => 1,
        3 => 2,
        _ => 0,
    }
}
These functions have identical behavior: they map input to input % 4 - (input % 4 / 2). In the former case, LLVM generates a nice lookup table for us, but in the latter, it emits an extra branch. The only difference is that I've used _ => ... to avoid needing to write an unreachable-by-optimization branch.
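The equivalence claim can be checked directly. The following is a minimal sketch, not part of the original report: closed_form and all_agree are hypothetical helper names introduced only for illustration, and the two match functions are repeated so the block is self-contained.

```rust
pub fn faster(input: u64) -> u64 {
    match input % 4 {
        0 => 0,
        1 | 2 => 1,
        3 => 2,
        _ => unreachable!(),
    }
}

pub fn branchy(input: u64) -> u64 {
    match input % 4 {
        1 | 2 => 1,
        3 => 2,
        _ => 0,
    }
}

/// Hypothetical helper: the closed form quoted in the text.
/// `r / 2 <= r` for unsigned `r`, so the subtraction cannot underflow.
pub fn closed_form(input: u64) -> u64 {
    let r = input % 4;
    r - r / 2
}

/// Hypothetical helper: true if all three definitions agree on `0..limit`.
pub fn all_agree(limit: u64) -> bool {
    (0..limit).all(|i| faster(i) == branchy(i) && branchy(i) == closed_form(i))
}
```

Since the result depends only on input % 4, checking any range that covers all four remainders (for example all_agree(4)) is enough to cover every case.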
If we look at the generated IR (after -Cpasses=strip,mem2reg,simplifycfg):
define i64 @faster(i64 %0) unnamed_addr #0 {
  %2 = urem i64 %0, 4
  switch i64 %2, label %.unreachabledefault [
    i64 0, label %5
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

.unreachabledefault:                              ; preds = %1
  unreachable

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

define i64 @branchy(i64 %0) unnamed_addr #0 {
  %2 = urem i64 %0, 4
  switch i64 %2, label %5 [
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}
The problem is clear: LLVM does not seem to realize that it can trivially transform branchy to faster here, by observing that the default in the switch is only taken when %2 == 0.
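To make that observation concrete, here is a small sketch under stated assumptions: takes_default_arm, TABLE, and table_lookup are hypothetical names that do not appear in the report, and the table is only a hand-written approximation of what a table-based lowering amounts to, not LLVM's actual output.

```rust
/// Hypothetical helper: true when the wildcard arm of a match on
/// `input % 4` with explicit arms for 1, 2 and 3 would be selected.
pub fn takes_default_arm(input: u64) -> bool {
    !matches!(input % 4, 1 | 2 | 3)
}

/// Hypothetical hand-written table, one entry per possible remainder.
const TABLE: [u64; 4] = [0, 1, 1, 2];

/// Hypothetical equivalent of the match, written as an explicit lookup.
pub fn table_lookup(input: u64) -> u64 {
    // `input % 4` is always in 0..4, so the index is always in bounds.
    TABLE[(input % 4) as usize]
}
```

Because the remainder of an unsigned division by 4 can only be 0, 1, 2, or 3, and three of those values have explicit arms, the wildcard arm is reachable only for remainder 0. That is the range fact the optimizer would need in order to treat the default as an ordinary case.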
I suspect this is more LLVM bug than Rust bug, but it feels fixable by a MIR peephole optimization? Unclear. The _ => 0 code I wrote is an attractive nuisance that I imagine other people writing, too, so perhaps there is value to seeing if this optimization can be made before LLVM.
This bug is also present in Clang, in case someone wants to file an LLVM bug: https://godbolt.org/z/x7rec97E7. It's unclear to me if this is the sort of optimization Clang would do in the frontend instead of in LLVM; could go either-or here, tbh.
Activity
dianqk commented on Nov 26, 2023
Upstream issue: llvm/llvm-project#73446.
@rustbot claim
dianqk commented on Jan 3, 2024
@rustbot label llvm-fixed-upstream
nikic commented on Feb 14, 2024
It looks like this is fixed since 1.75, but I don't know what fixed it: https://godbolt.org/z/eGnWbxbG4
dianqk commented on May 25, 2024
Yes, I have implemented it, but due to the compilation time issue mentioned in llvm/llvm-project#78578, I had to revert the commit. Now I have relanded it: llvm/llvm-project#73446 (comment).
@rustbot label +llvm-fixed-upstream
nikic commented on Aug 1, 2024
Confirmed fixed by #127513, needs codegen test.
Auto merge of rust-lang#128584 - DianQK:tests-for-llvm-19, r=nikic
Auto merge of rust-lang#128584 - DianQK:tests-for-llvm-19, r=<try>
Rollup merge of rust-lang#128584 - DianQK:tests-for-llvm-19, r=nikic