Open
Labels
A-codegen (Area: Code generation)
C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
Description
This was isolated by @stjepang in #37939
This program: (playground)
pub fn id_result(a: Result<u64, i64>) -> Result<u64, i64> {
    match a {
        Ok(x) => Ok(x),
        Err(y) => Err(y),
    }
}
Should optimize to something that just copies the input to the output, with no conditionals.
Output on rustc 1.15.0-nightly (daf8c1dfc 2016-12-05)
using -Copt-level=3 --emit=llvm-ir
is this:
; ModuleID = 'id_result.cgu-0.rs'
source_filename = "id_result.cgu-0.rs"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%"core::result::Result<u64, i64>" = type { i64, [0 x i64], [1 x i64] }
; Function Attrs: norecurse nounwind uwtable
define void @id_result(%"core::result::Result<u64, i64>"* noalias nocapture sret dereferenceable(16), %"core::result::Result<u64, i64>"* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 {
entry-block:
%a.sroa.0.0..sroa_idx = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %1, i64 0, i32 0
%a.sroa.0.0.copyload = load i64, i64* %a.sroa.0.0..sroa_idx, align 8
%a.sroa.4.0..sroa_idx2 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %1, i64 0, i32 2, i64 0
%a.sroa.4.0.copyload = load i64, i64* %a.sroa.4.0..sroa_idx2, align 8
%2 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %0, i64 0, i32 0
%not.switch = icmp ne i64 %a.sroa.0.0.copyload, 0
%. = zext i1 %not.switch to i64
store i64 %., i64* %2, align 8
%3 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %0, i64 0, i32 2, i64 0
store i64 %a.sroa.4.0.copyload, i64* %3, align 8
ret void
}
attributes #0 = { norecurse nounwind uwtable }
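To make the gap concrete, here is a rough Rust model of what the IR above does compared with the plain copy the issue asks for. It is only a sketch: `RawResult`, `copy_as_emitted`, and `copy_ideal` are hypothetical names, and the flat two-word layout with tag 0 for `Ok` and 1 for `Err` is an assumption read off the `{ i64, [0 x i64], [1 x i64] }` type, not something rustc guarantees.

```rust
/// Hypothetical flat model of the `{ i64, [0 x i64], [1 x i64] }` layout.
/// Assumed encoding: tag 0 = Ok, tag 1 = Err.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RawResult {
    pub tag: u64,     // discriminant word
    pub payload: u64, // raw bits of the u64 or i64 payload
}

/// Mirrors the emitted IR: the payload is copied as-is, but the tag is
/// recomputed as `zext(icmp ne tag, 0)` instead of being copied.
pub fn copy_as_emitted(a: RawResult) -> RawResult {
    RawResult {
        tag: (a.tag != 0) as u64,
        payload: a.payload,
    }
}

/// What the issue asks for: copy both words, no comparison.
pub fn copy_ideal(a: RawResult) -> RawResult {
    a
}
```

For the only two tags a valid `Result` can carry under this model (0 and 1), both functions return the same value, which is why the comparison is redundant. They differ only for tags outside that range, which is the information the `!range` metadata discussed later in the thread is meant to convey to LLVM.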
Activity
ghost commented on Dec 15, 2016
Just leaving a note: This fix in LLVM will probably resolve the issue.
bluss commented on Dec 16, 2016
Copy propagation in MIR for function arguments (added in #38332) can cause this to optimize correctly, at least for Result<u64, i64> (and not for Result<u32, i32>). This optimization is never used by default though.
The diff in unoptimized IR just points to the range metadata ending up somewhere it's not removed.
Mark-Simulacrum commented on May 18, 2017
So according to the issue @stjepang notes, it looks like this may have been merged into LLVM trunk on Mar 22 2017, in https://reviews.llvm.org/rL298540. It's possible we could backport this? I'm not sure, but this could be #40914.
bluss commented on Aug 4, 2017
From what I can see, this is still an issue in rustc 1.21.0-nightly (b75d1f0 2017-08-02).
sinkuu commented on Nov 12, 2017
Fixed by #45380?
Diff of unoptimized IRs (1.21 stable -> nightly):
Optimized IR on nightly (1.23.0-nightly (45caff8 2017-11-11)):
bluss commented on Nov 12, 2017
Yep, that's fixed in the original code this bug was filed for. The issue persists if we tweak it just a little (32-bit instead of 64-bit payload):
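The snippet originally attached to this comment is not preserved here. As a stand-in, the following is a guess at the kind of variant being described, assuming it mirrors the original function with the Result<u32, i32> type mentioned earlier in the thread. The name `id_result32` is hypothetical.

```rust
// Hypothetical 32-bit variant of the original `id_result`.
// Assumption: only the payload types change (u32 / i32 instead of u64 / i64).
pub fn id_result32(a: Result<u32, i32>) -> Result<u32, i32> {
    match a {
        Ok(x) => Ok(x),
        Err(y) => Err(y),
    }
}
```

Compiling something like this with -Copt-level=3 --emit=llvm-ir, as in the issue description, would be one way to check whether the comparison is still emitted on a given toolchain.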
eddyb commented on Mar 28, 2018
@Mark-Simulacrum Is that patch only for nonnull, not range metadata? Looks like this issue is still a problem: it starts with !range metadata, but there's still icmp left around.
Mark-Simulacrum commented on Mar 28, 2018
I don't know.