Identity Result match mapping should optimize to an identity function #38349

Opened by @bluss (Member)

This was isolated by @stjepang in #37939

This program: (playground)

pub fn id_result(a: Result<u64, i64>) -> Result<u64, i64> {
    match a {
        Ok(x) => Ok(x),
        Err(y) => Err(y),
    }
}

This should optimize to something that just copies the input to the output, with no conditionals.
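As a minimal sketch of why a plain copy is the expected result: the match rebuilds each variant from its own payload, so it is semantically the same as returning the argument unchanged. The names `id_plain` and `agree_on_samples` are hypothetical helpers added for illustration; they are not part of the issue.

```rust
// The function from the issue: rebuilds each variant from its payload.
pub fn id_result(a: Result<u64, i64>) -> Result<u64, i64> {
    match a {
        Ok(x) => Ok(x),
        Err(y) => Err(y),
    }
}

// Hypothetical reference version: returns the input untouched.
pub fn id_plain(a: Result<u64, i64>) -> Result<u64, i64> {
    a
}

// Spot-check that both functions agree on a few inputs of each variant.
pub fn agree_on_samples() -> bool {
    let samples: [Result<u64, i64>; 4] = [Ok(0), Ok(u64::MAX), Err(0), Err(i64::MIN)];
    samples.iter().all(|&a| id_result(a) == id_plain(a))
}
```

Calling `agree_on_samples()` from a test or from `main` only checks behavior; whether the two functions compile to the same code is what the issue is about.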

Output on rustc 1.15.0-nightly (daf8c1dfc 2016-12-05) using -Copt-level=3 --emit=llvm-ir is this:

; ModuleID = 'id_result.cgu-0.rs'
source_filename = "id_result.cgu-0.rs"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%"core::result::Result<u64, i64>" = type { i64, [0 x i64], [1 x i64] }

; Function Attrs: norecurse nounwind uwtable
define void @id_result(%"core::result::Result<u64, i64>"* noalias nocapture sret dereferenceable(16), %"core::result::Result<u64, i64>"* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 {
entry-block:
  %a.sroa.0.0..sroa_idx = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %1, i64 0, i32 0
  %a.sroa.0.0.copyload = load i64, i64* %a.sroa.0.0..sroa_idx, align 8
  %a.sroa.4.0..sroa_idx2 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %1, i64 0, i32 2, i64 0
  %a.sroa.4.0.copyload = load i64, i64* %a.sroa.4.0..sroa_idx2, align 8
  %2 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %0, i64 0, i32 0
  %not.switch = icmp ne i64 %a.sroa.0.0.copyload, 0
  %. = zext i1 %not.switch to i64
  store i64 %., i64* %2, align 8
  %3 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %0, i64 0, i32 2, i64 0
  store i64 %a.sroa.4.0.copyload, i64* %3, align 8
  ret void
}

attributes #0 = { norecurse nounwind uwtable }
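To make the leftover work in this IR easier to see, here is a rough Rust model of it. It is a sketch under assumptions: `ResultRepr`, `id_as_emitted` and `id_as_desired` are hypothetical names, and the encoding of the discriminant (0 for `Ok`, 1 for `Err`) is assumed rather than taken from the issue. The payload is copied as-is, but the discriminant is recomputed through a compare and zero-extend instead of being copied.

```rust
// Hypothetical flat model of the 16-byte layout in the IR above:
// an i64 discriminant followed by one i64 of payload.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ResultRepr {
    pub tag: u64,     // assumed encoding: 0 = Ok, 1 = Err
    pub payload: u64, // raw bits of the u64 or i64 payload
}

// Roughly what the emitted IR does: the tag goes through
// `icmp ne ..., 0` followed by `zext`, the payload is copied.
pub fn id_as_emitted(a: ResultRepr) -> ResultRepr {
    ResultRepr {
        tag: (a.tag != 0) as u64,
        payload: a.payload,
    }
}

// What the issue asks for: a straight copy of both words.
pub fn id_as_desired(a: ResultRepr) -> ResultRepr {
    a
}
```

For the two tag values a `Result` can actually hold, both functions return the same thing, which is why the comparison is redundant.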

Activity

Labels added on Dec 13, 2016:
A-codegen (Area: Code generation)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
ghost commented on Dec 15, 2016

Just leaving a note: This fix in LLVM will probably resolve the issue.

bluss (Member, Author) commented on Dec 16, 2016

Copy propagation in MIR for function arguments (added in #38332) can cause this to optimize correctly, at least for Result<u64, i64> (but not for Result<u32, i32>). That optimization is never used by default, though.

The diff in unoptimized IR just points to the range metadata ending up somewhere where it's not removed.

Mark-Simulacrum (Member) commented on May 18, 2017

According to the issue @stjepang noted, it looks like the fix may have been merged into LLVM trunk on Mar 22, 2017, in https://reviews.llvm.org/rL298540. Could we possibly backport it? I'm not sure, but this could be #40914.

bluss (Member, Author) commented on Aug 4, 2017

From what I can see, this is still an issue in rustc 1.21.0-nightly (b75d1f0 2017-08-02).

sinkuu (Contributor) commented on Nov 12, 2017

Fixed by #45380?

Diff of unoptimized IRs (1.21 stable -> nightly):

 ; test::id_result
 ; Function Attrs: uwtable
-define void @_ZN4test9id_result17h0e83cb02e8136307E(%"core::result::Result<u64, i64>"* noalias nocapture sret dereferenceable(16), %"core::result::Result<u64, i64>"* noalias nocapture dereferenceable(16)) unnamed_addr #0 {
+define void @_ZN4test9id_result17ha0e8372a885d01bfE(%"core::result::Result<u64, i64>"* noalias nocapture sret dereferenceable(16), %"core::result::Result<u64, i64>"* noalias nocapture dereferenceable(16) %a) unnamed_addr #0 {
 start:
-  %a = alloca %"core::result::Result<u64, i64>"
-  %2 = bitcast %"core::result::Result<u64, i64>"* %1 to i8*
-  %3 = bitcast %"core::result::Result<u64, i64>"* %a to i8*
-  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 16, i32 8, i1 false)
-  %4 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %a, i32 0, i32 0
-  %5 = load i64, i64* %4, !range !0
-  switch i64 %5, label %bb2 [
+  %1 = getelementptr inbounds %"core::result::Result<u64, i64>", %"core::result::Result<u64, i64>"* %a, i32 0, i32 0
+  %2 = load i64, i64* %1, !range !0
+  switch i64 %2, label %bb2 [
     i64 0, label %bb1
   ]
(snip)

Optimized IR on nightly (1.23.0-nightly (45caff8 2017-11-11)):

; test::id_result
; Function Attrs: norecurse nounwind uwtable
define void @_ZN4test9id_result17ha0e8372a885d01bfE(%"core::result::Result<u64, i64>"* noalias nocapture sret dereferenceable(16), %"core::result::Result<u64, i64>"* noalias nocapture readonly dereferenceable(16) %a) unnamed_addr #0 {
start:
  %1 = bitcast %"core::result::Result<u64, i64>"* %a to <2 x i64>*
  %2 = load <2 x i64>, <2 x i64>* %1, align 8
  %3 = bitcast %"core::result::Result<u64, i64>"* %0 to <2 x i64>*
  store <2 x i64> %2, <2 x i64>* %3, align 8
  ret void
}

bluss (Member, Author) commented on Nov 12, 2017

Yep, that's fixed in the original code this bug was filed for. The issue persists if we tweak it just a little (32-bit instead of 64-bit payload):

#![crate_type="lib"]

type T = i32;

#[no_mangle]
pub fn id_result(a: Result<T, T>) -> Result<T, T> {
    match a {
        Ok(x) => Ok(x),
        Err(y) => Err(y),
    }
}
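
One directly observable difference between the two cases is the size of the type, which can be checked with `std::mem::size_of`. This is only a sketch: `result_sizes` is a hypothetical helper, Rust does not guarantee enum layout, and the thread does not establish that size is what blocks the optimization for the 32-bit payload.

```rust
use std::mem::size_of;

// Returns (size in bytes of the 64-bit-payload Result,
//          size in bytes of the 32-bit-payload Result).
// The exact values depend on the compiler and target.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u64, i64>>(),
        size_of::<Result<i32, i32>>(),
    )
}
```

The first value can be compared against the `dereferenceable(16)` attribute in the IR above for the x86_64 target used there.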

eddyb (Member) commented on Mar 28, 2018

@Mark-Simulacrum Is that patch only for nonnull, not range metadata? It looks like this issue is still a problem: the IR starts with !range metadata, but there's still an icmp left around.
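
The reason range metadata matters here can be sketched in a few lines. Assuming the `!range` metadata on the discriminant load restricts it to the half-open interval [0, 2) (the contents of `!0` are not shown in the thread, so this is an assumption), the leftover `icmp` plus `zext` is a no-op on every permitted value and could in principle be folded away. `normalize_tag` and `is_noop_on_range` are hypothetical names.

```rust
// The normalization performed by the leftover `icmp ne ..., 0` + `zext`.
pub fn normalize_tag(tag: u64) -> u64 {
    (tag != 0) as u64
}

// True if the normalization leaves every tag in [lo, hi) unchanged.
pub fn is_noop_on_range(lo: u64, hi: u64) -> bool {
    (lo..hi).all(|t| normalize_tag(t) == t)
}
```

With the assumed range, `is_noop_on_range(0, 2)` holds; without any range information a tag such as 2 would be rewritten to 1, so the comparison cannot be dropped unless the range fact reaches the optimizer.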

Mark-Simulacrum (Member) commented on Mar 28, 2018

I don't know.

Labels:
A-codegen (Area: Code generation)
C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)

Participants: @eddyb, @bluss, @Mark-Simulacrum, @sinkuu, @bjorn3

Issue #38349 · rust-lang/rust