Closed
Description
Example code:
#[repr(C)]
pub struct ThreeSlices<'a>(&'a [u32], &'a [u32], &'a [u32]);

#[inline(always)]
pub fn should_be_no_op(val: ThreeSlices) -> ThreeSlices {
    val
}

pub fn sum_slices_1(val: ThreeSlices) -> u32 {
    sum(&val)
}

pub fn sum_slices_2(val: ThreeSlices) -> u32 {
    let val = should_be_no_op(val);
    sum(&val)
}

#[inline(never)]
pub fn sum(val: &ThreeSlices) -> u32 {
    val.0.iter().sum::<u32>() + val.1.iter().sum::<u32>() + val.2.iter().sum::<u32>()
}
With rustc 1.67 stable, sum_slices_2 generates a number of moves (I suppose for the calling convention?) that I don't think need to be there, especially since should_be_no_op is inlined:
example::sum_slices_1:
        jmp     qword ptr [rip + example::sum@GOTPCREL]

example::sum_slices_2:
        sub     rsp, 56
        movups  xmm0, xmmword ptr [rdi]
        movups  xmm1, xmmword ptr [rdi + 16]
        movups  xmm2, xmmword ptr [rdi + 32]
        movaps  xmmword ptr [rsp + 32], xmm2
        movaps  xmmword ptr [rsp + 16], xmm1
        movaps  xmmword ptr [rsp], xmm0
        mov     rdi, rsp
        call    qword ptr [rip + example::sum@GOTPCREL]
        add     rsp, 56
        ret
See https://rust.godbolt.org/z/azs11edK8
While this example is pretty pointless, the same pattern comes up when you want to convert a tuple of slices into a struct of slices in order to assign names to the tuple members.
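For illustration, here is a minimal sketch of what that tuple-to-struct conversion might look like. The names NamedSlices, keys, values, weights, sum_named and sum_tuple are hypothetical and not from the original report. The conversion only renames the fields, so ideally it would compile to nothing after inlining, just like should_be_no_op above.

```rust
/// Hypothetical struct that gives names to the members of a tuple of slices.
#[repr(C)]
pub struct NamedSlices<'a> {
    pub keys: &'a [u32],
    pub values: &'a [u32],
    pub weights: &'a [u32],
}

impl<'a> From<(&'a [u32], &'a [u32], &'a [u32])> for NamedSlices<'a> {
    // Moves each slice reference into the field at the same position,
    // so no data needs to be rearranged in principle.
    #[inline(always)]
    fn from(t: (&'a [u32], &'a [u32], &'a [u32])) -> Self {
        NamedSlices {
            keys: t.0,
            values: t.1,
            weights: t.2,
        }
    }
}

// Kept out of line, mirroring `sum` in the example above.
#[inline(never)]
pub fn sum_named(val: &NamedSlices<'_>) -> u32 {
    val.keys.iter().sum::<u32>()
        + val.values.iter().sum::<u32>()
        + val.weights.iter().sum::<u32>()
}

// Converts the tuple into the named struct, then passes it by reference.
pub fn sum_tuple<'a>(t: (&'a [u32], &'a [u32], &'a [u32])) -> u32 {
    let named = NamedSlices::from(t);
    sum_named(&named)
}
```

Whether this particular sketch shows the same extra moves as sum_slices_2 is not something verified here; it is only meant to show the shape of code where the pattern arises.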
Metadata
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Category: This is a bug.
Call for participation: An issue has been fixed and does not reproduce, but no test has been added.
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.