Seemingly inefficient code generated to forward a parameter to a function

The generated code for passing arguments larger than a machine word looks inefficient.

Test case:

```
#[inline(never)]
pub fn bar(x: &str) { println!("{}", x) }
pub fn foo(x: &str) { bar(x); bar(x); }
```

On x86_64-unknown-linux-gnu, compiling with `rustc test.rs -O -C no-stack-check --crate-type dylib --emit asm`, I see this code for `foo`:

```
    .section    .text._ZN3foo20hb6f131ac36a30532PaaE,"ax",@progbits
    .globl  _ZN3foo20hb6f131ac36a30532PaaE
    .align  16, 0x90
    .type   _ZN3foo20hb6f131ac36a30532PaaE,@function
_ZN3foo20hb6f131ac36a30532PaaE:
    .cfi_startproc
    pushq   %rbx
.Ltmp4:
    .cfi_def_cfa_offset 16
    subq    $16, %rsp
.Ltmp5:
    .cfi_def_cfa_offset 32
.Ltmp6:
    .cfi_offset %rbx, -16
    movq    %rdi, %rbx
    movups  (%rbx), %xmm0
    movaps  %xmm0, (%rsp)
    leaq    (%rsp), %rdi
    callq   _ZN3bar20hf21270c370b3427feaaE@PLT
    movups  (%rbx), %xmm0
    movaps  %xmm0, (%rsp)
    leaq    (%rsp), %rdi
    callq   _ZN3bar20hf21270c370b3427feaaE@PLT
    addq    $16, %rsp
    popq    %rbx
    retq
.Ltmp7:
    .size   _ZN3foo20hb6f131ac36a30532PaaE, .Ltmp7-_ZN3foo20hb6f131ac36a30532PaaE
    .cfi_endproc
```

`foo` receives the address of the `&str` in `%rdi`.  Before each call, it copies the 16-byte `&str` value (pointer and length) into a fresh stack slot, then passes the address of that slot to `bar`.

Could `foo` forward the address of the `&str` along without making stack copies?

If I remove one of the `bar` calls from `foo`, the remaining call ought to become a tail call, but it doesn't.  Tail call optimization _does_ occur if I replace the `&str` types with `&&str`.
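For reference, here is a sketch of the two single-call variants described above. The `_ref` names are made up for this illustration; they are not from the original test case. Whether the tail call is emitted depends on the compiler version, so check the `--emit asm` output instead of taking it from this snippet.

```rust
// Variant A: the argument is a two-word `&str`, with a single call in
// tail position. With the compiler version reported in this issue, the
// call was not emitted as a tail call.
#[inline(never)]
pub fn bar(x: &str) { println!("{}", x) }
pub fn foo(x: &str) { bar(x) }

// Variant B (hypothetical names): same shape, but the argument is a
// pointer-sized `&&str`. This is the variant for which tail call
// optimization was observed.
#[inline(never)]
pub fn bar_ref(x: &&str) { println!("{}", x) }
pub fn foo_ref(x: &&str) { bar_ref(x) }
```

The only source-level difference is the argument size: `&str` is two machine words, while `&&str` is one.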

The calling convention for passing `&str` (and other arguments larger than a machine word?) seems to be:
1. Make a copy of the argument on the stack.
2. Pass the address of the copy in the conventional manner (in a register or on the stack).
3. The callee may modify the copy.
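The three steps above can be modeled in source form. This is only a sketch of what the assembly appears to do, not what rustc actually does internally. `foo_lowered` and `bar_lowered` are hypothetical names, and the functions return lengths only so the behavior can be checked.

```rust
// Hypothetical model of the callee: it receives the address of the
// caller's stack copy and is allowed to overwrite that copy (step 3).
#[inline(never)]
pub fn bar_lowered<'a>(slot: &mut &'a str) -> usize {
    let current: &'a str = *slot;
    *slot = &current[1..]; // clobber the caller-provided copy
    slot.len()
}

// Hypothetical model of the caller: `x` plays the role of the address
// that `foo` receives in `%rdi`.
pub fn foo_lowered(x: &&str) -> (usize, usize) {
    let mut slot = *x;                  // step 1: copy the argument to the stack
    let first = bar_lowered(&mut slot); // step 2: pass the copy's address
    let mut slot = *x;                  // fresh copy: the callee may have modified the old one
    let second = bar_lowered(&mut slot);
    (first, second)
}
```

In this model the second copy is needed only because the callee may write to the slot. If the callee could only read it, `foo_lowered` could pass `x` straight through to both calls.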

That is, we seem to be passing values both by value _and_ by reference.

With the current convention, I think we could get smaller code by eliding some of the copies.  If the copies were instead immutable, I think we could elide more copies.

Compiler version:

```
rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
binary: rustc
commit-hash: b47aebe3fc2da06c760fd8ea19f84cbc41d34831
commit-date: 2015-02-26
build-date: 2015-02-27
host: x86_64-unknown-linux-gnu
release: 1.0.0-nightly
```



Seemingly inefficient code generated to forward a parameter to a function #22891


Activity

dotdash commented on Feb 28, 2015

rprichard commented on Feb 28, 2015

dotdash commented on Feb 28, 2015

rprichard commented on Mar 1, 2015

dotdash commented on Mar 4, 2015

comex commented on Mar 12, 2015

dotdash commented on Mar 12, 2015

comex commented on Mar 12, 2015

dotdash commented on Mar 12, 2015

comex commented on Mar 13, 2015

dotdash commented on Mar 13, 2015

comex commented on Mar 13, 2015
