Use explicit lifetimes for LLVM allocas #12004
Conversation
Currently, stack allocated variables live for the whole duration of the function they're allocated in. This means that LLVM can't put two allocas into the same stack space, even if they're never used at the same time. By using the LLVM intrinsics to declare the lifetime of allocas, we allow LLVM to use the same stack space for allocas that have non-overlapping lifetimes, reducing stack usage. In certain situations this even allows LLVM to remove duplicated code that previously only differed in which part of the stack it used.
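To make the stack-reuse effect concrete, here is a minimal Rust sketch (a hypothetical example, not taken from this patch): the two arrays are never live at the same time, so once their allocas carry `llvm.lifetime.start`/`llvm.lifetime.end` markers, LLVM is free to back both with a single 4096-byte stack slot instead of reserving space for each.

```
// Hypothetical example: `a` and `b` have non-overlapping lifetimes, so with
// lifetime markers on their allocas LLVM may fold them into one stack slot.
pub fn pick(flag: bool) -> u8 {
    if flag {
        let a = [0u8; 4096]; // live only in this branch
        a[0]
    } else {
        let b = [1u8; 4096]; // live only in this branch
        b[4095]
    }
}
```

Without the markers, each array gets its own slot and the frame is roughly twice as large; the same effect shows up as the 120-byte vs. 56-byte frame in the assembly diff further down.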
This is more of an RFC at this time. I'm not exactly happy with the concrete implementation, because it introduces lots of extra cleanups and LLVM IR, increasing the memory usage when compiling librustc by about 15% and the compile time by maybe 3% as well, both of which seem excessive. I guess we could drop the lifetime calls when we're in the toplevel scope of a function, but I couldn't figure out how to do that in a clean way.

That said, given this code (which is extremely favorable to this change):

```
pub fn foo(x: bool) {
    match x {
        true => println!("Foo{}", 5),
        false => println!("Foo{}", 6),
    };
}
```

this is how the resulting assembly differs:

```
--- before.s 2014-02-02 23:33:09.706317828 +0100
+++ after.s 2014-02-02 23:33:14.650273534 +0100
@@ -1,51 +1,40 @@
- .globl _ZN3foo19h81a28b9b9b1b3681ad4v0.0E
+ .globl _ZN3foo19h81ff6a854ff46f5cad4v0.0E
.align 16, 0x90
- .type _ZN3foo19h81a28b9b9b1b3681ad4v0.0E,@function
-_ZN3foo19h81a28b9b9b1b3681ad4v0.0E:
+ .type _ZN3foo19h81ff6a854ff46f5cad4v0.0E,@function
+_ZN3foo19h81ff6a854ff46f5cad4v0.0E:
.cfi_startproc
cmpq %fs:112, %rsp
ja .LBB0_2
- movabsq $120, %r10
+ movabsq $56, %r10
movabsq $0, %r11
callq __morestack
retq
.LBB0_2:
- subq $120, %rsp
+ subq $56, %rsp
.Ltmp1:
- .cfi_def_cfa_offset 128
+ .cfi_def_cfa_offset 64
movzbl %dil, %eax
cmpl $1, %eax
jne .LBB0_4
- movq $5, 112(%rsp)
- leaq _ZN3fmt11secret_show19h1cfae34b462b5821a24v0.0E(%rip), %rax
- movq %rax, 96(%rsp)
- leaq 112(%rsp), %rax
- movq %rax, 104(%rsp)
- leaq _ZN3foo15__STATIC_FMTSTR19h1e2ce5d40ae038c7ah4v0.0E(%rip), %rax
- movq %rax, 64(%rsp)
- movq $2, 72(%rsp)
- leaq 96(%rsp), %rax
- movq %rax, 80(%rsp)
- movq $1, 88(%rsp)
- leaq 64(%rsp), %rdi
+ movq $5, 48(%rsp)
jmp .LBB0_5
.LBB0_4:
- movq $6, 56(%rsp)
- leaq _ZN3fmt11secret_show19h1cfae34b462b5821a24v0.0E(%rip), %rax
- movq %rax, 40(%rsp)
- leaq 56(%rsp), %rax
- movq %rax, 48(%rsp)
- leaq _ZN3foo15__STATIC_FMTSTR19h1e2ce5d40ae038c7ah4v0.0E(%rip), %rax
- movq %rax, 8(%rsp)
- movq $2, 16(%rsp)
- leaq 40(%rsp), %rax
- movq %rax, 24(%rsp)
- movq $1, 32(%rsp)
- leaq 8(%rsp), %rdi
+ movq $6, 48(%rsp)
.LBB0_5:
+ leaq _ZN3fmt11secret_show19h49d00ba23b26f207a24v0.0E(%rip), %rax
+ movq %rax, 32(%rsp)
+ leaq 48(%rsp), %rax
+ movq %rax, 40(%rsp)
+ leaq _ZN3foo15__STATIC_FMTSTR19h96bb7cc571f17624ah4v0.0E(%rip), %rax
+ movq %rax, (%rsp)
+ movq $2, 8(%rsp)
+ leaq 32(%rsp), %rax
+ movq %rax, 16(%rsp)
+ movq $1, 24(%rsp)
+ leaq (%rsp), %rdi
callq _ZN2io5stdio12println_args19h7932d545acb66ab6ak9v0.10.preE@PLT
- addq $120, %rsp
+ addq $56, %rsp
retq
.Ltmp2:
- .size _ZN3foo19h81a28b9b9b1b3681ad4v0.0E, .Ltmp2-_ZN3foo19h81a28b9b9b1b3681ad4v0.0E
+ .size _ZN3foo19h81ff6a854ff46f5cad4v0.0E, .Ltmp2-_ZN3foo19h81ff6a854ff46f5cad4v0.0E
	.cfi_endproc
```
Neat results! Since this has a non-trivial expense I wonder if we might put it behind an experimental optimizations flag.
```
@@ -80,6 +80,8 @@ pub fn get_simple_intrinsic(ccx: @CrateContext, item: &ast::ForeignItem) -> Opti
     "bswap16" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i16"))),
     "bswap32" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i32"))),
     "bswap64" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i64"))),
+    "lifetime_start" => Some(ccx.intrinsics.get_copy(&("llvm.lifetime.start"))),
+    "lifetime_end" => Some(ccx.intrinsics.get_copy(&("llvm.lifetime.end"))),
```
I don't think you'd need to modify this location; these are the intrinsics we expose under `extern "rust-intrinsic" { ... }`.
Ah, right. I initially thought about exposing the intrinsics, then decided against it and forgot to clean up. Thanks!
I really like the result, though I find the side effects on compiler performance unfortunate.
Maybe we can do it at --opt-level=3?
This needs a lot more work. Array allocas are currently wrong (size only accounts for a single argument), and (for example) allocas for function arguments (which probably account for a lot of stack usage) don't get a proper marker for the end of their lifetime, yet. I don't expect to get enough time during the week due to $DAYJOB, but I'll revisit this during the weekend.
Closing due to inactivity, but I'd love to see this as an optimization pass!
New lints `iter_filter_is_some` and `iter_filter_is_ok`

Adds a pair of lints that check for cases of an iterator over `Result` or `Option` followed by `filter` without being followed by `map`, as that case is already covered by a different, specialized lint.

Fixes #11843

PS: I also made some minor documentation fixes in a case where a double tick (`) was included.

---

* changelog: New Lint: [`iter_filter_is_some`] [#12004](rust-lang/rust#12004)
* changelog: New Lint: [`iter_filter_is_ok`] [#12004](rust-lang/rust#12004)
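For illustration, a hedged sketch of the intended scope (hypothetical snippets, not taken from the lint's test suite): the first `filter` has no trailing `map` and is the pattern the new lints target, while the second is left to the pre-existing, more specialized lint.

```
fn lint_scope() {
    let v = vec![Some(1), None, Some(3)];

    // Targeted by the new `iter_filter_is_some` lint: `filter(Option::is_some)`
    // with no `map` following it.
    let _kept: Vec<_> = v.clone().into_iter().filter(Option::is_some).collect();

    // Not targeted by the new lints: the trailing `map` means the existing,
    // more specialized lint already covers this pattern.
    let _values: Vec<i32> = v
        .into_iter()
        .filter(Option::is_some)
        .map(Option::unwrap)
        .collect();
}
```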
Fixed ICE introduced in rust-lang#12004

Issue: in rust-lang/rust-clippy#12004, we emit a lint for `filter(Option::is_some)`. If the parent expression is a `.map`, we don't emit that lint, as there exists a more specialized lint for that case.

The ICE introduced in rust-lang/rust-clippy#12004 is a consequence of the assumption that the parent expression after a `filter` would be a method call with the `filter` call being the receiver. However, it is entirely possible to have a closure of the form

```
|| { vec![Some(1), None].into_iter().filter(Option::is_some) }
```

The previous implementation looked at the parent expression, namely the closure, and tried to check its parameters by indexing `[0]` on an empty list.

This commit is an overhaul of the lint with significantly more FP tests and checks.

Impl details:

1. We verify that the `filter` method we are in is a proper trait method, to avoid FPs.
2. We check that the parent expression is not a `map` by checking whether it exists, is a trait method, and is a method call.
3. We check that we don't have comments in the span.
4. We verify that we are in an `Iterator` of `Option` or `Result`.
5. We check the contents of the `filter`:
   1. For closures, we peel the closure. If it is not a single expression, we don't lint. We then try again by checking the peeled expression.
   2. For paths, we do a typecheck to avoid FPs for types that impl functions with the same names.
   3. For calls, we verify the type via the path, and that the param of the closure is the single argument to the call.
   4. For method calls, we verify that the receiver is the parameter of the closure. Since we only handle single, non-block expressions, the parameter can't be shadowed, so there is no FP.

This commit also adds additional FP tests.

Fixes: rust-lang#12058

Adding `@xFrednet` as you have the most context for this, since you reviewed it last time.

`@rustbot` r? `@xFrednet`

---

changelog: none (will be backported and therefore doesn't affect stable)
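As a sketch of how the closure handling in step 5.1 would play out (hypothetical snippets, assuming the single-expression check applies to the closure body as described above):

```
fn closure_cases(v: Vec<Option<i32>>) {
    // Single-expression closure: peeling yields `o.is_some()`, whose receiver is
    // the closure parameter, so the lint can fire (steps 5.1 and 5.4).
    let _single: Vec<_> = v.clone().into_iter().filter(|o| o.is_some()).collect();

    // The closure body is a block containing a statement, not a single
    // expression, so per step 5.1 the lint stays silent.
    let _block: Vec<_> = v
        .into_iter()
        .filter(|o| {
            let keep = o.is_some();
            keep
        })
        .collect();
}
```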