Skip to content

Use explicit lifetimes for LLVM allocas #12004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

dotdash
Copy link
Contributor

@dotdash dotdash commented Feb 2, 2014

Currently, stack allocated variables live for the whole duration of the
function they're allocated in. This means that LLVM can't put two
allocas into the same stack space, even if they're never used at the
same time.

By using the LLVM intrinsics to declare the lifetime of allocas, we
allow LLVM to use the same stack space for allocas that have
non-overlapping lifetimes, reducing stack usage. In certain situations
this even allows LLVM to remove duplicated code that previously only
differed in which part of the stack it used.

Currently, stack allocated variables live for the whole duration of the
function they're allocated in. This means that LLVM can't put two
allocas into the same stack space, even if they're never used at the
same time.

By using the LLVM intrinsics to declare the lifetime of allocas, we
allow LLVM to use the same stack space for allocas that have
non-overlapping lifetimes, reducing stack usage. In certain situations
this even allows LLVM to remove duplicated code that previously only
differed in which part of the stack it used.
@dotdash
Copy link
Contributor Author

dotdash commented Feb 2, 2014

This is more of an RFC at this time. I'm not exactly happy with the concrete implementation, because it introduces lots of extra cleanups and LLVM IR, increasing the memory usage when compiling librustc by about 15% and the compile time by maybe 3% as well. Both of whic seem excessive.

I guess we could drop the lifetime calls when we're in the toplevel scope of a function, but I couldn't figure out how to do that in a clean way.

That said, given this code (which is extremly favorable to this change):

pub fn foo (x: bool) {
    match x {
        true => println!("Foo{}", 5),
        false => println!("Foo{}", 6),
    };
}

This is how the resulting assembly differs:

--- before.s    2014-02-02 23:33:09.706317828 +0100
+++ after.s 2014-02-02 23:33:14.650273534 +0100
@@ -1,51 +1,40 @@
-   .globl  _ZN3foo19h81a28b9b9b1b3681ad4v0.0E
+   .globl  _ZN3foo19h81ff6a854ff46f5cad4v0.0E
    .align  16, 0x90
-   .type   _ZN3foo19h81a28b9b9b1b3681ad4v0.0E,@function
-_ZN3foo19h81a28b9b9b1b3681ad4v0.0E:
+   .type   _ZN3foo19h81ff6a854ff46f5cad4v0.0E,@function
+_ZN3foo19h81ff6a854ff46f5cad4v0.0E:
    .cfi_startproc
    cmpq    %fs:112, %rsp
    ja  .LBB0_2
-   movabsq $120, %r10
+   movabsq $56, %r10
    movabsq $0, %r11
    callq   __morestack
    retq
 .LBB0_2:
-   subq    $120, %rsp
+   subq    $56, %rsp
 .Ltmp1:
-   .cfi_def_cfa_offset 128
+   .cfi_def_cfa_offset 64
    movzbl  %dil, %eax
    cmpl    $1, %eax
    jne .LBB0_4
-   movq    $5, 112(%rsp)
-   leaq    _ZN3fmt11secret_show19h1cfae34b462b5821a24v0.0E(%rip), %rax
-   movq    %rax, 96(%rsp)
-   leaq    112(%rsp), %rax
-   movq    %rax, 104(%rsp)
-   leaq    _ZN3foo15__STATIC_FMTSTR19h1e2ce5d40ae038c7ah4v0.0E(%rip), %rax
-   movq    %rax, 64(%rsp)
-   movq    $2, 72(%rsp)
-   leaq    96(%rsp), %rax
-   movq    %rax, 80(%rsp)
-   movq    $1, 88(%rsp)
-   leaq    64(%rsp), %rdi
+   movq    $5, 48(%rsp)
    jmp .LBB0_5
 .LBB0_4:
-   movq    $6, 56(%rsp)
-   leaq    _ZN3fmt11secret_show19h1cfae34b462b5821a24v0.0E(%rip), %rax
-   movq    %rax, 40(%rsp)
-   leaq    56(%rsp), %rax
-   movq    %rax, 48(%rsp)
-   leaq    _ZN3foo15__STATIC_FMTSTR19h1e2ce5d40ae038c7ah4v0.0E(%rip), %rax
-   movq    %rax, 8(%rsp)
-   movq    $2, 16(%rsp)
-   leaq    40(%rsp), %rax
-   movq    %rax, 24(%rsp)
-   movq    $1, 32(%rsp)
-   leaq    8(%rsp), %rdi
+   movq    $6, 48(%rsp)
 .LBB0_5:
+   leaq    _ZN3fmt11secret_show19h49d00ba23b26f207a24v0.0E(%rip), %rax
+   movq    %rax, 32(%rsp)
+   leaq    48(%rsp), %rax
+   movq    %rax, 40(%rsp)
+   leaq    _ZN3foo15__STATIC_FMTSTR19h96bb7cc571f17624ah4v0.0E(%rip), %rax
+   movq    %rax, (%rsp)
+   movq    $2, 8(%rsp)
+   leaq    32(%rsp), %rax
+   movq    %rax, 16(%rsp)
+   movq    $1, 24(%rsp)
+   leaq    (%rsp), %rdi
    callq   _ZN2io5stdio12println_args19h7932d545acb66ab6ak9v0.10.preE@PLT
-   addq    $120, %rsp
+   addq    $56, %rsp
    retq
 .Ltmp2:
-   .size   _ZN3foo19h81a28b9b9b1b3681ad4v0.0E, .Ltmp2-_ZN3foo19h81a28b9b9b1b3681ad4v0.0E
+   .size   _ZN3foo19h81ff6a854ff46f5cad4v0.0E, .Ltmp2-_ZN3foo19h81ff6a854ff46f5cad4v0.0E
    .cfi_endproc

@brson
Copy link
Contributor

brson commented Feb 3, 2014

Neat results!

Since this has a non-trivial expense I wonder if we might put it behind an experimental optimizations flag.

@@ -80,6 +80,8 @@ pub fn get_simple_intrinsic(ccx: @CrateContext, item: &ast::ForeignItem) -> Opti
"bswap16" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i16"))),
"bswap32" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i32"))),
"bswap64" => Some(ccx.intrinsics.get_copy(&("llvm.bswap.i64"))),
"lifetime_start" => Some(ccx.intrinsics.get_copy(&("llvm.lifetime.start"))),
"lifetime_end" => Some(ccx.intrinsics.get_copy(&("llvm.lifetime.end"))),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you'd need to modify this location, these are the intrinsics we expose under extern "rust-intrinsic" { ... }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I initially thought about exposing the intrinsics, then decided against it and forgot to clean up. Thanks!

@emberian
Copy link
Member

emberian commented Feb 3, 2014

I really like the result, though I find the side effects on compiler performance unfortunate.

@emberian
Copy link
Member

emberian commented Feb 3, 2014

Maybe we can do it at --opt-level=3?

@dotdash
Copy link
Contributor Author

dotdash commented Feb 3, 2014

This needs a lot more work. Array allocas are currently wrong (size only accounts for a single argument), and (for example) allocas for function arguments (which probably account for a lot of stack usage) don't get a proper marker for the end of their lifetime, yet. I don't expect to get enough time during the week due to $DAYJOB, but I'll revisit this during the weekend.

@alexcrichton
Copy link
Member

Closing due to inactivity, but I'd love to see this as an optimization pass!

@dotdash dotdash deleted the lifetimes branch February 4, 2015 12:45
bors added a commit to rust-lang/rust-clippy that referenced this pull request Dec 26, 2023
New lints `iter_filter_is_some` and `iter_filter_is_ok`

Adds a pair of lints that check for cases of an iterator over `Result` and `Option` followed by `filter` without being followed by `map` as that is covered already by a different, specialized lint.

Fixes #11843

PS, I also made some minor documentations fixes in a case where a double tick (`) was included.

---

* changelog: New Lint: [`iter_filter_is_some`]
[#12004](rust-lang/rust#12004)
* changelog: New Lint: [`iter_filter_is_ok`]
[#12004](rust-lang/rust#12004)
bors added a commit to rust-lang/rust-clippy that referenced this pull request Dec 26, 2023
New lints `iter_filter_is_some` and `iter_filter_is_ok`

Adds a pair of lints that check for cases of an iterator over `Result` and `Option` followed by `filter` without being followed by `map` as that is covered already by a different, specialized lint.

Fixes #11843

PS, I also made some minor documentations fixes in a case where a double tick (`) was included.

---

changelog: New Lint: [`iter_filter_is_some`]
[#12004](rust-lang/rust#12004)
changelog: New Lint: [`iter_filter_is_ok`]
[#12004](rust-lang/rust#12004)
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jan 11, 2024
Issue: in rust-lang#12004, we emit a lint for `filter(Option::is_some)`. If the
parent expression is a `.map` we don't emit that lint as there exists a
more specialized lint for that.

The ICE introduced in rust-lang#12004 is a consequence of the assumption that a
parent expression after a filter would be a method call with the filter
call being the receiver. However, it is entirely possible to have a
closure of the form

```
|| { vec![Some(1), None].into_iter().filter(Option::is_some) }
```
The previous implementation looked at the parent expression; namely the
closure, and tried to check the parameters by indexing [0] on an empty
list.

This commit is an overhaul of the lint with significantly more FP tests
and checks.

Impl details:

1. We verify that the filter method we are in is a proper trait method
   to avoid FPs.
2. We check that the parent expression is not a map by checking whether
   it exists; if is a trait method; and then a method call.
3. We check that we don't have comments in the span.
4. We verify that we are in an Iterator of Option and Result.
5. We check the contents of the filter.
  1. For closures we peel it. If it is not a single expression, we don't
     lint.
  2. For paths, we do a typecheck to avoid FPs for types that impl
     functions with the same names.
  3. For calls, we verify the type, via the path, and that the param of
     the closure is the single argument to the call.
  4. For method calls we verify that the receiver is the parameter of
     the closure. Since we handle single, non-block exprs, the
     parameter can't be shadowed, so no FP.

This commit also adds additional FP tests.
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jan 11, 2024
Fixed ICE introduced in rust-lang#12004

Issue: in rust-lang/rust-clippy#12004, we emit a lint for `filter(Option::is_some)`. If the
parent expression is a `.map` we don't emit that lint as there exists a
more specialized lint for that.

The ICE introduced in rust-lang/rust-clippy#12004 is a consequence of the assumption that a
parent expression after a filter would be a method call with the filter
call being the receiver. However, it is entirely possible to have a
closure of the form

```
|| { vec![Some(1), None].into_iter().filter(Option::is_some) }
```
The previous implementation looked at the parent expression; namely the
closure, and tried to check the parameters by indexing [0] on an empty
list.

This commit is an overhaul of the lint with significantly more FP tests
and checks.

Impl details:

1. We verify that the filter method we are in is a proper trait method
   to avoid FPs.
2. We check that the parent expression is not a map by checking whether
   it exists; if is a trait method; and then a method call.
3. We check that we don't have comments in the span.
4. We verify that we are in an Iterator of Option and Result.
5. We check the contents of the filter.
   1. For closures we peel it. If it is not a single expression, we don't
     lint. We then try again by checking the peeled expression.
   2. For paths, we do a typecheck to avoid FPs for types that impl
     functions with the same names.
   3. For calls, we verify the type, via the path, and that the param of
     the closure is the single argument to the call.
   4. For method calls we verify that the receiver is the parameter of
     the closure. Since we handle single, non-block exprs, the
     parameter can't be shadowed, so no FP.

This commit also adds additional FP tests.

Fixes: rust-lang#12058

Adding `@xFrednet` as you've the most context for this as you reviewed it last time.

`@rustbot` r? `@xFrednet`

---

changelog: none
(Will be backported and therefore don't effect stable)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants