Skip to content

black_box(f()) in scope 'b gets optimized away, unless fis ||:'a -> T and 'a > 'b #14367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
japaric opened this issue May 22, 2014 · 2 comments
Closed

Comments

@japaric
Copy link
Member

japaric commented May 22, 2014

Script to reproduce:

extern crate test;
extern crate time;

use test::black_box;
use time::precise_time_ns;

fn fib(n: uint) -> uint {
    match n {
        0 | 1 => 1,
        n => fib(n - 1) + fib(n - 2),
    }
}

fn bench(fun: || -> uint) -> f64 {
    // KILL_SWITCH
    //let fun = || fib(20);

    let iters = 1_000;
    let start = precise_time_ns();
    for _ in range(0, iters) {
        black_box(fun());
    }
    let end = precise_time_ns();

    (end - start) as f64 / iters as f64
}

fn main() {
    println!("{}", bench(|| fib(20)))
}
$ rustc -O bench.rs && ./bench  # any `--opt-level` can be used instead of `-O`
31190.276
$ rustc -O bench.rs && ./bench  # change `iters` to `10_000`, output doesn't change too much as expected
30723.2899
$ rustc -O bench.rs && ./bench  # toggle KILL_SWITCH
4.0624
$ rustc -O bench.rs && ./bench  # change `iters` back to `1_000`
45.602
$ rustc --version
rustc 0.11.0-pre (e7e5e9c 2014-05-22 02:51:21 -0700)

I found this issue because I tried to benchmark directly in main without the auxiliar bench function, by calling black_box(fib(20)) in a loop, but black_box got optimized away and I got results like shown above.

My hypothesis is that black_box only seems to only work with closures whose captured env outlive the scope where black_box is used. The above example supports this hypothesis.

This issue doesn't (seem to) affect the Bencher framework right now. ButIi/when we get unboxed closures, the example above would pass an unboxed closure that captures nothing (a unit-like struct?), would black_box still work in that case?

cc @huonw

@huonw
Copy link
Member

huonw commented May 22, 2014

I don't think this is a bug, at least, the only "bug" here is the non-KILL_SWITCH version not being optimised.

In particular, calling black_box(fib(20)) is the same as writing let n = fib(20); black_box(n), the fib call can be moved around freely, as long as the n value passed into black_box is the correct one. Looking at the IR, you can see this, the "correct" (i.e. without KILL_SWITCH) IR looks like (the match_else.i block is the body of the loop):

  %1 = call i64 @_ZN15precise_time_ns20h53df8191405a33f50pa11v0.11.0.preE()
  %2 = bitcast i64* %dummy.i.i to i8*
  br label %match_else.i

match_else.i:                                     ; preds = %match_else.i, %entry-block
  %3 = phi i64 [ 0, %entry-block ], [ %4, %match_else.i ]
  %4 = add i64 %3, 1
  %5 = call fastcc i64 @_ZN3fib20hebff6027c68c44a7iaa4v0.0E(i64 20)
  call void @llvm.lifetime.start(i64 8, i8* %2) #3
  store i64 %5, i64* %dummy.i.i, align 8
  call void asm "", "r,~{dirflag},~{fpsr},~{flags}"(i64* %dummy.i.i) #3
  call void @llvm.lifetime.end(i64 8, i8* %2) #3
  %exitcond.i = icmp eq i64 %4, 1000
  br i1 %exitcond.i, label %_ZN5bench20h269af04030408b87Jaa4v0.0E.exit, label %match_else.i

_ZN5bench20h269af04030408b87Jaa4v0.0E.exit:       ; preds = %match_else.i
  %6 = call i64 @_ZN15precise_time_ns20h53df8191405a33f50pa11v0.11.0.preE()

but the "bad" KILL_SWITCH version looks like

  %1 = call i64 @_ZN15precise_time_ns20h53df8191405a33f50pa11v0.11.0.preE()
  %2 = call fastcc i64 @_ZN3fib20hc787ae08bc88b138iaa4v0.0E(i64 20) #3
  %3 = bitcast i64* %dummy.i.i to i8*
  br label %match_else.i

match_else.i:                                     ; preds = %match_else.i, %entry-block
  %4 = phi i64 [ 0, %entry-block ], [ %5, %match_else.i ]
  %5 = add i64 %4, 1
  call void @llvm.lifetime.start(i64 8, i8* %3) #3
  store i64 %2, i64* %dummy.i.i, align 8
  call void asm "", "r,~{dirflag},~{fpsr},~{flags}"(i64* %dummy.i.i) #3
  call void @llvm.lifetime.end(i64 8, i8* %3) #3
  %exitcond.i = icmp eq i64 %5, 1000
  br i1 %exitcond.i, label %_ZN5bench20hd83cb69b9731384fJaa4v0.0E.exit, label %match_else.i

_ZN5bench20hd83cb69b9731384fJaa4v0.0E.exit:       ; preds = %match_else.i
  %6 = call i64 @_ZN15precise_time_ns20h53df8191405a33f50pa11v0.11.0.preE()

In particular, the black_box call is always there (the call void asm line), and the only real difference is the position of the fib call (call fastcc i64 @_ZN3fib20hc787ae08bc88b138iaa4v0.0E(i64 20) #3), since LLVM recognised that the fib function is "pure" (i.e. just arithmetic) and so doesn't need to be recalled each time. However, this doesn't explain why it's not optimising the "correct" case.

To get this to reliably do what you want, without us disabling optimisations (which is a nonstarter), you can pass the argument to fib through a black box, e.g.

|| {
    let mut n = 20;
    black_box(&mut n); // doesn't change it, but LLVM can't tell.
    fib(n)
}

This makes the IR slightly less nice, but it preserves the fib call in the loop.

@japaric
Copy link
Member Author

japaric commented May 23, 2014

@huonw Thanks a lot for the detailed explanation! This cleared my doubts about how black_box works, and also explains why f = precise_time_ns also "works" (precise_time_ns is an 'impure" ffi call).

I'm closing this because my hypothesis have been invalidated.

@japaric japaric closed this as completed May 23, 2014
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 7, 2023
…rent-types, r=HKalbasi

Fixup path fragments upon MBE transcription

Fixes rust-lang#14367

There are roughly two types of paths: paths in expression context, where a separator `::` between an identifier and its following generic argument list is mandatory, and paths in type context, where `::` can be omitted.

Unlike rustc, we need to transform the parsed fragments back into tokens during transcription. When the matched path fragment is a type-context path and is transcribed as an expression-context path, verbatim transcription would cause a syntax error.

This PR fixes up path fragments by inserting `::` to make sure they are syntactically correct in all contexts. Note that this works because expression-context paths are a strict superset of type-context paths.
bors pushed a commit to rust-lang-ci/rust that referenced this issue Apr 22, 2025
…ust-lang#14367)

There were two bugs here. Let's assume `T` is a singleton type
implementing `Default` and that `f()` takes a `T`:

- `f(<T>::default())` cannot be replaced by `f(<T)` as it was (incorrect
spans – this is tricky because the type relative path uses a base span
covering only `T`, not `<T>`) (third commit)
- The argument of `f(<_>::default())` is inferred correctly, but cannot
be replaced by `<_>` or `_`, as this cannot be used to infer an instance
of a singleton type (first commit).

The second commit offers better error messages by pointing at the whole
expression.

Fix rust-lang#12654

changelog: [`default_constructed_unit_struct`]: do not suggest incorrect
fix when using a type surrounded by brackets
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants