4000% performance regression with "-C target-cpu=x86-64-v3" and fat LTO #146497

@im-0

The problem:

bench_mat4_transform_point3
                        time:   [18.883 ns 18.986 ns 19.093 ns]
                        change: [+4284.5% +4303.0% +4320.2%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
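
For scale, the reported change can be turned back into an implied baseline. This is a rough sketch, assuming the change is computed as (new - old) / old from the point estimates above; the function name `implied_baseline_ns` is made up for illustration.

```rust
/// Estimate the previous per-iteration time from the new time and the
/// relative change, assuming change = (new - old) / old.
/// (Hypothetical helper, not part of criterion.)
fn implied_baseline_ns(new_ns: f64, change_percent: f64) -> f64 {
    new_ns / (1.0 + change_percent / 100.0)
}

fn main() {
    // Point estimates taken from the benchmark output above.
    let new_ns = 18.986;
    let change_percent = 4303.0;

    let old_ns = implied_baseline_ns(new_ns, change_percent);
    let slowdown = new_ns / old_ns;

    println!("implied baseline: {old_ns:.3} ns");
    println!("slowdown factor:  {slowdown:.1}x");
}
```

Under that assumption the previous run took roughly 0.43 ns per iteration, so the new build is about 44 times slower on this benchmark.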

I encountered this issue while working on benchmarks from the nalgebra crate after recompiling it with RUSTFLAGS="-C target-cpu=native".

Here is a smaller reproducer, with more details in its README: https://github.com/im-0/rust-fat-lto-perf-degradation (it still uses nalgebra and criterion as dependencies). This reproducer requires codegen-units = 32, but some benchmarks from nalgebra suffer even with just lto = "fat" and the default value of codegen-units (which is 16).
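
To give an idea of the kind of operation the benchmark name suggests, here is a dependency-free sketch. It is not the code from the reproducer and is not claimed to trigger the regression: the function `transform_point3`, the row-major layout, and the timing loop are all assumptions made for illustration, and the real benchmark goes through nalgebra and criterion.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical stand-in for the benchmarked operation: apply a row-major
/// 4x4 homogeneous matrix to a 3D point, treating the point as (x, y, z, 1).
fn transform_point3(m: &[[f32; 4]; 4], p: [f32; 3]) -> [f32; 3] {
    let v = [p[0], p[1], p[2], 1.0];
    let mut out = [0.0f32; 4];
    for (row, o) in m.iter().zip(out.iter_mut()) {
        // Dot product of one matrix row with the homogeneous point.
        *o = row.iter().zip(v.iter()).map(|(a, b)| a * b).sum();
    }
    // Divide by the homogeneous coordinate w (this sketch does not guard
    // against w == 0).
    [out[0] / out[3], out[1] / out[3], out[2] / out[3]]
}

fn main() {
    // A pure translation by (1, 2, 3), chosen only as simple test data.
    let m: [[f32; 4]; 4] = [
        [1.0, 0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 2.0],
        [0.0, 0.0, 1.0, 3.0],
        [0.0, 0.0, 0.0, 1.0],
    ];
    let p: [f32; 3] = [10.0, 20.0, 30.0];

    let iters = 1_000_000u32;
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the compiler from folding the call away.
        black_box(transform_point3(black_box(&m), black_box(p)));
    }
    let per_iter_ns = start.elapsed().as_secs_f64() * 1e9 / f64::from(iters);
    println!("{per_iter_ns:.3} ns per iteration");
}
```

A hand-rolled loop like this is much noisier than criterion, so it is only useful for spotting large differences between builds, such as one with and one without the flags discussed here.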

I expected to see this happen: little or no performance degradation

Instead, this happened: more than 4000% performance degradation

Meta

rustc --version --verbose:

rustc 1.89.0 (29483883e 2025-08-04) (Fedora 1.89.0-2.fc42)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: x86_64-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.8

and

rustc 1.89.0 (29483883e 2025-08-04)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: x86_64-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.7

    Labels

    A-LTO (Area: Link-time optimization (LTO))
    C-bug (Category: This is a bug.)
    I-slow (Issue: Problems and improvements with respect to performance of generated code.)
    needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.)
