Skip to content

release builds using rustc 1.86.0 on macOS Ventura (intel) SDK exhibit incorrect behaviour #140686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
spikegrobstein opened this issue May 5, 2025 · 34 comments · Fixed by llvm/llvm-project#142304
Labels
A-cross Area: Cross compilation A-linkers Area: linkers... you gotta love linkers C-external-bug Category: issue that is caused by bugs in software beyond our control llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@spikegrobstein
Copy link

spikegrobstein commented May 5, 2025

I’d like to report what appears to be a bug in the rust compiler which can be reproduced by using the reqwest crate, where it fails to include the http version string in the request and it only affects binaries that were:

  • built targeting x86_64-apple-darwin
  • using release mode
  • using a macOS Ventura (13.x) SDK

The bug itself has nothing to do with the reqwest crate (looking at the code, it appears that this behaviour should be impossible) and specifically related to the built binary.

this bug does exist in release binaries cross-compiled from linux to macOS when using the macOS 13 (Ventura) SDK with osxcross.

the bug does not exist when creating a debug build or when cross-compiling from an AppleSilicon machine → intel and does not exist when performing a release build on intel → intel when the OS is not Ventura.

We tracked this down to being triggered by opt-level = 2

this bug does not exist in rustc 1.85.0 and appears to only affect 1.86.0 (and also occurs in nightly rust).

I have a reduction:

add reqwest with --features=blocking and this main.rs:

use reqwest::{blocking::Client, Method};

fn main() {
    Client::new().request(Method::GET, "http://localhost:8888/hello").send().unwrap();
}

in another window, start a nc server with: nc -l localhost 8888

then cargo run and you should see:

GET /hello HTTP/1.1
accept: */*
host: localhost:8888

If you then start a new nc server and cargo run --release you will see:

GET /hello
accept: */*
host: localhost:8888

(note the missing HTTP/1.1 on the first line)

when running this code in a debugger, I can see that the slice that’s supposed to add the version number to the request buffer has a length of 0, so it contains no data:

Process 79729 stopped
* thread #2, name = 'reqwest-internal-sync-runtime', stop reason = step in
    frame #0: 0x00000001000b380f reqwest-test`_$LT$hyper..proto..h1..role..Client$u20$as$u20$hyper..proto..h1..Http1Transaction$GT$::encode::h33cc17a167a456d1 at spec_extend.rs:61:18 [opt]
   58       #[track_caller]
   59       fn spec_extend(&mut self, iterator: slice::Iter<'a, T>) {
   60           let slice = iterator.as_slice();
-> 61           unsafe { self.append_elements(slice) };
   62       }
   63   }
Target 0: (reqwest-test) stopped.
(lldb) p self
(alloc::vec::Vec<unsigned char, alloc::alloc::Global> *) $0 = 0x0000000100504b90
(lldb) p self[0]
(alloc::vec::Vec<unsigned char, alloc::alloc::Global>) $1 = size=11 {
  [0] = 'G'
  [1] = 'E'
  [2] = 'T'
  [3] = ' '
  [4] = '/'
  [5] = 'h'
  [6] = 'e'
  [7] = 'l'
  [8] = 'l'
  [9] = 'o'
  [10] = ' '
}
(lldb) p slice
(*const [u8]) $2 = {
  data_ptr = 0x0000000000000000
  length = 0
}

a note from someone on my team who continued to look into it more deeply:

“it looks like it's computing the correct string HTTP/1.1 but for some odd reason it's computing a length of 0. as best I can tell it uses one lookup table of string pointers to find the string, and it uses another lookup table of 4-byte offsets that get added to the lookup table's base address to produce a new pointer, which looks like it's supposed to be the end of the string. Unfortunately that second lookup table has 3 identical values in it, meaning it will produce the correct end pointer for HTTP/1.0 but it produces the start pointer for HTTP/1.1, and so it ends up calculating a length of 0”

This always happens no matter what version of reqwest is used, and it seems to be caused by the rustc version.

I hope this is enough information.

@spikegrobstein spikegrobstein added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels May 5, 2025
@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels May 5, 2025
@jieyouxu jieyouxu added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label May 5, 2025
@moxian
Copy link
Contributor

moxian commented May 5, 2025

Can you try running cargo bisect-rustc to narrow down what triggered the change? The command line would probably look like cargo bisect-rustc --start 1.85.0 --end 1.86.0 --prompt -- run

Edit: .. ah, you might need to replace the -- run with a more involved --script setup (I missed the fact that the issue requires cross-compilation)..

@jieyouxu jieyouxu added A-cross Area: Cross compilation O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) labels May 5, 2025
@jieyouxu
Copy link
Member

jieyouxu commented May 5, 2025

this bug does exist in release binaries cross-compiled from linux to macOS when using the macOS 13 (Ventura) SDK with osxcross.

We may have to rely on your bisection to pinpoint a commit (range), because that setup is quite complicated...

@jieyouxu
Copy link
Member

jieyouxu commented May 5, 2025

To minimize the variables:

  • Another thing to try: does this also reproduce in the latest beta (I'd assume so)?
  • Can you provide the exact reqwest version?
    • And transitively, which http version?

@saethlin saethlin added I-miscompile Issue: Correct Rust code lowers to incorrect machine code and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels May 5, 2025
@spikegrobstein
Copy link
Author

spikegrobstein commented May 6, 2025

@moxian Sorry, I should have been more clear; this isn't only on cross compile. it happens if the macOS 13 SDK is used with the x86_64-apple-darwin target. both native and cross compiled.

I ran the bisect script and came up with this being offending commit: d88ffcd

output:

searched nightlies: from nightly-2025-01-03 to nightly-2025-02-15
regressed nightly: nightly-2025-02-15
searched commit range: a567209...d8810e3
regressed commit: d88ffcd

bisected with cargo-bisect-rustc v0.6.9

Host triple: x86_64-apple-darwin
Reproduce with:

cargo bisect-rustc --start 1.85.0 --end 1.86.0 --prompt -- run --release

@jieyouxu

confirmed that this bug occurs in the beta, as well.

I ran this with the following versions of reqwest and it occurred on each: 0.12.3, 0.12.5, 0.12.15

http crate version 1.3.1 is the one in my reduction project.

@jieyouxu jieyouxu added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 6, 2025
@jieyouxu
Copy link
Member

jieyouxu commented May 6, 2025

cc @scottmcm

@jieyouxu jieyouxu added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example and removed A-cross Area: Cross compilation E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc labels May 6, 2025
@moxian

This comment has been minimized.

@jieyouxu

This comment has been minimized.

@spikegrobstein
Copy link
Author

@moxian the tests in hyper all pass with the --release flag.

I ran tests in the reqwest crate for completeness. I get 2 failures in release mode that don't fail without the flag:

running 3 tests
test test_badssl_no_built_in_roots ... ok
test test_badssl_modern ... FAILED
test test_badssl_self_signed ... FAILED

failures:

---- test_badssl_modern stdout ----

thread 'test_badssl_modern' panicked at tests/badssl.rs:14:10:
called `Result::unwrap()` on an `Err` value: reqwest::Error { kind: Request, url: "https://mozilla-modern.badssl.com/", source: hyper_util::client::legacy::Error(SendRequest, hyper::Error(Parse(Version))) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- test_badssl_self_signed stdout ----

thread 'test_badssl_self_signed' panicked at tests/badssl.rs:55:10:
called `Result::unwrap()` on an `Err` value: reqwest::Error { kind: Request, url: "https://self-signed.badssl.com/", source: hyper_util::client::legacy::Error(SendRequest, hyper::Error(Parse(Version))) }


failures:
    test_badssl_modern
    test_badssl_self_signed

test result: FAILED. 1 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s

these are the same errors that I'm reporting in this issue.

the code snippet you had me run does not error whether it's run with --release or not.

Could this be a very specific edge case that reqwest just happens to surface?

@spikegrobstein

This comment has been minimized.

@jieyouxu

This comment has been minimized.

@apiraino
Copy link
Contributor

apiraino commented May 8, 2025

cc: some people in the apple marker team for possibly a help reducing this regression on this platform.

(a bisection is at this comment)

@hkratz
@thomcc
@madsmtm
@BlackHoleFox

@madsmtm
Copy link
Contributor

madsmtm commented May 8, 2025

Thanks for the ping! I can reproduce the top-level example using:

DEVELOPER_DIR=/Applications/Xcode\ 14.3.1.app/Contents/Developer cargo run --release --target x86_64-apple-darwin

I'll try to investigate further.

@jieyouxu jieyouxu added the S-has-bisection Status: A bisection has been found for this issue label May 8, 2025
@madsmtm
Copy link
Contributor

madsmtm commented May 8, 2025

I'm inclined to think that this is a linker bug? Can reproduce with just:

cargo rustc --release --target x86_64-apple-darwin -- -Clinker="/Applications/Xcode 14.3.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld"
./target/x86_64-apple-darwin/release/foo

Though still unsure of what the linker is actually doing wrong here, and if there's something rustc can do differently to avoid hitting this.

But then again, this is also present in the older Xcode 13.4.1, so it may also be a recent problem in rustc / LLVM.

@madsmtm
Copy link
Contributor

madsmtm commented May 8, 2025

I've minimized this to:

// dep.rs
#![crate_type = "lib"]

pub enum Version {
    Http09,
    Http10,
    Http11,
    H2,
    H3,
}

pub fn encode(version: Version, dst: &mut Vec<u8>) -> NonTrivialDrop {
    let has_drop = NonTrivialDrop;

    match version {
        Version::Http10 => dst.extend_from_slice(b"HTTP/1.0"),
        Version::Http11 => dst.extend_from_slice(b"HTTP/1.1"),
        Version::H2 => dst.extend_from_slice(b"HTTP/1.1"),
        _ => {}
    }

    has_drop
}

pub struct NonTrivialDrop;

impl Drop for NonTrivialDrop {
    #[inline(never)]
    #[export_name = "non_trivial_drop"]
    fn drop(&mut self) {
        let _ = Box::new(0);
    }
}
// foo.rs
extern crate dep;

use dep::{Version, encode};

fn main() {
    let mut dst = Vec::with_capacity(100);
    encode(Version::Http11, &mut dst);
    let res = std::str::from_utf8(&dst).unwrap();
    eprintln!("{res:?}");
}

Compile with:

rustc dep.rs --target x86_64-apple-darwin -C opt-level=1
rustc foo.rs --target x86_64-apple-darwin --extern dep=./libdep.rlib '-Clinker=/Applications/Xcode 14.3.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld'
./foo # Outputs "", where it should've output "HTTP/1.1"

Put it up as a GH repo at https://github.com/madsmtm/rustc-switch-older-xcode. Had to be in separate crates for some reason, possibly something relating to how the linker reads things from archives? Still unclear to me.

@madsmtm
Copy link
Contributor

madsmtm commented May 8, 2025

The bisected commit is indeed a red herring, I've pushed a commit to the repo above which demonstrates the problem with at least Rust 1.76 as well. Will try to minimize it further.

@jieyouxu jieyouxu removed the S-has-bisection Status: A bisection has been found for this issue label May 8, 2025
@BlackHoleFox
Copy link
Contributor

BlackHoleFox commented May 13, 2025

Backread the issue but where does the -ld_classic usage come from in Rust by default? Rust should be passing either ld or cc. Then ld in modern SDKs should be their new linker.

@madsmtm
Copy link
Contributor

madsmtm commented May 13, 2025

Yup, rustc doesn't use -ld_classic. The issue appears with the macOS Ventura (13.x) SDK with osxcross, which uses the old ld64 (which is also what's available in Xcode 14.3.1).

The new linker source code hasn't been released, so this is currently the best way to cross-compile from Linux. See also tpoechtrager/osxcross#453 / tpoechtrager/osxcross#457.

@dianqk dianqk added the A-cross Area: Cross compilation label May 14, 2025
@PaulDance
Copy link
Contributor

PaulDance commented May 19, 2025

Considering llvm/llvm-project#140382 and llvm/llvm-project#139439, couldn't this be that rustc somehow emits wrong data that the linker handles however it can?

In our case, we're using ld64.lld as a linker and a crash still occurs, but only since 1.87: llvm/llvm-project#140382 (comment). I don't know if this and that are exactly related, though. Also, it only occurs when CARGO_PROFILE_RELEASE_DEBUG is set to anything other than none.

@dianqk
Copy link
Member

dianqk commented May 19, 2025

As far as I know, this is a linker bug.

Hofer-Julian added a commit to Hofer-Julian/pixi-build-backends that referenced this issue May 20, 2025
This should unblock osx-64 releases.
See also rust-lang/rust#140686
Hofer-Julian added a commit to Hofer-Julian/pixi-build-backends that referenced this issue May 21, 2025
 - unset linker on osx-64 because of rust-lang/rust#140686
 - move to recipes, since we can't unset env vars
@briansmith
Copy link
Contributor

briansmith commented May 21, 2025

[edit: oops, misread the comment I initially replied to] See #141306, which has very similar symptoms, except it has Rust 1.87 && lto=true as the preconditions.

@briansmith
Copy link
Contributor

  • built targeting x86_64-apple-darwin
  • using release mode
  • using a macOS Ventura (13.x) SDK

Is the linker that is believed to have a bug the one that is used by default in macos-13 runners in GitHub Actions?

From https://github.com/actions/runner-images/blob/main/images/macos/macos-13-Readme.md:

OS Version: macOS 13.7.5 (22H527)
Xcode Command Line Tools 14.3.1.0.1.1683849156
Xcodes 1.6.0

@madsmtm
Copy link
Contributor

madsmtm commented May 22, 2025

Yes, either of:

  • Xcode 14.3.1 or below
  • Or higher Xcode, with -ld_classic

@workingjubilee
Copy link
Member

@briansmith The bug, despite seemingly being related, is probably not. Your failed job actually uses Xcode 15, which means it has ld-prime available and is likely not ld64.

dianqk added a commit to llvm/llvm-project that referenced this issue Jun 1, 2025
…REL relocations (#142304)

Follow
#72584 (comment),
the patch will drop the `unnamed_addr` attribute when generating
relative lookup tables. I'm not very confident about this patch, but it
does resolve rust-lang/rust#140686,
rust-lang/rust#141306 and
rust-lang/rust#141737.

But I don't think this will result in worse problems.

> LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if x is an
unnamed_addr function. However, it does not provide this guarantee for a
constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a
constant. ([‘llvm.load.relative’
Intrinsic](https://llvm.org/docs/LangRef.html#id2592))

This is my concern. I'm not sure how unnamed_addr provides this
guarantee, and I haven't found any test cases.
@jieyouxu
Copy link
Member

jieyouxu commented Jun 1, 2025

I think the LLVM patch might need to be backported to our fork? (I.e. this issue is still a problem?)

@jieyouxu jieyouxu reopened this Jun 1, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this issue Jun 1, 2025
…ating GOTPCREL relocations (#142304)

Follow
llvm/llvm-project#72584 (comment),
the patch will drop the `unnamed_addr` attribute when generating
relative lookup tables. I'm not very confident about this patch, but it
does resolve rust-lang/rust#140686,
rust-lang/rust#141306 and
rust-lang/rust#141737.

But I don't think this will result in worse problems.

> LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if x is an
unnamed_addr function. However, it does not provide this guarantee for a
constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a
constant. ([‘llvm.load.relative’
Intrinsic](https://llvm.org/docs/LangRef.html#id2592))

This is my concern. I'm not sure how unnamed_addr provides this
guarantee, and I haven't found any test cases.
dianqk added a commit to dianqk/llvm-project that referenced this issue Jun 1, 2025
…REL relocations (llvm#142304)

Follow
llvm#72584 (comment),
the patch will drop the `unnamed_addr` attribute when generating
relative lookup tables. I'm not very confident about this patch, but it
does resolve rust-lang/rust#140686,
rust-lang/rust#141306 and
rust-lang/rust#141737.

But I don't think this will result in worse problems.

> LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if x is an
unnamed_addr function. However, it does not provide this guarantee for a
constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a
constant. ([‘llvm.load.relative’
Intrinsic](https://llvm.org/docs/LangRef.html#id2592))

This is my concern. I'm not sure how unnamed_addr provides this
guarantee, and I haven't found any test cases.

(cherry picked from commit aa09dbb)
@dianqk
Copy link
Member

dianqk commented Jun 1, 2025

I think the LLVM patch might need to be backported to our fork? (I.e. this issue is still a problem?)

GitHub "helped" me again.

tstellar pushed a commit to dianqk/llvm-project that referenced this issue Jun 3, 2025
…REL relocations (llvm#142304)

Follow
llvm#72584 (comment),
the patch will drop the `unnamed_addr` attribute when generating
relative lookup tables. I'm not very confident about this patch, but it
does resolve rust-lang/rust#140686,
rust-lang/rust#141306 and
rust-lang/rust#141737.

But I don't think this will result in worse problems.

> LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if x is an
unnamed_addr function. However, it does not provide this guarantee for a
constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a
constant. ([‘llvm.load.relative’
Intrinsic](https://llvm.org/docs/LangRef.html#id2592))

This is my concern. I'm not sure how unnamed_addr provides this
guarantee, and I haven't found any test cases.

(cherry picked from commit aa09dbb)
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this issue Jun 3, 2025
…ating GOTPCREL relocations (#142304)

Follow
llvm/llvm-project#72584 (comment),
the patch will drop the `unnamed_addr` attribute when generating
relative lookup tables. I'm not very confident about this patch, but it
does resolve rust-lang/rust#140686,
rust-lang/rust#141306 and
rust-lang/rust#141737.

But I don't think this will result in worse problems.

> LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if x is an
unnamed_addr function. However, it does not provide this guarantee for a
constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a
constant. ([‘llvm.load.relative’
Intrinsic](https://llvm.org/docs/LangRef.html#id2592))

This is my concern. I'm not sure how unnamed_addr provides this
guarantee, and I haven't found any test cases.

(cherry picked from commit aa09dbb)
@madsmtm madsmtm added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes label Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cross Area: Cross compilation A-linkers Area: linkers... you gotta love linkers C-external-bug Category: issue that is caused by bugs in software beyond our control llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet