Code
I minimized the issue to the following testcase, which uses serde and bincode to serialize a structure by boxing it and serializing the box's exposed address, then reconstructs the box from the deserialized address.
I believe this to be valid, as long as one can uphold the condition that the same serialized struct is never deserialized twice. However, the testcase below fails due to a manifestation of UB.
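Stripped of serde and bincode, the round trip described above can be sketched roughly as follows. This is only an illustration under assumptions: `Payload`, `to_addr`, and `from_addr` are hypothetical names that do not appear in the testcase, and the sketch is not expected to reproduce the miscompilation by itself.

```rust
use std::ptr;

/// Hypothetical stand-in for the `MySender<T>` used in the testcase.
#[derive(Debug, PartialEq)]
struct Payload {
    value: u64,
}

/// "Serialize": move the value to the heap, leak the box, and expose its address.
fn to_addr(payload: Payload) -> usize {
    let raw: *mut Payload = Box::into_raw(Box::new(payload));
    // `expose_provenance` returns the address and marks the pointer's provenance
    // as exposed, so a later integer-to-pointer conversion may pick it up again.
    raw.expose_provenance()
}

/// "Deserialize": rebuild the box from the address and move the value out.
///
/// # Safety
/// `addr` must have been produced by `to_addr` and must be passed to this
/// function at most once, otherwise the allocation would be freed twice.
unsafe fn from_addr(addr: usize) -> Payload {
    let raw: *mut Payload = ptr::with_exposed_provenance_mut(addr);
    // SAFETY: the caller guarantees `raw` points to a live, leaked `Box<Payload>`.
    let boxed = unsafe { Box::from_raw(raw) };
    *boxed
}
```

Calling `to_addr` once and `from_addr` once on the result mirrors the "never deserialize twice" condition stated above. The full testcase follows.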
use serde::de::Error;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::hint::black_box;
use std::marker::PhantomData;
use std::sync::Arc;
use std::{fmt, ptr};

struct MySender<T> {
    inner: Arc<T>,
}

impl<T> Clone for MySender<T> {
    fn clone(&self) -> Self {
        Self {
            inner: self.inner.clone(),
        }
    }
}

impl<T> MySender<T>
where
    T: Serialize,
{
    fn send(&self, value: T) -> Result<(), ()> {
        black_box(value);
        Ok(())
    }
}

impl<T> Serialize for MySender<T>
where
    T: Serialize,
{
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // We know everything is in one address-space, so we can "serialize" the sender by
        // sending a leaked Box pointer.
        let leaked_sender = Box::into_raw(Box::new(self.clone()));
        let sender_clone_addr: usize = leaked_sender.expose_provenance();
        println!("Serialized addr is {:X}", sender_clone_addr);
        serializer.serialize_newtype_struct("MySender", &sender_clone_addr)
    }
}

struct MySenderVisitor<T> {
    marker: PhantomData<T>,
}

impl<'de, T: Serialize + Deserialize<'de>> serde::de::Visitor<'de> for MySenderVisitor<T> {
    type Value = MySender<T>;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a MySender")
    }

    // ---------
    // Variant 1: deserialize usize: Testcase fails (UB manifests)
    // ---------
    fn visit_newtype_struct<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Newtype structs transparently serialize to the inner type.
        let addr = usize::deserialize(deserializer)?;
        let is_aligned = addr % align_of::<Self::Value>() == 0;
        // Adding these checks adds branches, which makes the UB manifest more visibly.
        // If the checks are removed or commented out, the Box::from_raw fails instead.
        if addr == 0 {
            return Err(D::Error::custom("address is zero"));
        } else if !is_aligned {
            // Result: "addr 103791B10 is not aligned to 8. is_aligned: true"
            // The above line is only possible if we have UB (and the condition check is optimized away).
            let msg = format!(
                "addr {addr:X} is not aligned to {}. is_aligned: {is_aligned}",
                align_of::<Self::Value>()
            );
            return Err(D::Error::custom(msg));
        }
        let ptr: *mut Self::Value = ptr::with_exposed_provenance_mut(addr);
        assert!(!ptr.is_null());
        let reconstructed = unsafe { Box::from_raw(ptr) };
        Ok(*reconstructed)
    }

    // ---------
    // Variant 2: deserialize u64 and cast to usize: Testcase passes
    // ---------
    // fn visit_newtype_struct<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    // where
    //     D: Deserializer<'de>,
    // {
    //     deserializer.deserialize_u64(self)
    // }
    //
    // fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    // where
    //     E: Error,
    // {
    //     let addr = v as usize;
    //     let ptr: *mut Self::Value = ptr::with_exposed_provenance_mut(addr);
    //     assert!(!ptr.is_null());
    //     let reconstructed = unsafe { Box::from_raw(ptr) };
    //     Ok(*reconstructed)
    // }
}

impl<'a, T: Serialize + Deserialize<'a>> Deserialize<'a> for MySender<T> {
    fn deserialize<D>(d: D) -> Result<MySender<T>, D::Error>
    where
        D: Deserializer<'a>,
    {
        d.deserialize_newtype_struct(
            "MySender",
            MySenderVisitor {
                marker: PhantomData,
            },
        )
    }
}

#[cfg(test)]
mod single_process_channel_tests {
    use std::sync::Arc;

    use crate::MySender;

    // This test works / does not show signs of UB
    #[test]
    fn serialize_roundtrip_bincode2() {
        let generic_sender = MySender {
            inner: Arc::new(42),
        };
        let config = bincode2::config::legacy();
        let data = bincode2::serde::encode_to_vec(&generic_sender, config.clone()).unwrap();
        eprintln!("Serialized: {data:?} - len: {}", data.len());
        let (reconstructed, _len): (MySender<u64>, _) =
            bincode2::serde::decode_from_slice(&data, config).unwrap();
        reconstructed.send(42).unwrap();
    }

    // This test case manifests UB
    #[test]
    fn serialize_roundtrip_bincode() {
        let generic_sender: MySender<u64> = MySender {
            inner: Arc::new(42_u64),
        };
        let mut data = Vec::with_capacity(1024);
        bincode::serialize_into(&mut data, &generic_sender).expect("Serialization failed");
        eprintln!("Serialized: {data:?}");
        let reconstructed: MySender<u64> =
            bincode::deserialize(&data).expect("Deserialization failed");
        reconstructed.send(42_u64).unwrap();
    }
}

Cargo.toml:
[package]
name = "test_ub"
version = "0.1.0"
edition = "2024"
[lib]
name = "test_ub"
[dependencies]
serde = { version = "1.0.225", features = ["derive"] }
bincode2 = {package = "bincode", version = "2.0.1", features = ["serde"]}
bincode = "1.3.3"
[profile.release]
debug-assertions = true
opt-level = 3
debug = true
I expected to see this happen: The test cases pass in release mode (with debug assertions enabled)
Instead, this happened: The test case serialize_roundtrip_bincode fails due to a manifestation of UB in release mode. (See below for details)
Version it worked on
It most recently worked on: Rust 1.89
Version with regression
rustc --version --verbose:
rustc 1.90.0 (1159e78c4 2025-09-14)
binary: rustc
commit-hash: 1159e78c4747b02ef996e55082b704c09b970588
commit-date: 2025-09-14
host: aarch64-apple-darwin
release: 1.90.0
LLVM version: 20.1.8
Still reproduces with the latest nightly (1.92). I believe it also reproduces on Linux (although I haven't tested that with this minimal reproducer).
UB manifestation
During program execution (cargo test --release --lib), we reach line 76 during deserialization, and the printed error message reveals that the else if condition check got optimised away (the condition is !is_aligned, but is_aligned is true), which is only possible if there is UB.
black_boxing the address (usize) during deserialization, or printing the address, makes the UB manifestation disappear.
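To make the observation above concrete, here is a minimal sketch of where such a black_box call could sit. `box_from_addr_opaque` is a hypothetical helper, not code from the testcase, and since the documentation of `std::hint::black_box` describes it as best-effort, this should be read as a way to mask the symptom rather than as a fix.

```rust
use std::hint::black_box;
use std::mem::align_of;
use std::ptr;

/// Hypothetical helper: rebuild a `Box<T>` from an exposed address, passing the
/// address through `black_box` before it is inspected.
///
/// # Safety
/// If `addr` is non-zero and aligned, it must be the exposed address of a leaked
/// `Box<T>` that has not been reconstructed before.
unsafe fn box_from_addr_opaque<T>(addr: usize) -> Result<Box<T>, String> {
    // `black_box` is an identity function that asks the optimizer to treat its
    // argument as opaque; the value itself is unchanged.
    let addr = black_box(addr);
    if addr == 0 {
        return Err("address is zero".to_string());
    }
    if addr % align_of::<T>() != 0 {
        return Err(format!(
            "addr {addr:X} is not aligned to {}",
            align_of::<T>()
        ));
    }
    let raw: *mut T = ptr::with_exposed_provenance_mut(addr);
    // SAFETY: the caller guarantees `raw` points to a live, leaked `Box<T>`.
    Ok(unsafe { Box::from_raw(raw) })
}
```

The same zero and alignment checks as in the testcase are kept so that the only difference is the opaque address.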
I believe the above program to be valid and assume that serde is sound. I can still reproduce the issue after replacing the two unsafe usages in bincode 1.3.3 that we hit (reading and writing a u64) with safe Rust.
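For illustration, a safe replacement for reading and writing a u64 might look like the sketch below. This is not the actual patch applied to bincode: `write_u64_le` and `read_u64_le` are hypothetical names, and the fixed-width little-endian layout is an assumption about the encoding in use.

```rust
use std::io::{self, Read, Write};

/// Write a `u64` as 8 little-endian bytes using only safe code.
fn write_u64_le<W: Write>(writer: &mut W, value: u64) -> io::Result<()> {
    writer.write_all(&value.to_le_bytes())
}

/// Read 8 bytes and interpret them as a little-endian `u64` using only safe code.
fn read_u64_le<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    // `read_exact` returns an error if fewer than 8 bytes are available.
    reader.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}
```

Writing into a `Vec<u8>` and reading back from a `&[u8]` slice round-trips the value without any unsafe block.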
@rustbot modify labels: +regression-from-stable-to-stable -regression-untriaged