Commit 6cd02a8

Rollup merge of #77844 - RalfJung:zst-box, r=nikomatsakis
clarify rules for ZST Boxes

LLVM's rules around `getelementptr inbounds` with offset 0 are a bit annoying, and as a consequence we have no choice but say that a `Box<()>` pointing to previously allocated memory that has since been freed is UB. Clarify the docs to reflect this.

This is based on conversations on the LLVM mailing list.

* Here's my initial mail: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130452.html
* The first email of the March part of that thread: https://lists.llvm.org/pipermail/llvm-dev/2019-March/130831.html
* First email of the April part: https://lists.llvm.org/pipermail/llvm-dev/2019-April/131693.html

The conclusion for me at least was that `getelementptr inbounds` with offset 0 is *not* the identity function, but can sometimes return `poison` even when the input is a regular pointer -- specifically, it returns `poison` when this pointer points into something that LLVM "knows has been deallocated", i.e., a former LLVM-managed allocation. It is however the identity function on pointers obtained by casting integers.

Note that there [are formal proposals](https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf) for LLVM semantics where `getelementptr inbounds` with offset 0 isn't quite the identity function but never returns `poison` (it affects the provenance of the pointer but in a way that doesn't matter if this pointer is never used for memory accesses), and indeed this is likely necessary to consistently describe LLVM semantics. But with the informal LLVM LangRef that we have right now, and with LLVM devs insisting otherwise, it seems unwise to rely on this.
2 parents d806d65 + a7677f7 commit 6cd02a8

2 files changed: +17 -2 lines changed

library/alloc/src/boxed.rs

+11
@@ -62,6 +62,13 @@
 //! T` obtained from [`Box::<T>::into_raw`] may be deallocated using the
 //! [`Global`] allocator with [`Layout::for_value(&*value)`].
 //!
+//! For zero-sized values, the `Box` pointer still has to be [valid] for reads
+//! and writes and sufficiently aligned. In particular, casting any aligned
+//! non-zero integer literal to a raw pointer produces a valid pointer, but a
+//! pointer pointing into previously allocated memory that since got freed is
+//! not valid. The recommended way to build a Box to a ZST if `Box::new` cannot
+//! be used is to use [`ptr::NonNull::dangling`].
+//!
 //! So long as `T: Sized`, a `Box<T>` is guaranteed to be represented
 //! as a single pointer and is also ABI-compatible with C pointers
 //! (i.e. the C type `T*`). This means that if you have extern "C"
@@ -125,6 +132,7 @@
 //! [`Global`]: crate::alloc::Global
 //! [`Layout`]: crate::alloc::Layout
 //! [`Layout::for_value(&*value)`]: crate::alloc::Layout::for_value
+//! [valid]: ptr#safety
 
 #![stable(feature = "rust1", since = "1.0.0")]
 
@@ -530,7 +538,10 @@ impl<T: ?Sized> Box<T> {
 /// memory problems. For example, a double-free may occur if the
 /// function is called twice on the same raw pointer.
 ///
+/// The safety conditions are described in the [memory layout] section.
+///
 /// # Examples
+///
 /// Recreate a `Box` which was previously converted to a raw pointer
 /// using [`Box::into_raw`]:
 /// ```
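
The paragraph added to `boxed.rs` recommends `ptr::NonNull::dangling` for building a `Box` to a ZST when `Box::new` cannot be used. A minimal sketch of that pattern follows. `Marker` and `boxed_marker` are hypothetical names introduced here for illustration and are not part of this commit.

```rust
use std::ptr::NonNull;

/// Hypothetical zero-sized marker type, used only for illustration.
#[derive(Debug, PartialEq)]
struct Marker;

/// Builds a `Box<Marker>` without going through `Box::new`.
fn boxed_marker() -> Box<Marker> {
    // `NonNull::dangling` yields a non-null, well-aligned pointer that does
    // not point into any (possibly freed) allocation.
    let ptr: NonNull<Marker> = NonNull::dangling();
    // SAFETY: `Marker` is zero-sized, so the `Box` owns no allocation; the
    // pointer is non-null, aligned, and valid for zero-sized reads and writes.
    unsafe { Box::from_raw(ptr.as_ptr()) }
}
```

Dropping the returned `Box` should not call the allocator, since the layout of `Marker` has size zero. Calling `boxed_marker()` and dereferencing the result yields a `Marker` value.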

library/core/src/ptr/mod.rs

+6 -2
@@ -16,12 +16,16 @@
 //! provided at this point are very minimal:
 //!
 //! * A [null] pointer is *never* valid, not even for accesses of [size zero][zst].
-//! * All pointers (except for the null pointer) are valid for all operations of
-//! [size zero][zst].
 //! * For a pointer to be valid, it is necessary, but not always sufficient, that the pointer
 //! be *dereferenceable*: the memory range of the given size starting at the pointer must all be
 //! within the bounds of a single allocated object. Note that in Rust,
 //! every (stack-allocated) variable is considered a separate allocated object.
+//! * Even for operations of [size zero][zst], the pointer must not be pointing to deallocated
+//! memory, i.e., deallocation makes pointers invalid even for zero-sized operations. However,
+//! casting any non-zero integer *literal* to a pointer is valid for zero-sized accesses, even if
+//! some memory happens to exist at that address and gets deallocated. This corresponds to writing
+//! your own allocator: allocating zero-sized objects is not very hard. The canonical way to
+//! obtain a pointer that is valid for zero-sized accesses is [`NonNull::dangling`].
 //! * All accesses performed by functions in this module are *non-atomic* in the sense
 //! of [atomic operations] used to synchronize between threads. This means it is
 //! undefined behavior to perform two concurrent accesses to the same location from different
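
The zero-size rule added to `ptr/mod.rs` can be illustrated with a short sketch. It assumes two hypothetical helpers, `read_via_dangling` and `read_via_literal`, which are not part of this commit; both perform only zero-sized reads. The case the docs now rule out, a pointer into memory that has since been freed, appears only in a comment, because executing it would be undefined behavior under the rules stated in this diff.

```rust
use std::ptr::{self, NonNull};

/// Zero-sized read through the canonical dangling pointer.
fn read_via_dangling() -> [u32; 0] {
    let p: NonNull<[u32; 0]> = NonNull::dangling();
    // SAFETY: the read has size zero, and `p` is non-null and aligned.
    unsafe { ptr::read(p.as_ptr()) }
}

/// Zero-sized read through a pointer cast from a non-zero integer literal.
fn read_via_literal() -> [u32; 0] {
    // 4 is non-zero and a multiple of the alignment of `u32`.
    let p = 4usize as *const [u32; 0];
    // SAFETY: the read has size zero, and `p` comes from an aligned,
    // non-zero integer literal rather than from a freed allocation.
    unsafe { ptr::read(p) }
}

// Not shown on purpose: taking a pointer into a heap allocation, freeing
// that allocation, and then performing a zero-sized read through the
// pointer. The docs in this commit classify that as invalid.
```

Both helpers return an empty array, so their results carry no data; the point is only which pointers are acceptable for a zero-sized access.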
