diff --git a/text/0000-extended-compare-and-swap.md b/text/0000-extended-compare-and-swap.md
new file mode 100644
index 00000000000..5c219202aa2
--- /dev/null
+++ b/text/0000-extended-compare-and-swap.md
@@ -0,0 +1,115 @@
+- Feature Name: extended_compare_and_swap
+- Start Date: 2016-1-5
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+[summary]: #summary
+
+Rust currently provides a `compare_and_swap` method on atomic types, but this method only exposes a subset of the functionality of the C++11 equivalents [`compare_exchange_strong` and `compare_exchange_weak`](http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange):
+
+- `compare_and_swap` maps to the C++11 `compare_exchange_strong`, but there is no Rust equivalent for `compare_exchange_weak`. The latter is allowed to fail spuriously even when the comparison succeeds, which allows the compiler to generate better assembly code when the compare and swap is used in a loop.
+
+- `compare_and_swap` only has a single memory ordering parameter, whereas the C++11 versions have two: the first describes the memory ordering when the operation succeeds while the second one describes the memory ordering on failure.
+
+# Motivation
+[motivation]: #motivation
+
+While all of these variants are identical on x86, they can allow more efficient code to be generated on architectures such as ARM:
+
+- On ARM, the strong variant of compare and swap is compiled into an `LDREX` / `STREX` loop which restarts the compare and swap when a spurious failure is detected. This is unnecessary for many lock-free algorithms since the compare and swap is usually already inside a loop and a spurious failure is often caused by another thread modifying the atomic concurrently, which will probably cause the compare and swap to fail anyway.
+
+- When Rust lowers `compare_and_swap` to LLVM, it uses the same memory ordering type for success and failure, which on ARM adds extra memory barrier instructions to the failure path. Most lock-free algorithms which make use of compare and swap in a loop only need relaxed ordering on failure since the operation is going to be restarted anyway.
+
+# Detailed design
+[design]: #detailed-design
+
+Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. This RFC proposes deprecating the `compare_and_swap` function and replacing it with `compare_exchange` and `compare_exchange_weak`, which match the names of the equivalent C++11 functions (with the `_strong` suffix removed).
+
+## `compare_exchange`
+
+A new method is instead added to atomic types:
+
+```rust
+fn compare_exchange(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T;
+```
+
+The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal to or weaker than the success ordering. Passing an invalid memory ordering will result in a panic, although this can often be optimized away since the ordering is usually statically known.
+
+The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_exchange` with the following mapping for memory orders:
+
+Original | Success | Failure
+-------- | ------- | -------
+Relaxed  | Relaxed | Relaxed
+Acquire  | Acquire | Acquire
+Release  | Release | Relaxed
+AcqRel   | AcqRel  | Acquire
+SeqCst   | SeqCst  | SeqCst
+
+## `compare_exchange_weak`
+
+A second new method is also added to atomic types:
+
+```rust
+fn compare_exchange_weak(&self, current: T, new: T, success: Ordering, failure: Ordering) -> (T, bool);
+```
+
+`compare_exchange` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. This is not possible for `compare_exchange_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one.
+
+A lock-free algorithm using a loop would use the returned bool to determine whether to break out of the loop, and if not, use the returned value for the next iteration of the loop.
+
+## Intrinsics
+
+These are the existing intrinsics used to implement `compare_and_swap`:
+
+```rust
+    pub fn atomic_cxchg<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_acq<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_rel<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_acqrel<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_relaxed<T>(dst: *mut T, old: T, src: T) -> T;
+```
+
+The following intrinsics need to be added to support relaxed memory orderings on failure:
+
+```rust
+    pub fn atomic_cxchg_acqrel_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_failacq<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T;
+    pub fn atomic_cxchg_acq_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T;
+```
+
+The following intrinsics need to be added to support `compare_exchange_weak`:
+
+```rust
+    pub fn atomic_cxchg_weak<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_acq<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_rel<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_acqrel<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_relaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_acqrel_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_failacq<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+    pub fn atomic_cxchg_weak_acq_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool);
+```
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Ideally support for failure memory ordering would be added by simply adding an extra parameter to the existing `compare_and_swap` function. However, this is not possible because `compare_and_swap` is stable.
+
+This RFC proposes deprecating a stable function, which may not be desirable.
+
+# Alternatives
+[alternatives]: #alternatives
+
+One alternative for supporting failure orderings is to add new enum variants to `Ordering` instead of adding new methods with two ordering parameters. The following variants would need to be added: `AcquireFailRelaxed`, `AcqRelFailRelaxed`, `SeqCstFailRelaxed`, `SeqCstFailAcquire`. The downside is that the names are quite ugly and are only valid for `compare_and_swap`, not other atomic operations. It is also a breaking change to a stable enum.
+
+Another alternative is to not deprecate `compare_and_swap` and instead add `compare_and_swap_explicit`, `compare_and_swap_weak` and `compare_and_swap_weak_explicit`. However, the distinction between the explicit and non-explicit variants isn't very clear and can lead to some confusion.
+
+Not doing anything is also a possible option, but this will cause Rust to generate worse code for some lock-free algorithms.
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+None