Skip to content

Add unchecked_disjoint_bitor per ACP373 #135760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 4, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions compiler/rustc_codegen_llvm/src/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -421,6 +421,20 @@ impl<'a, 'll, 'tcx> BuilderMethods<'a, 'tcx> for Builder<'a, 'll, 'tcx> {
unchecked_umul(x, y) => LLVMBuildNUWMul,
}

fn or_disjoint(&mut self, a: &'ll Value, b: &'ll Value) -> &'ll Value {
unsafe {
let or = llvm::LLVMBuildOr(self.llbuilder, a, b, UNNAMED);

// If a and b are both values, then `or` is a value, rather than
// an instruction, so we need to check before setting the flag.
// (See also `LLVMBuildNUWNeg` which also needs a check.)
if llvm::LLVMIsAInstruction(or).is_some() {
llvm::LLVMSetIsDisjoint(or, True);
}
or
}
}

set_math_builder_methods! {
fadd_fast(x, y) => (LLVMBuildFAdd, LLVMRustSetFastMath),
fsub_fast(x, y) => (LLVMBuildFSub, LLVMRustSetFastMath),
Expand Down
3 changes: 3 additions & 0 deletions compiler/rustc_codegen_llvm/src/llvm/ffi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1380,6 +1380,9 @@ unsafe extern "C" {
pub fn LLVMBuildFNeg<'a>(B: &Builder<'a>, V: &'a Value, Name: *const c_char) -> &'a Value;
pub fn LLVMBuildNot<'a>(B: &Builder<'a>, V: &'a Value, Name: *const c_char) -> &'a Value;

// Extra flags on arithmetic
pub fn LLVMSetIsDisjoint(Instr: &Value, IsDisjoint: Bool);

// Memory
pub fn LLVMBuildAlloca<'a>(B: &Builder<'a>, Ty: &'a Type, Name: *const c_char) -> &'a Value;
pub fn LLVMBuildArrayAlloca<'a>(
Expand Down
5 changes: 5 additions & 0 deletions compiler/rustc_codegen_ssa/src/mir/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,11 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {
args[1].val.unaligned_volatile_store(bx, dst);
return Ok(());
}
sym::disjoint_bitor => {
let a = args[0].immediate();
let b = args[1].immediate();
bx.or_disjoint(a, b)
}
sym::exact_div => {
let ty = arg_tys[0];
match int_type_width_signed(ty, bx.tcx()) {
Expand Down
5 changes: 5 additions & 0 deletions compiler/rustc_codegen_ssa/src/traits/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,11 @@ pub trait BuilderMethods<'a, 'tcx>:
fn unchecked_umul(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
fn and(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
fn or(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
/// Defaults to [`Self::or`], but guarantees `(lhs & rhs) == 0` so some backends
/// can emit something more helpful for optimizations.
fn or_disjoint(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a method and not just a normal intrinsic? Why does this have 2 default implementations? (fallback in core and default impl here)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that other places in cg_ssa can use it if it's helpful.

The fallback one in core is for clif and ctfe, whereas the one here is for gcc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda annoying that we need 2 fallbacks :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we do for all intrinsics that can't be implemented in cg_ssa? Isn't there some place where cg_llvm matches on the intrinsic name to provide its own impls before falling back to the cg_ssa one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do that, it just feels like a worse fit here.

This isn't like these intrinsics

fn get_simple_intrinsic<'ll>(
cx: &CodegenCx<'ll, '_>,
name: Symbol,
) -> Option<(&'ll Type, &'ll Value)> {
let llvm_name = match name {
sym::sqrtf16 => "llvm.sqrt.f16",
sym::sqrtf32 => "llvm.sqrt.f32",
sym::sqrtf64 => "llvm.sqrt.f64",
sym::sqrtf128 => "llvm.sqrt.f128",
sym::powif16 => "llvm.powi.f16.i32",
sym::powif32 => "llvm.powi.f32.i32",
sym::powif64 => "llvm.powi.f64.i32",
sym::powif128 => "llvm.powi.f128.i32",
sym::sinf16 => "llvm.sin.f16",
sym::sinf32 => "llvm.sin.f32",
sym::sinf64 => "llvm.sin.f64",
sym::sinf128 => "llvm.sin.f128",
sym::cosf16 => "llvm.cos.f16",
sym::cosf32 => "llvm.cos.f32",
sym::cosf64 => "llvm.cos.f64",
sym::cosf128 => "llvm.cos.f128",
sym::powf16 => "llvm.pow.f16",
sym::powf32 => "llvm.pow.f32",
sym::powf64 => "llvm.pow.f64",
sym::powf128 => "llvm.pow.f128",
sym::expf16 => "llvm.exp.f16",
sym::expf32 => "llvm.exp.f32",
sym::expf64 => "llvm.exp.f64",
sym::expf128 => "llvm.exp.f128",
sym::exp2f16 => "llvm.exp2.f16",
sym::exp2f32 => "llvm.exp2.f32",
sym::exp2f64 => "llvm.exp2.f64",
sym::exp2f128 => "llvm.exp2.f128",
sym::logf16 => "llvm.log.f16",
sym::logf32 => "llvm.log.f32",
sym::logf64 => "llvm.log.f64",
sym::logf128 => "llvm.log.f128",
sym::log10f16 => "llvm.log10.f16",
sym::log10f32 => "llvm.log10.f32",
sym::log10f64 => "llvm.log10.f64",
sym::log10f128 => "llvm.log10.f128",
sym::log2f16 => "llvm.log2.f16",
sym::log2f32 => "llvm.log2.f32",
sym::log2f64 => "llvm.log2.f64",
sym::log2f128 => "llvm.log2.f128",
sym::fmaf16 => "llvm.fma.f16",
sym::fmaf32 => "llvm.fma.f32",
sym::fmaf64 => "llvm.fma.f64",
sym::fmaf128 => "llvm.fma.f128",
sym::fmuladdf16 => "llvm.fmuladd.f16",
sym::fmuladdf32 => "llvm.fmuladd.f32",
sym::fmuladdf64 => "llvm.fmuladd.f64",
sym::fmuladdf128 => "llvm.fmuladd.f128",
sym::fabsf16 => "llvm.fabs.f16",
sym::fabsf32 => "llvm.fabs.f32",
sym::fabsf64 => "llvm.fabs.f64",
sym::fabsf128 => "llvm.fabs.f128",
sym::minnumf16 => "llvm.minnum.f16",
sym::minnumf32 => "llvm.minnum.f32",
sym::minnumf64 => "llvm.minnum.f64",
sym::minnumf128 => "llvm.minnum.f128",
sym::maxnumf16 => "llvm.maxnum.f16",
sym::maxnumf32 => "llvm.maxnum.f32",
sym::maxnumf64 => "llvm.maxnum.f64",
sym::maxnumf128 => "llvm.maxnum.f128",
sym::copysignf16 => "llvm.copysign.f16",
sym::copysignf32 => "llvm.copysign.f32",
sym::copysignf64 => "llvm.copysign.f64",
sym::copysignf128 => "llvm.copysign.f128",
sym::floorf16 => "llvm.floor.f16",
sym::floorf32 => "llvm.floor.f32",
sym::floorf64 => "llvm.floor.f64",
sym::floorf128 => "llvm.floor.f128",
sym::ceilf16 => "llvm.ceil.f16",
sym::ceilf32 => "llvm.ceil.f32",
sym::ceilf64 => "llvm.ceil.f64",
sym::ceilf128 => "llvm.ceil.f128",
sym::truncf16 => "llvm.trunc.f16",
sym::truncf32 => "llvm.trunc.f32",
sym::truncf64 => "llvm.trunc.f64",
sym::truncf128 => "llvm.trunc.f128",
sym::rintf16 => "llvm.rint.f16",
sym::rintf32 => "llvm.rint.f32",
sym::rintf64 => "llvm.rint.f64",
sym::rintf128 => "llvm.rint.f128",
sym::nearbyintf16 => "llvm.nearbyint.f16",
sym::nearbyintf32 => "llvm.nearbyint.f32",
sym::nearbyintf64 => "llvm.nearbyint.f64",
sym::nearbyintf128 => "llvm.nearbyint.f128",
sym::roundf16 => "llvm.round.f16",
sym::roundf32 => "llvm.round.f32",
sym::roundf64 => "llvm.round.f64",
sym::roundf128 => "llvm.round.f128",
sym::ptr_mask => "llvm.ptrmask",
sym::roundevenf16 => "llvm.roundeven.f16",
sym::roundevenf32 => "llvm.roundeven.f32",
sym::roundevenf64 => "llvm.roundeven.f64",
sym::roundevenf128 => "llvm.roundeven.f128",
_ => return None,
};
Some(cx.get_intrinsic(llvm_name))
}

where we're lowering it to some call, or the ones depending on LLVM-only integer types, or whatever.

It's far more like add nuw, where we have add, unchecked_sadd, and unchecked_uadd on BuilderMethods, because it's an instruction in the IR and something that we can use in other places in MIR-to-Backend lowering. (For example, we could use it in emitting rotates since we might as well and it's no less safe than the shifts that are also unchecked in the builder.)

And sure, those two unchecked examples don't have defaults today, but they should, because cg_gcc is just emitting self.gcc_add(a, b) for all three anyway, and if we want cg_ssa to be useful it shouldn't require implementing these when there's perfectly fine -- albeit potentially suboptimal -- provided implementations.

self.or(lhs, rhs)
}
fn xor(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
fn neg(&mut self, v: Self::Value) -> Self::Value;
fn fneg(&mut self, v: Self::Value) -> Self::Value;
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_hir_analysis/src/check/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -472,7 +472,7 @@ pub fn check_intrinsic_type(
vec![Ty::new_imm_ptr(tcx, param(0)), Ty::new_imm_ptr(tcx, param(0))],
tcx.types.usize,
),
sym::unchecked_div | sym::unchecked_rem | sym::exact_div => {
sym::unchecked_div | sym::unchecked_rem | sym::exact_div | sym::disjoint_bitor => {
(1, 0, vec![param(0), param(0)], param(0))
}
sym::unchecked_shl | sym::unchecked_shr => (2, 0, vec![param(0), param(1)], param(0)),
Expand Down
1 change: 1 addition & 0 deletions compiler/rustc_span/src/symbol.rs
Original file line number Diff line number Diff line change
Expand Up @@ -776,6 +776,7 @@ symbols! {
discriminant_kind,
discriminant_type,
discriminant_value,
disjoint_bitor,
dispatch_from_dyn,
div,
div_assign,
Expand Down
38 changes: 38 additions & 0 deletions library/core/src/intrinsics/fallback.rs
Original file line number Diff line number Diff line change
Expand Up @@ -110,3 +110,41 @@ impl const CarryingMulAdd for i128 {
(low, high)
}
}

#[const_trait]
#[rustc_const_unstable(feature = "core_intrinsics_fallbacks", issue = "none")]
pub trait DisjointBitOr: Copy + 'static {
/// See [`super::disjoint_bitor`]; we just need the trait indirection to handle
/// different types since calling intrinsics with generics doesn't work.
unsafe fn disjoint_bitor(self, other: Self) -> Self;
}
macro_rules! zero {
(bool) => {
false
};
($t:ident) => {
0
};
}
macro_rules! impl_disjoint_bitor {
($($t:ident,)+) => {$(
#[rustc_const_unstable(feature = "core_intrinsics_fallbacks", issue = "none")]
impl const DisjointBitOr for $t {
#[cfg_attr(miri, track_caller)]
#[inline]
unsafe fn disjoint_bitor(self, other: Self) -> Self {
// Note that the assume here is required for UB detection in Miri!

// SAFETY: our precondition is that there are no bits in common,
// so this is just telling that to the backend.
unsafe { super::assume((self & other) == zero!($t)) };
self | other
}
}
)+};
}
impl_disjoint_bitor! {
bool,
u8, u16, u32, u64, u128, usize,
i8, i16, i32, i64, i128, isize,
}
20 changes: 20 additions & 0 deletions library/core/src/intrinsics/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3244,6 +3244,26 @@ pub const fn three_way_compare<T: Copy>(_lhs: T, _rhss: T) -> crate::cmp::Orderi
unimplemented!()
}

/// Combine two values which have no bits in common.
///
/// This allows the backend to implement it as `a + b` *or* `a | b`,
/// depending which is easier to implement on a specific target.
///
/// # Safety
///
/// Requires that `(a & b) == 0`, or equivalently that `(a | b) == (a + b)`.
///
/// Otherwise it's immediate UB.
#[rustc_const_unstable(feature = "disjoint_bitor", issue = "135758")]
#[rustc_nounwind]
#[cfg_attr(not(bootstrap), rustc_intrinsic)]
#[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
#[miri::intrinsic_fallback_is_spec] // the fallbacks all `assume` to tell Miri
pub const unsafe fn disjoint_bitor<T: ~const fallback::DisjointBitOr>(a: T, b: T) -> T {
// SAFETY: same preconditions as this function.
unsafe { fallback::DisjointBitOr::disjoint_bitor(a, b) }
}

/// Performs checked integer addition.
///
/// Note that, unlike most intrinsics, this is safe to call;
Expand Down
1 change: 1 addition & 0 deletions library/core/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,7 @@
#![feature(const_eval_select)]
#![feature(core_intrinsics)]
#![feature(coverage_attribute)]
#![feature(disjoint_bitor)]
#![feature(internal_impls_macro)]
#![feature(ip)]
#![feature(is_ascii_octdigit)]
Expand Down
57 changes: 54 additions & 3 deletions library/core/src/num/uint_macros.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1187,6 +1187,50 @@ macro_rules! uint_impl {
self % rhs
}

/// Same value as `self | other`, but UB if any bit position is set in both inputs.
///
/// This is a situational micro-optimization for places where you'd rather
/// use addition on some platforms and bitwise or on other platforms, based
/// on exactly which instructions combine better with whatever else you're
/// doing. Note that there's no reason to bother using this for places
/// where it's clear from the operations involved that they can't overlap.
/// For example, if you're combining `u16`s into a `u32` with
/// `((a as u32) << 16) | (b as u32)`, that's fine, as the backend will
/// know those sides of the `|` are disjoint without needing help.
///
/// # Examples
///
/// ```
/// #![feature(disjoint_bitor)]
///
/// // SAFETY: `1` and `4` have no bits in common.
/// unsafe {
#[doc = concat!(" assert_eq!(1_", stringify!($SelfT), ".unchecked_disjoint_bitor(4), 5);")]
/// }
/// ```
///
/// # Safety
///
/// Requires that `(self & other) == 0`, otherwise it's immediate UB.
///
/// Equivalently, requires that `(self | other) == (self + other)`.
#[unstable(feature = "disjoint_bitor", issue = "135758")]
#[rustc_const_unstable(feature = "disjoint_bitor", issue = "135758")]
#[inline]
pub const unsafe fn unchecked_disjoint_bitor(self, other: Self) -> Self {
assert_unsafe_precondition!(
check_language_ub,
concat!(stringify!($SelfT), "::unchecked_disjoint_bitor cannot have overlapping bits"),
(
lhs: $SelfT = self,
rhs: $SelfT = other,
) => (lhs & rhs) == 0,
);

// SAFETY: Same precondition
unsafe { intrinsics::disjoint_bitor(self, other) }
}

/// Returns the logarithm of the number with respect to an arbitrary base,
/// rounded down.
///
Expand Down Expand Up @@ -2346,15 +2390,22 @@ macro_rules! uint_impl {
/// assert_eq!((sum1, sum0), (9, 6));
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn carrying_add(self, rhs: Self, carry: bool) -> (Self, bool) {
// note: longer-term this should be done via an intrinsic, but this has been shown
// to generate optimal code for now, and LLVM doesn't have an equivalent intrinsic
let (a, b) = self.overflowing_add(rhs);
let (c, d) = a.overflowing_add(carry as $SelfT);
(c, b | d)
let (a, c1) = self.overflowing_add(rhs);
let (b, c2) = a.overflowing_add(carry as $SelfT);
// Ideally LLVM would know this is disjoint without us telling them,
// but it doesn't <https://github.com/llvm/llvm-project/issues/118162>
// SAFETY: Only one of `c1` and `c2` can be set.
// For c1 to be set we need to have overflowed, but if we did then
// `a` is at most `MAX-1`, which means that `c2` cannot possibly
// overflow because it's adding at most `1` (since it came from `bool`)
(b, unsafe { intrinsics::disjoint_bitor(c1, c2) })
}

/// Calculates `self` + `rhs` with a signed `rhs`.
Expand Down
5 changes: 5 additions & 0 deletions src/tools/miri/tests/fail/intrinsics/disjoint_bitor.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#![feature(core_intrinsics)]
fn main() {
// one bit in common
unsafe { std::intrinsics::disjoint_bitor(0b01101001_u8, 0b10001110) }; //~ ERROR: Undefined Behavior
}
15 changes: 15 additions & 0 deletions src/tools/miri/tests/fail/intrinsics/disjoint_bitor.stderr
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
error: Undefined Behavior: `assume` called with `false`
--> tests/fail/intrinsics/disjoint_bitor.rs:LL:CC
|
LL | unsafe { std::intrinsics::disjoint_bitor(0b01101001_u8, 0b10001110) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `assume` called with `false`
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE:
= note: inside `main` at tests/fail/intrinsics/disjoint_bitor.rs:LL:CC

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error

13 changes: 13 additions & 0 deletions tests/codegen/bigint-helpers.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
//@ compile-flags: -C opt-level=3

#![crate_type = "lib"]
#![feature(bigint_helper_methods)]

// CHECK-LABEL: @u32_carrying_add
#[no_mangle]
pub fn u32_carrying_add(a: u32, b: u32, c: bool) -> (u32, bool) {
// CHECK: @llvm.uadd.with.overflow.i32
// CHECK: @llvm.uadd.with.overflow.i32
// CHECK: or disjoint i1
u32::carrying_add(a, b, c)
}
30 changes: 30 additions & 0 deletions tests/codegen/intrinsics/disjoint_bitor.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
//@ compile-flags: -C no-prepopulate-passes -Z mir-opt-level=0

#![crate_type = "lib"]
#![feature(core_intrinsics)]

use std::intrinsics::disjoint_bitor;

// CHECK-LABEL: @disjoint_bitor_signed
#[no_mangle]
pub unsafe fn disjoint_bitor_signed(x: i32, y: i32) -> i32 {
// CHECK: or disjoint i32 %x, %y
disjoint_bitor(x, y)
}

// CHECK-LABEL: @disjoint_bitor_unsigned
#[no_mangle]
pub unsafe fn disjoint_bitor_unsigned(x: u64, y: u64) -> u64 {
// CHECK: or disjoint i64 %x, %y
disjoint_bitor(x, y)
}

// CHECK-LABEL: @disjoint_bitor_literal
#[no_mangle]
pub unsafe fn disjoint_bitor_literal() -> u8 {
// This is a separate check because even without any passes,
// LLVM will fold so it's not an instruction, which can assert in LLVM.

// CHECK: store i8 3
disjoint_bitor(1, 2)
}
Loading