Skip to content

WIP: Add in atomic_{min,max}_x intrinsics #1653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

GregBowyer
Copy link
Contributor

I have probably not got this even remotely right.

Recent rust stable seems to have added atomic_max and atomic_min intrinsics. I recently bumped into this being unimplemented while using miri to (hopefully) track down some bugs.

I think this is an implementation of these.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 18, 2020

The implementation lgtm. Could you also add some tests that end up invoking at least one min and one max function from a stable Rust API?

@RalfJung
Copy link
Member

Cc @JCTyblaidd for the data race detector changes

src/data_race.rs Outdated

if cond.to_bool()? {
this.allow_data_races_mut(|this| this.write_immediate(*rhs, place.into()))?;
this.validate_atomic_rmw(place, atomic)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second validate_atomic_rmw is unnecessary, i think atomic max & min provides the same semantics regarding ordering even if the value is not changed? (either way the first validate means this would just increment the counters), I would also move the first validate_atomic_rmw to the end of the function for consistency with the other functions as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume if the value is not changed, then like CmpXchg this is only considered a read, not a write?

Copy link
Contributor

@JCTyblaidd JCTyblaidd Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the two architectures i checked, LLVM compiles AtomicUsize::fetch_max(0, Ordering::AcqRel) to an AtomicUsize::fetch_or(0, Ordering::AcqRel) or similar style atomic ops which supports my interpretation. It would also be unusual for fetch_max to have different semantics to fetch_add and not continue release sequences even if no change occurs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example both of these functions in LLVM:

pub fn null_max_release_u8(x: &AtomicU8) -> u8 {
    x.fetch_max(0, Ordering::Release)
}

pub fn null_max_release_usize(x: &AtomicUsize) -> usize {
    x.fetch_max(0, Ordering::Release)
}

compile to (for Ordering::Release, Ordering::AcqRel, Ordering::SeqCst):

        mfence
        mov     rax, qword ptr [rdi]
        ret

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So adding one atomic semantic check at the end is what we need here, I was confused a little on the exact semantics coming from LLVM.

Copy link
Member

@RalfJung RalfJung Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not concerned about the interaction with release sequences, but about the notion of data races. If a non-atomic access happens concurrently with a non-mutating RMW, is this a data race or not? I thought the answer was "no".

For release sequences (or release semantics in general)... an AcqRel RMW uses Acq semantics for the "unsuccessful" memory ordering, right? So there's no release happening when the compare-exchange fails.

OTOH... there is no such thing as a "failing" fetch_or / fetch_add / fetch_max. So my analogy makes no sense. I retract my objection -- all the "never-failing" fetch_* operations should clearly behave the same, and they should be "release" even if they do not change the value stored at that location. But I expect this also means they are always considered to race with concurrency non-atomic reads, even if they do not change the value stored at that location.

For example both of these functions in LLVM:

FWIW, I don't find this very convincing, as it should easily just show suboptimal code generation. But I found other arguments convincing so it doesn't matter. ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the stable docs for fetch_max say:

Finds the maximum of the current value and the argument val, and sets the new value to the result.

My reading of that sentence is that the operation is "read then compute then store" unconditionally like the other fetch_*.

@GregBowyer
Copy link
Contributor Author

The implementation lgtm. Could you also add some tests that end up invoking at least one min and one max function from a stable Rust API?

The tests are those in the compiler test suite, or are there some other test suite I can add to?

@@ -417,6 +417,16 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx
"atomic_xsub_acqrel" => this.atomic_op(args, dest, BinOp::Sub, false, AtomicRwOp::AcqRel)?,
"atomic_xsub_relaxed" => this.atomic_op(args, dest, BinOp::Sub, false, AtomicRwOp::Relaxed)?,

"atomic_min" => this.atomic_min_max(args, dest, true, AtomicRwOp::SeqCst)?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything fundamentally different between atomic_op and atomic_min_max? Would it make sense to merge them into a single function? As we just established, they should be basically the same in terms of synchronization behavior, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK The atomic_op appears to take its output from binary_op with overflow which I think means that for LT / GT would result in new becoming the condition of the branch (e.g. 1) rather than being max or min

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I am missing something I am obviously super new to the miri codebase

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so currently it uses mir::BinOp to determine the atomic_op. But we could instead use an enum like

enum AtomicOp {
  MirOp(mir::BinOp),
  Max,
  Min
}

and then we should be able to share much more code -- maybe?

Sorry if I am missing something I am obviously super new to the miri codebase

Don't worry, these are good questions. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with that, I was under the impression the style was to keep towards mir more directly. I will play with that a bit when I figure out the test case and my test failures.

I guess the other option is to match on mir::BinOp::{Lt, Gt} too which removes the need for a wrapper and punts the specifics to atomic_op I will also play with that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lt is for less-than though, and this is not "atomic less-than", it is "atomic min". Using Lt for this seems rather hacky.

I was under the impression the style was to keep towards mir more directly.

That's just what worked well so far.^^ But given that all the fetch_ operations behave very similarly, I feel we really should share as much code between them as we can.

@RalfJung
Copy link
Member

The tests are those in the compiler test suite, or are there some other test suite I can add to?

Yes, see the tests/ folder. The compiler test suite is not run in Miri.

@@ -427,6 +427,16 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx
"atomic_max_rel" => this.atomic_min_max(args, dest, false, AtomicRwOp::Release)?,
"atomic_max_acqrel" => this.atomic_min_max(args, dest, false, AtomicRwOp::AcqRel)?,
"atomic_max_relaxed" => this.atomic_min_max(args, dest, false, AtomicRwOp::Relaxed)?,
"atomic_umin" => this.atomic_min_max(args, dest, true, AtomicRwOp::SeqCst)?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these about? If "u" is for "unsigned", then the code needs to be different... < behaves differently in the same bit patterns depending on whether the values are signed or unsigned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;) I was just getting to that, you are fast on the review

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D

Please make sure to add a test that does the max of -1 and 1 (signed) and the min of 1 and usize::MAX (unsigned), to make sure the code uses the signs correctly.

@GregBowyer GregBowyer changed the title Add in atomic_{min,max}_x intrinsics WIP: Add in atomic_{min,max}_x intrinsics Dec 18, 2020
@gThorondorsen
Copy link

miri/src/data_race.rs

Lines 508 to 515 in 9b762c7

fn atomic_op_immediate(
&mut self,
place: MPlaceTy<'tcx, Tag>,
rhs: ImmTy<'tcx, Tag>,
op: mir::BinOp,
neg: bool,
atomic: AtomicRwOp,
) -> InterpResult<'tcx, ImmTy<'tcx, Tag>> {

Why not replace the op and neg arguments with a for<'ctx> fn(ImmTy<'ctx, Tag>, ImmTy<'ctx, Tag>) -> InterpResult<'tcx, ImmTy<'tcx, Tag>> function pointer, which closures without captures can coerce to?

@RalfJung
Copy link
Member

Why not replace the op and neg arguments with a for<'ctx> fn(ImmTy<'ctx, Tag>, ImmTy<'ctx, Tag>) -> InterpResult<'tcx, ImmTy<'tcx, Tag>> function pointer, which closures without captures can coerce to?

A closure would be an alternative to the enum, yeah. It could even for impl for<'ctx> FnOnce ..., so it can actually capture things in the closure.

@RalfJung
Copy link
Member

@GregBowyer What is the status of this, do you still plan to continue working on this PR?

@RalfJung
Copy link
Member

RalfJung commented Feb 6, 2021

@GregBowyer thank you for opening this PR! I am going to close it due to inactivity, but if you (or anyone else) plan to revive this code, take care of the remaining concerns, and drive the PR to completion, feel free to reopen or create a new PR. :)

@RalfJung RalfJung closed this Feb 6, 2021
@henryboisdequin henryboisdequin mentioned this pull request Feb 23, 2021
2 tasks
bors added a commit that referenced this pull request Mar 2, 2021
Add atomic min and max

Closes #1718
Previous attempt: #1653

TODO:

- [x] Merge `atomic_op` and `atomic_min_max` functions
- [x] Fix CI

**Note:** this PR also removes arbitrary trailing whitespace and generally formats the affected files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants