-
Notifications
You must be signed in to change notification settings - Fork 389
WIP: Add in atomic_{min,max}_x
intrinsics
#1653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The implementation lgtm. Could you also add some tests that end up invoking at least one min and one max function from a stable Rust API? |
Cc @JCTyblaidd for the data race detector changes |
src/data_race.rs
Outdated
|
||
if cond.to_bool()? { | ||
this.allow_data_races_mut(|this| this.write_immediate(*rhs, place.into()))?; | ||
this.validate_atomic_rmw(place, atomic)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second validate_atomic_rmw is unnecessary, i think atomic max & min provides the same semantics regarding ordering even if the value is not changed? (either way the first validate means this would just increment the counters), I would also move the first validate_atomic_rmw to the end of the function for consistency with the other functions as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume if the value is not changed, then like CmpXchg this is only considered a read, not a write?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the two architectures i checked, LLVM compiles AtomicUsize::fetch_max(0, Ordering::AcqRel) to an AtomicUsize::fetch_or(0, Ordering::AcqRel) or similar style atomic ops which supports my interpretation. It would also be unusual for fetch_max to have different semantics to fetch_add and not continue release sequences even if no change occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example both of these functions in LLVM:
pub fn null_max_release_u8(x: &AtomicU8) -> u8 {
x.fetch_max(0, Ordering::Release)
}
pub fn null_max_release_usize(x: &AtomicUsize) -> usize {
x.fetch_max(0, Ordering::Release)
}
compile to (for Ordering::Release, Ordering::AcqRel, Ordering::SeqCst):
mfence
mov rax, qword ptr [rdi]
ret
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So adding one atomic semantic check at the end is what we need here, I was confused a little on the exact semantics coming from LLVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not concerned about the interaction with release sequences, but about the notion of data races. If a non-atomic access happens concurrently with a non-mutating RMW, is this a data race or not? I thought the answer was "no".
For release sequences (or release semantics in general)... an AcqRel RMW uses Acq semantics for the "unsuccessful" memory ordering, right? So there's no release happening when the compare-exchange fails.
OTOH... there is no such thing as a "failing" fetch_or / fetch_add / fetch_max. So my analogy makes no sense. I retract my objection -- all the "never-failing" fetch_*
operations should clearly behave the same, and they should be "release" even if they do not change the value stored at that location. But I expect this also means they are always considered to race with concurrency non-atomic reads, even if they do not change the value stored at that location.
For example both of these functions in LLVM:
FWIW, I don't find this very convincing, as it should easily just show suboptimal code generation. But I found other arguments convincing so it doesn't matter. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, the stable docs for fetch_max
say:
Finds the maximum of the current value and the argument val, and sets the new value to the result.
My reading of that sentence is that the operation is "read then compute then store" unconditionally like the other fetch_*
.
The tests are those in the compiler test suite, or are there some other test suite I can add to? |
@@ -417,6 +417,16 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx | |||
"atomic_xsub_acqrel" => this.atomic_op(args, dest, BinOp::Sub, false, AtomicRwOp::AcqRel)?, | |||
"atomic_xsub_relaxed" => this.atomic_op(args, dest, BinOp::Sub, false, AtomicRwOp::Relaxed)?, | |||
|
|||
"atomic_min" => this.atomic_min_max(args, dest, true, AtomicRwOp::SeqCst)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything fundamentally different between atomic_op
and atomic_min_max
? Would it make sense to merge them into a single function? As we just established, they should be basically the same in terms of synchronization behavior, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK The atomic_op
appears to take its output from binary_op
with overflow which I think means that for LT
/ GT
would result in new becoming the condition of the branch (e.g. 1
) rather than being max or min
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I am missing something I am obviously super new to the miri codebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so currently it uses mir::BinOp
to determine the atomic_op
. But we could instead use an enum like
enum AtomicOp {
MirOp(mir::BinOp),
Max,
Min
}
and then we should be able to share much more code -- maybe?
Sorry if I am missing something I am obviously super new to the miri codebase
Don't worry, these are good questions. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with that, I was under the impression the style was to keep towards mir
more directly. I will play with that a bit when I figure out the test case and my test failures.
I guess the other option is to match on mir::BinOp::{Lt, Gt}
too which removes the need for a wrapper and punts the specifics to atomic_op
I will also play with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lt
is for less-than though, and this is not "atomic less-than", it is "atomic min". Using Lt
for this seems rather hacky.
I was under the impression the style was to keep towards mir more directly.
That's just what worked well so far.^^ But given that all the fetch_ operations behave very similarly, I feel we really should share as much code between them as we can.
fa551a9
to
57ef47a
Compare
Yes, see the |
@@ -427,6 +427,16 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx | |||
"atomic_max_rel" => this.atomic_min_max(args, dest, false, AtomicRwOp::Release)?, | |||
"atomic_max_acqrel" => this.atomic_min_max(args, dest, false, AtomicRwOp::AcqRel)?, | |||
"atomic_max_relaxed" => this.atomic_min_max(args, dest, false, AtomicRwOp::Relaxed)?, | |||
"atomic_umin" => this.atomic_min_max(args, dest, true, AtomicRwOp::SeqCst)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these about? If "u" is for "unsigned", then the code needs to be different... <
behaves differently in the same bit patterns depending on whether the values are signed or unsigned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;) I was just getting to that, you are fast on the review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:D
Please make sure to add a test that does the max of -1
and 1
(signed) and the min of 1
and usize::MAX
(unsigned), to make sure the code uses the signs correctly.
atomic_{min,max}_x
intrinsicsatomic_{min,max}_x
intrinsics
Lines 508 to 515 in 9b762c7
Why not replace the |
A closure would be an alternative to the enum, yeah. It could even for |
@GregBowyer What is the status of this, do you still plan to continue working on this PR? |
@GregBowyer thank you for opening this PR! I am going to close it due to inactivity, but if you (or anyone else) plan to revive this code, take care of the remaining concerns, and drive the PR to completion, feel free to reopen or create a new PR. :) |
I have probably not got this even remotely right.
Recent rust stable seems to have added
atomic_max
andatomic_min
intrinsics. I recently bumped into this being unimplemented while using miri to (hopefully) track down some bugs.I think this is an implementation of these.