Activity
HadrienG2 commented on May 11, 2019
While SeqCst is indeed most likely to be correct if using atomics is correct at all, given that it is a restrictive superset of the other atomic orderings, I would be hesitant to promote it as strongly as e.g. C++ does, because when compared with Acquire/Release the extra guarantees that it provides are relatively obscure, easy to misunderstand and lose, and expensive to provide in hardware.
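To make the Acquire/Release comparison concrete, here is a minimal message-passing sketch (hypothetical names, not taken from the discussion): the writer publishes a value and raises a flag with Release, and a reader that observes the flag with Acquire is guaranteed to see the value, with no SeqCst anywhere.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

// The payload and the flag. Only the flag accesses need ordering stronger
// than Relaxed; the Release/Acquire pair on READY is what transfers
// visibility of the DATA store to the reader.
static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn writer() {
    DATA.store(42, Ordering::Relaxed);    // the payload itself can be relaxed
    READY.store(true, Ordering::Release); // "publish": orders the store above before the flag
}

fn reader() -> u32 {
    while !READY.load(Ordering::Acquire) {} // "subscribe": pairs with the Release store
    DATA.load(Ordering::Relaxed)            // guaranteed to observe 42
}

fn main() {
    let w = thread::spawn(writer);
    let r = thread::spawn(reader);
    w.join().unwrap();
    assert_eq!(r.join().unwrap(), 42);
}
```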
I think the right thing to do might rather be to tell prospective users that atomics are an expert synchronization feature, and that like every sharp tool they require some prior safety training. In my experience, I have found that once one has built up a solid understanding of the underlying memory model issues, acquire / release becomes remarkably easy to understand, and the use cases that actually require SeqCst (which are rare) become more obvious.
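As one illustration of a case that genuinely requires SeqCst, here is a sketch of the classic store-buffering pattern (the core of Dekker's algorithm; variable names are made up for this example). With Acquire/Release alone, both threads may read `false`; SeqCst forbids that outcome by placing all four accesses in a single total order.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Each thread raises its own flag, then checks the other's. Under SeqCst,
// "both threads see false" is impossible: whichever store comes first in the
// total order is visible to the other thread's load.
static X: AtomicBool = AtomicBool::new(false);
static Y: AtomicBool = AtomicBool::new(false);

fn main() {
    let a = thread::spawn(|| {
        X.store(true, Ordering::SeqCst);
        Y.load(Ordering::SeqCst)
    });
    let b = thread::spawn(|| {
        Y.store(true, Ordering::SeqCst);
        X.load(Ordering::SeqCst)
    });
    let (saw_y, saw_x) = (a.join().unwrap(), b.join().unwrap());
    // At least one thread must have observed the other's store.
    assert!(saw_y || saw_x);
}
```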
I can share various learning resources on this topic which I have personally found very helpful if other people agree with me that this "safety training" approach is most appropriate and want to help those going through that learning process.
tesuji commented on May 11, 2019
@HadrienG2 Where are those? I wanna know.
HadrienG2 commented on May 11, 2019
When it comes to Rust atomics in particular, I love this blog post from @jeehoonkang, which provides a reasonably accessible glimpse of the promising semantics research that they participated in. Basically, the goal of this blog post is to explain what programmers can assume about relaxed atomics and what the various atomic memory orderings add on top of that, and I think it does that beautifully.
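As a concrete baseline for what one can assume about relaxed atomics: each individual relaxed operation is still atomic (no torn reads or lost increments), which already suffices for a plain event counter. A minimal sketch, with names invented for this example:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Relaxed guarantees per-operation atomicity but no cross-thread ordering.
// A counter only needs the former; the thread joins below provide whatever
// synchronization is needed to read the final total.
static EVENTS: AtomicU64 = AtomicU64::new(0);

fn count_events() -> u64 {
    let start = EVENTS.load(Ordering::Relaxed);
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    EVENTS.fetch_add(1, Ordering::Relaxed); // atomic even when Relaxed
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    EVENTS.load(Ordering::Relaxed) - start // no increment is ever lost
}

fn main() {
    assert_eq!(count_events(), 4000);
}
```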
If you have time for further reading, another resource which I love is the Preshing on Programming blog archive. After the standardization of the C++11 memory model (which matters to Rust because we essentially use it), starting around 2012, this blog published many articles on lock-free programming, some underlying hardware issues, how people reason about them when coding close to the metal (read/write barriers and the like), and how the C++11 model builds higher-level abstractions on top of that understanding. I found this series of articles very enlightening, as it allowed me to:

- Understand why some atomic orderings are forbidden or broken in the C++11 model (e.g. read + release), by figuring out that they just don't make sense at the hardware level.
- Solidify my understanding of memory fences.
- Cross-check my understanding with many examples and several reformulations of the same idea.
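On the memory-fence point: the standalone-barrier style those hardware-oriented articles discuss maps directly onto `std::sync::atomic::fence` in Rust. A sketch (assumed names, same message-passing shape as before), where a Release fence before a relaxed flag store and an Acquire fence after a relaxed flag load play the role of orderings attached to the flag accesses themselves:

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn publish() {
    DATA.store(42, Ordering::Relaxed);
    fence(Ordering::Release);             // orders the data store before the flag store
    READY.store(true, Ordering::Relaxed);
}

fn consume() -> u32 {
    while !READY.load(Ordering::Relaxed) {} // spin until the flag is raised
    fence(Ordering::Acquire);               // pairs with the Release fence above
    DATA.load(Ordering::Relaxed)            // guaranteed to observe 42
}

fn main() {
    let p = thread::spawn(publish);
    let c = thread::spawn(consume);
    p.join().unwrap();
    assert_eq!(c.join().unwrap(), 42);
}
```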
Another nice resource that also takes a very "hardware" point of view on memory accesses is the Linux kernel memory barrier documentation, which AFAIK is what the C++ standards committee started from when they designed their memory model (notice how even the terminology is similar). There was also a nice series of articles on LWN about it, but I'm not sure I could find it again as I read it quite a while ago.
Now, one pedagogical resource which I do not have and would welcome is a single localized discussion of compiler issues around lock-free programming. The basic problem is easy to state: compiler optimizers basically assume that code is single-threaded and will perform transformations that add, remove or reorder memory accesses in a manner that is invisible in serial code, but can break the correctness of parallel code as other threads can observe the corresponding memory access changes. The "reordering" part is familiar from hardware, but the "adding/removing accesses" part is more specific to compilers and even nastier. Thankfully, these transformations can be restricted using many code constructs of varying effectiveness, of which atomics are basically the latest (and arguably most ergonomic) iteration.
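One way to see the "removing accesses" half of this in Rust terms: a busy-wait on a plain `bool` shared through unsafe aliasing could legally have its load hoisted out of the loop (nothing in the loop writes it), spinning forever; an atomic load is a side effect the compiler must redo on every iteration. A hypothetical stop-flag sketch:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

// Because the load below is atomic, the compiler must re-perform it on each
// iteration, so the store from the main thread is eventually observed and
// the worker terminates. With a plain bool, hoisting the load would be a
// legal "single-threaded" optimization that breaks the loop.
static STOP: AtomicBool = AtomicBool::new(false);

fn main() {
    let worker = thread::spawn(|| {
        while !STOP.load(Ordering::Relaxed) { // re-read each iteration
            std::hint::spin_loop();
        }
    });
    thread::sleep(Duration::from_millis(1));
    STOP.store(true, Ordering::Relaxed); // Relaxed suffices: only the flag itself matters
    worker.join().unwrap(); // terminates because the load cannot be hoisted
}
```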
However, this is stating the problem, not the solution. It would be nice to have a more solid explanation of what exactly compiler optimizers can and can't do, why, and how programmers can interact with them effectively to produce correct thread synchronization transactions. For example, the correctness of thread synchronization protocols that involve both atomic and non-atomic memory accesses (think mutexes) is very far from obvious, and I have yet to find a solid justification for why it is supposed to keep working as compiler optimizers become more and more sophisticated. I believe that resolving this sort of problem is one key goal of the people who design programming language memory models.
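The mutex case can at least be sketched. Here is a hypothetical minimal spinlock (illustrative only, not a production lock): the Acquire on lock and Release on unlock are precisely what make the *non-atomic* accesses to the protected data race-free, because every critical section happens-before the next one through the flag.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

struct SpinLock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>, // accessed non-atomically, protected by `locked`
}

// Safe to share across threads because all access to `data` is serialized
// by the flag below.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    const fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), data: UnsafeCell::new(value) }
    }

    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Acquire: observe all writes made under previous critical sections.
        while self.locked.swap(true, Ordering::Acquire) {}
        let result = f(unsafe { &mut *self.data.get() }); // non-atomic access, protected
        // Release: publish our writes to the next lock holder.
        self.locked.store(false, Ordering::Release);
        result
    }
}

fn main() {
    static COUNTER: SpinLock<u64> = SpinLock::new(0);
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| for _ in 0..1000 { COUNTER.with(|n| *n += 1); }))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(COUNTER.with(|n| *n), 4000);
}
```

The open question in the paragraph above is exactly why the compiler is forbidden from moving the non-atomic accesses out from between the Acquire and the Release as optimizers grow more aggressive.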
HadrienG2 commented on May 13, 2019
Pinging @RalfJung and @jeehoonkang; given their usual areas of interest, they might have more material to suggest or further insight on the "what exactly optimizers can do and how programmers can prevent it through language constructs where undesirable" topic.
RalfJung commented on May 13, 2019
My stance is that one should rarely use SeqCst. In the vast majority of cases, release-acquire semantics are strong enough, and by restricting the possible correctness arguments to "message-passing style" (and excluding "case distinction on interleavings"), I feel they also promote a better way to think about synchronization in concurrent programs. If your code needs SeqCst, you are doing something really complicated, and there should be a long comment explaining what is going on.

So, I would be opposed to declaring SeqCst
the default. Also see my arguing in this rejected RFC. However, that RFC also showed that my opinion on these matters is a minority opinion here.

Dylan-DPC commented on Dec 8, 2023
Closing this as there's no consensus on a default for the orderings, and this needs a bigger discussion before the implementation is added.