Adopt a Zig Memory Model #6396

mb64 · 2020-09-23T00:51:00Z

TL;DR: Zig should adopt an official memory model. Some options are to use the C/C++ memory model (easy), or its own better memory model (very hard).

(Note that this issue is about the semantics of concurrent access, not other just-as-important things like pointer aliasing rules and provenance. That should also be specified, but it's the job of another issue.)

The problem

The memory model determines which concurrent programs are valid, and is important as the ground truth for what transformations the compiler is allowed to do. Currently, the Zig memory model is by default "whatever LLVM says". It would be nice to have a Zig memory model independent of LLVM. LLVM memory model

This is by no means urgent -- despite being the subject of significant formal verification efforts, Rust still lacks its own memory model, instead defaulting to the C11/C++11 spec.

Here are some reasons to not just stick with LLVM's model:

The Zig language would be tied to the LLVM release schedule. If LLVM changes what it does in the future, Zig would be forced to either change with it, or implement some kind of backwards compatibility on top of it.
Other Zig backends (Stage2, as well as any other future backend) are forced to have feature parity with LLVM
The LLVM memory model is more docs than spec, so it lacks the formalisms that would unambiguously deal with all edge cases. This also has implications for other Stage2/other backends -- if LLVM starts compiling some code in a more conservative way than Stage2, that shouldn't retroactively make Stage2's behavior a bug.

The LLVM memory model also (somewhat necessarily) incorporates a lot of cruft, from C, Java, and probably more -- what could "starting over" with a memory model fix?

Reordering of dependent loads is allowed by C/C++ relaxed loads, and maybe by Java non-volatiles (this is unclear to me, maybe someone else can say for sure). LLVM never allows reordering of dependent loads, by simply not supporting the DEC Alpha processor. If Zig is to ever be run on the DEC Alpha, Zig will have to take a stance on this.
LLVM doesn't support C/C++ consume operations, which are arguably broken. Instead, Clang changes all consumes to acquires. Consume semantics were introduced for a reason though, so it's worth thinking about how Zig could succeed where the C/C++ standard failed.
LLVM has unordered for Java non-volatile, and monotonic for C/C++ relaxed. It also asserts that monotonic is strictly more ordered than unordered. However, C/C++ relaxed allows some things that Java non-volatile doesn't. This section of the Java memory model has an example:

Initially x = 0, y = 0.

Thread 1 Thread 2

int tmp1 = x; int tmp2 = y;

if (tmp1 == 1) y = 1; if (tmp2 == 1) x = 1;

Assuming relaxed loads/stores to x and y, C/C++ allows this program to end with x = y = 1, but Java non-volatiles disallow this result. This is largely considered a bug in the C/C++ spec, but (afaik) all fixes are either too restrictive or require significant changes to how the memory model is specified.

Proposed solution

Start off by officially adopting the C/C++ memory model, minus consume. (This is what Rust has currently.)
- Introduce AtomicOrder.Relaxed, which gets translated to LLVM's monotonic, and remove/deprecate AtomicOrder.Unordered and AtomicOrder.Monotonic.
- More importantly, improve the AtomicOrder docs, including stating that Zig follows the C11/C++11 memory model.
If consume is deemed useful in the future, introduce something like consume in a way that fixes the poor C/C++ semantics. This could be implemented by LLVM with acquire, and in a more fine-grained way in Stage2.
Zig might eventually wish to replace the C/C++ memory model with its own memory model, as it's currently doing with other C infrastructure like libc. This would be very hard, and I don't expect it to happen soon, if ever.

The text was updated successfully, but these errors were encountered:

andrewrk · 2020-09-23T00:53:29Z

Note that in the language specification (#75) the memory model will be specified without referencing an external definition (even if it happens to match C11).

rohlem · 2020-09-23T21:24:45Z

I've also extensively tried to wrap my head around memory models, so I'll try to partake in the discussion when it eventually happens.
Just a couple of quick notes for the interested:

Different memory orders (or "AtomicOrder") are a high-level way of reasoning which CPU instructions to use; a more-strictly-defined order means you get more guarantees ("sane behaviour"), but need stronger fencing (and less optimizations through reordering). Consume wasn't "broken" conceptually (IIUC), but it didn't end up mapping to the set of instructions (on older-generation ARM targets I think?) it was designed for - compilers incorrectly assuming so had a broken implementation.
The C/C++11 model is sort of the 'best attempt' at an efficient abstraction so far, but it is not (yet?) bullet-proof due to "out-of-thin-air" behaviour. Since its inception, dozens of papers have been published trying to rectify this, most of them incur stronger ordering/costs. The newest most promising one seems to address all issues without penalty, and hasn't been disproved for now. Summary, Presentation, Paper
The C/C++11 model turns programs with race conditions wholly-undefined. One sweeping solution is to "localize" their effects, which is done by the "Multicore OCaml" memory model (Paper, Implementation with doc comment); they essentially have to disallow certain optimizations, and use fencing store instructions where concurrent access can occur (or might have occured).
(race conditions cont.) While undefined behaviour is a bad fit for Zig, ReleaseFast will probably not want to forfeit optimizations, and I think it's generally difficult to transition from a "safer" model, unless we can provide some runtime detection (for Debug / ReleaseSafe, which I've read is a non-trivial task) and guarantee it doesn't interfer with any of the optimization passes (which may be the source of race conditions themselves).

If LLVM changes what it does in the future, Zig would be forced to either change with it, or implement some kind of backwards compatibility on top of it.

As long as clang supports C/C++11 - 20+, adopting their memory model puts us in the same boat, which seems reasonably stable to me (though maybe only on their target platforms).
I agree that it's a difficult topic, which shouldn't (and needn't) be rushed. IMO 1.0 can probably do with picking the stable sections of the state-of-the-art C/C++ model that we're interested in supporting.

HugoFlorentino · 2021-08-12T09:58:44Z

Andrew, have you ever read the way Pony language deals with concurrency? From what I have read, even though it is an actor-based oriented language using a GC, the type system has been formally verified, and the reference capabilities leave no option to write unsafe code. Maybe Zig could include something similar, it seems like a very elegant approach to safe concurrency.

Some interesting documents worth reading:

https://sites.google.com/a/athaydes.com/renato-athaydes/posts/fearlessconcurrencyhowclojurerustponyerlanganddartletyouachievethat

https://www.ponylang.io/media/papers/a_prinicipled_design_of_capabilities_in_pony.pdf

https://uu.diva-portal.org/smash/get/diva2:1363822/FULLTEXT01.pdf

http://www.ponylang.org/media/papers/opsla237-clebsch.pdf

matu3ba · 2021-11-06T19:11:08Z

Some stuff worth to follow:
Discussion on how Rust could decide about their memory model with article by maintainer of Linux Kernel Memory Model title "What Memory Model Should the Rust Language Use?" by Paul E. McKenney.
TLDR; "My recommendation is therefore to adopt the untroublesome portions of the C/C++ memory model in safe mode, and the rest in unsafe mode."

matu3ba · 2021-11-29T23:20:33Z

More relevant discussion on the c side of things:
propsal n2676 for c memory model. tldr;
"additional machinery may well be desirable for PNVI-ae and PNVI-ae-udi
to give programmers more control of the provenance of the results of byte manipulations, and of what is left marked
as exposed. The design of that machinery should ideally be based on the treatment of representation-byte-accessed
pointer values by existing compiler alias analyses and optimisations"

also @SpexGuy "The primary goal there is to ensure that ptr->int->ptr casts are preserved. Which is good, but there are many other operations that can cause provenance information to be lost. Loading a pointer from memory, for example, or obtaining one from an extern fn."

Contrary positions: how iso c became unusable for OS devs holding the same position as Linus Torvalds and M. Anton Ertl:
define operations in a virtual machine and ... then each operation (like "load" vs "load_acquire" vs "load_volatile") would have well-defined semantics in that virtual machine. And then a compiler writer would be able to do any optimization he dam,n well please, as long as it is semantically 100% equivalent.
Unfortunately Ertl and Torvalds dont express how these semantics would fit with the reality of how optimizing compilers work (see below).

Ralf Jung explains that making stuff too platform specific does not reflect reality of what compilers do and gives an example why this is a bad idea:
"any out-of-bounds write could potentially modify any other variable (at least any variable that has an address in memory). This can make even the most basic parts of high-quality code generation, such as register allocation, tricky or impossible: a variable that has its address taken has to be re-loaded from that same address any time an out-of-bounds write might have happened, since that write might just have hit the right address to change this variable’s value."
(He also mentions that some compilers may be sufficiently simple in not doing optimizations that they would be ok with doing that.)

Further more Jung mentions that there is no data on what is possible in performance improvements with more UB, but that likely one does not use C in the beginning for that.

mb64 · 2021-11-30T01:44:17Z

stuff about provenance

That's an interesting discussion, but it might be better on a different issue. (So far this issue has largely only been about concurrent memory accesses.) Do we have an issue for discussing pointer provenance and aliasing rules?

matu3ba · 2021-11-30T08:22:32Z

it might be better on a different issue

The context will remain the same, so it would be linked anyway.

Do we have an issue for discussing pointer provenance and aliasing rules?

If the thread becomes to long, we can make linked meta-issues. Only proposals to use them, but not how and why with semantics: #1108 #7742
Closely related are the comptime memory rules #7770 #9646

There was also a related proposal, which sounds to me like it goes into direction of storing the provenance explicit by the programmer (though I have no idea what and how provenance is planned to be encoded and used): #7887

Related to pointer semantics is #15232.

matu3ba · 2022-02-13T14:49:25Z

One formal description of ReleaseSafe could be inspirited by "A Formal Model of Checked C" by Li et al, although (bound-)"checked c" and the paper do no statements on allocation errors like double free.
link to checked c
Likewise, Li et al do not document the performance/compilation time cost of converting fat pointers+checks into normal "weaker typed" ones. link to paper on arxiv

matu3ba · 2022-03-20T13:14:59Z

Another relevant idea on the Rust side is to put the provenance information as being part of the type akind to what CHERI is doing: https://gankra.github.io/blah/fix-rust-pointers/.
Article is also a great read as introduction into 1. Allocations and 2. Pointer Provenance.
It is suggested to 1. distinguish between Pointers And Addresses, 2. special syntax for places and offsets (access via adress and raw pointer).

This has been implemented and is described: https://gankra.github.io/blah/tower-of-weakenings/.
Article suggest "The Tower Of Weakenings" to have something super strict and checkable on default with weakening to allow optimizations etc.
Article also describes PNVI-ae and PNVI-ae-udi more in detail.

matu3ba · 2022-04-12T22:22:05Z

An update with accurate description of PNVI-ae-udi and the Rust equivalent was
posted by Ralf Jung.
PNVI-ae-udi means "provenance not via integers adress expose user disambiguation".

Rust

ptrtoint casting has sideeffects
except: if inttoptr is never used
- strict provenance experiment
  to evaluate if int->ptr can be replaced
- could be ensured with the type system by tracking provenance with permissions
ptr-int transmutation (union access as ptr from int)
- use case 1: chaining as casts
- use case 2: some way to hold arbitrary data in container of a given time
- => evaluate on removing it from the language
on inttoptr guess provenance from all provenances previously marked as "exposed" (globally all)
- any ptrtoint casting exposes the provenance of "ptr": its permission is “up for grabs”
- this includes potential aliasing etc
code that does not syntactically contain exposing pointer-integer
or integer-pointer casts can forget that such casts exist at all.

C

PNVI-ae-udi does not account for the restrict modifier of C
udi part: set of valid guesses in C is a lot more restricted
ptr-int transmutation (union access as ptr from int)
- legacy code
- no type that can be used to hold data without losing provenance
- not all loads at int type have provenance, only loads at char type

LLVM

ptr-int transmutation with many semantic options for LLVM to choose
- load behaves like ptrtoint: strip provenance + expose ptr
- load strips provenance without exposing ptr
- load is UB or return poison
- load produces int with provenance and any computation on such int is UB
int-ptr transmutation:
- load has same effects of inttoptr
- load creates ptr with "invalid" provenance
- load procudes poison

Example, where optimization will still fail with reason (not in Ralf's work), but described in comments

call a C function that returns a pointer
- unkown provenance
- do not optimize anything based on pointers as fallback
- optimization would require additional guarantee
  1. pointer was not exposed/modified and provided by us or
  2. accurate provenance information are stored in between compilation units
- "blog post is about the specification, where we always know which function is being called"
- "compiler has to assume it has suitable provenance and that provenance might already be exposed. Or it might not. The compiler has to do something that is correct in both cases."

matu3ba · 2022-04-24T20:37:37Z

"Memory Consistency & Atomicity" consists of 2 parts, which have an effect on program correctness:

compiler reorderings (for optimizations)
execution reorderings (which instructions are emitted to ensure the CPU cores do the correct thing)

More details of memory barriers https://stackoverflow.com/a/61711095

// Software Memory Models in C++
// memory_order | fences
// relaxed      | None
// consume      | LoadLoad, LoadStore (fences only for vars that are data-dependent on that atomic)
// acquire      | LoadLoad, LoadStore
// release      | LoadStore, StoreStore
// acq_rel      | LoadLoad, LoadStore, StoreStore
// seq_cst      | All (default order)

LoadStore fence means that reorderings from

Load instruction
Store instruction

into

Store instruction
Load instruction

are prevented for both compiler and CPU core.

Default of all fences looks to be LoadStore + another fence, so LoadStore comes up always as a combination in the C11/C11++ memory model.

Not sure, if this is library is relevant: http://mintomic.github.io/.

matu3ba · 2022-04-25T20:14:31Z

Current problems of C11 memory model (execution semantics for multiple CPU cores) outlined here: https://llvm.org/devmtg/2017-03/assets/slides/weak_memory_concurrency_in_c_cxx11_and_llvm.pdf
The C11 memory model is

not mathematically sane: not monotone
not too weak: Reasoning principles for C11 subsets.
too strong: Compilation to Power and ARM is broken.
not actually useful: It disallows intended program transformations (optimzations)

I think this is the related paper: https://people.mpi-sws.org/~viktor/papers/popl2015-c11comp.pdf

Lots of work going on with different approaches, but desired optimizations could not be verified yet https://arxiv.org/pdf/2108.06944.pdf

matu3ba · 2022-08-08T14:51:39Z

An idealized human readable model (currently restricted to sequential) memory access and incomplete for Rust is given in minirust. This is intended as the specification later to be verified against.
More interactively given is https://github.com/nikomatsakis/a-mir-formality/. However, that is still operational by the underlying language.
"PLT Redex is a domain-specific language designed for specifying and debugging operational semantics. Write down a grammar and the reduction rules, and PLT Redex allows you to interactively explore terms and to use randomized test generation to attempt to falsify properties of your semantics. "

A full specification is aimed for in https://github.com/ferrocene/specification, but this specification is like C and C++ wishful axiomatic ("style is axiomatic") and yet without any formal proof or reasoning.

matu3ba · 2022-08-10T06:27:11Z

Suggested by Ralf, "Promising 2.0: Global Optimizations in Relaxed Memory Concurrency" by Lee et al. looks like the latest state of art on a model for weak memory semantics and summarizes the main problems of other models including the shortcomings of C11. Link https://dl.acm.org/doi/pdf/10.1145/3385412.3386010

TLDR;

1. Weak memory models will always be more complex to support thread-local + global optimizations affecting each other, but these optimizations are opt-in without creation of unsoundness.
1. C11 or other models either dont support both or not all possible hardware semantics/optimizations.
1. More research is needed to compare the tradeoffs, for example to fix the C11 thin-air problem, see https://arxiv.org/pdf/2007.09944.pdf for results+proofs and https://people.mpi-sws.org/~viktor/papers/esop2021-decidable.pdf for future work "Finally, studying the decidability problem for related models that solve the thin-air problem (e.g., Paviotti et al. [27 ]) is interesting and kept as future work."

matu3ba · 2023-07-22T09:15:49Z

Some notes from the excellent to read "Rust Atomics and Locks Low-Level Concurrency in Practice" by Mara Bos (https://marabos.nl/atomics/):

Zig currently supports only hardware fences and compiler fences (for stuff on the same thread like signal handler and interrupt handler/exceptions) are not yet supported, as far as I understand. Ideally, this could be solved via an annotated inline assembly pass from comptime to allow optimal flexibility, but probably intrinsics will be offered to keep things simple at performance cost (ARM etc). (Zig has calling convention for that)
(To be checked) Zig semantics currently are according to the C11 memory model with hopefully change (likely according to LLVM semantics) to eliminate the "out of thin air problem" in the formal model.
LLVM is currently not able to model and optimize low level primitives at scale and thus intrinsics should be preferred instead of inline assembly.

TODO Hardware protocols (MOESI etc), functionality, future and influence of hardware platform + configuration

matu3ba · 2023-10-11T21:14:08Z

For example programs and a tool for checking, see "Exploring C Semantics and Pointer Provenance"
Looks like there is a broad zoo of memory models, here an overview from "VIP: Verifying Real-World C Idioms with Integer-Pointer Casts" link:

Most notably, "concrete models, in which pointer values are simple machine words or integers" are used in HOL, seL4, "abstract models like that of CompCert in which pointers are represented as a pair of an abstract allocation identifier and an offset in the corresponding allocation block" and "abstract models can only support restricted forms of conversions between pointers and integer".
PNVI-ae-udi looks to be the most realisting one to make it into the C standard, but its complexity make it non appealing for verification.
"The main idea of VIP is to provide the user with a new construct, copy_alloc_id, by means of which they can explicitly annotate integer-to-pointer casts with a pointer value from which to copy the provenance."

More interestingly though, "PNVI-ae-udi tracks (1) ambiguities in provenance and (2) provenance exposure whereas VIP
does not. On the other hand, (3) VIP tracks provenance in integers (in a limited way, for round-trip
casts) whereas PNVI-ae-udi does not, and (4) VIP relies on the copy_alloc_id primitive, which is
not available in PNVI-ae-udi.".

Appendum (20231012):

Afaiu, temporal (undereferenced) out of bounds pointers are also not forced to be UB in contrast to C.
From what I understand, the one past the end ambiguous provenance of pointers originates from pointer that were casted to integers losing their provenance (applicative to PNVI-ae-udi, but not to VIP model).
This model looks much simpler, but I have no clue what LLVM is or will be doing. Question is in https://discord.com/channels/636084430946959380/636732535434510338/1161785114603298947, but I think llvm discord is more chatter and I'll ask on IRC.

zxubian · 2023-12-02T10:07:39Z

Some resources:

memory models in general & some specifics of GoLang's case: https://research.swtch.com/mm

andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 23, 2020

andrewrk added this to the 0.7.0 milestone Sep 23, 2020

andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 13, 2020

andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021

rohlem mentioned this issue Nov 6, 2021

Generate checks for all forms of UB in Debug and ReleaseSafe modes #2301

Open

andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 23, 2021

andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022

andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023

andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023

kprotty mentioned this issue Jul 22, 2023

langref: improve atomics documentation #16483

Closed

rohlem mentioned this issue Jun 29, 2024

Global write optimized-out even though side effect is possible #20452

Open

rohlem mentioned this issue Sep 12, 2024

while(true) getting removed when building for release mode (?) #21396

Closed

Rexicon226 mentioned this issue Sep 20, 2024

llvm: Emit fence for zero-sized atomic loads/stores. #21461

Closed

andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adopt a Zig Memory Model #6396

Adopt a Zig Memory Model #6396

mb64 commented Sep 23, 2020

andrewrk commented Sep 23, 2020

rohlem commented Sep 23, 2020

HugoFlorentino commented Aug 12, 2021 •

edited

Loading

matu3ba commented Nov 6, 2021

matu3ba commented Nov 29, 2021 •

edited

Loading

mb64 commented Nov 30, 2021

matu3ba commented Nov 30, 2021 •

edited

Loading

matu3ba commented Feb 13, 2022 •

edited

Loading

matu3ba commented Mar 20, 2022 •

edited

Loading

matu3ba commented Apr 12, 2022 •

edited

Loading

matu3ba commented Apr 24, 2022 •

edited

Loading

matu3ba commented Apr 25, 2022

matu3ba commented Aug 8, 2022 •

edited

Loading

matu3ba commented Aug 10, 2022 •

edited

Loading

matu3ba commented Jul 22, 2023 •

edited

Loading

matu3ba commented Oct 11, 2023 •

edited

Loading

zxubian commented Dec 2, 2023

Adopt a Zig Memory Model #6396

Adopt a Zig Memory Model #6396

Comments

mb64 commented Sep 23, 2020

The problem

Proposed solution

andrewrk commented Sep 23, 2020

rohlem commented Sep 23, 2020

HugoFlorentino commented Aug 12, 2021 • edited Loading

matu3ba commented Nov 6, 2021

matu3ba commented Nov 29, 2021 • edited Loading

mb64 commented Nov 30, 2021

matu3ba commented Nov 30, 2021 • edited Loading

matu3ba commented Feb 13, 2022 • edited Loading

matu3ba commented Mar 20, 2022 • edited Loading

matu3ba commented Apr 12, 2022 • edited Loading

matu3ba commented Apr 24, 2022 • edited Loading

matu3ba commented Apr 25, 2022

matu3ba commented Aug 8, 2022 • edited Loading

matu3ba commented Aug 10, 2022 • edited Loading

matu3ba commented Jul 22, 2023 • edited Loading

matu3ba commented Oct 11, 2023 • edited Loading

zxubian commented Dec 2, 2023

HugoFlorentino commented Aug 12, 2021 •

edited

Loading

matu3ba commented Nov 29, 2021 •

edited

Loading

matu3ba commented Nov 30, 2021 •

edited

Loading

matu3ba commented Feb 13, 2022 •

edited

Loading

matu3ba commented Mar 20, 2022 •

edited

Loading

matu3ba commented Apr 12, 2022 •

edited

Loading

matu3ba commented Apr 24, 2022 •

edited

Loading

matu3ba commented Aug 8, 2022 •

edited

Loading

matu3ba commented Aug 10, 2022 •

edited

Loading

matu3ba commented Jul 22, 2023 •

edited

Loading

matu3ba commented Oct 11, 2023 •

edited

Loading