WebAssembly · titzer · Sep 28, 2016 · Sep 22, 2016
diff --git a/FutureFeatures.md b/FutureFeatures.md
@@ -315,7 +315,7 @@ operators the possibility of having side effects.
 Debugging techniques are also important, but they don't necessarily need to be
 in the spec itself. Implementations are welcome (and encouraged) to support
 non-standard execution modes, enabled only from developer tools, such as modes
-with alternate rounding, or evaluation of floating point expressions at greater
+with alternate rounding, or evaluation of floating point operators at greater
 precision, to support [techniques for detecting numerical instability]
 (https://www.cs.berkeley.edu/~wkahan/Mindless.pdf), or modes using alternate
 NaN bitpattern rules, to carry diagnostic information and help developers track
@@ -370,8 +370,8 @@ general-purpose use on several of today's popular hardware architectures.
 ## Better feature testing support
 
 The [MVP feature testing situation](FeatureTest.md) could be improved by
-allowing unknown/unsupported AST operators to decode and validate. The runtime
-semantics of these unknown operators could either be to trap or call a
+allowing unknown/unsupported instructions to decode and validate. The runtime
+semantics of these unknown instructions could either be to trap or call a
 same-signature module-defined polyfill function. This feature could provide a
 lighter-weight alternative to load-time polyfilling (approach 2 in
 [FeatureTest.md](FeatureTest.md)), especially if the [specific layer](BinaryEncoding.md)
@@ -442,7 +442,7 @@ see [JavaScript's `WebAssembly.Table` API](JS.md#webassemblytable-objects)).
 It would be useful to be able to do everything from within WebAssembly so, e.g.,
 it was possible to write a WebAssembly dynamic loader in WebAssembly. As a
 prerequisite, WebAssembly would need first-class support for 
-[GC references](GC.md) in expressions and locals. Given that, the following
+[GC references](GC.md) on the stack and in locals. Given that, the following
 could be added:
 * `get_table`/`set_table`: get or set the table element at a given dynamic
   index; the got/set value would have a GC reference type

diff --git a/JS.md b/JS.md
@@ -66,7 +66,7 @@ asynchronous, background, streaming compilation.
 A `WebAssembly.Module` object represents the stateless result of compiling a
 WebAssembly binary-format module and contains one internal slot:
  * [[Module]] : an [`Ast.module`](https://github.com/WebAssembly/spec/blob/master/ml-proto/spec/ast.ml#L208)
-   which is the spec definition of a validated module AST
+   which is the spec definition of a validated module
 
 ### `WebAssembly.Module` Constructor
 
@@ -82,8 +82,8 @@ If the given `bytes` argument is not a
 a `TypeError` exception is thrown.
 
 Otherwise, this function performs synchronous compilation of the `BufferSource`:
-* The byte range delimited by the `BufferSource` is first logically decoded into
-  an AST according to [BinaryEncoding.md](BinaryEncoding.md) and then validated
+* The byte range delimited by the `BufferSource` is first logically decoded 
+  according to [BinaryEncoding.md](BinaryEncoding.md) and then validated
   according to the rules in [spec/check.ml](https://github.com/WebAssembly/spec/blob/master/ml-proto/spec/check.ml#L325).
 * The spec `string` values inside `Ast.module` are decoded as UTF8 as described in 
   [Web.md](Web.md#names).

diff --git a/MVP.md b/MVP.md
@@ -12,14 +12,14 @@ The major design components of the MVP have been broken up into separate
 documents:
 * The distributable, loadable and executable unit of code in WebAssembly
   is called a [module](Modules.md).
-* The behavior of WebAssembly code in a module is specified in terms of an
-  [AST](AstSemantics.md).
+* The behavior of WebAssembly code in a module is specified in terms of 
+  [instructions](AstSemantics.md) for a structured stack machine.
 * The WebAssembly binary format, which is designed to be natively decoded by 
   WebAssembly implementations, is specified as a 
-  [binary serialization](BinaryEncoding.md) of a module's AST.
+  [binary encoding](BinaryEncoding.md) of a module's structure and code.
 * The WebAssembly text format, which is designed to be read and written when
   using tools (e.g., assemblers, debuggers, profilers), is specified as a
-  [textual projection](TextFormat.md) of a module's AST.
+  [textual projection](TextFormat.md) of a module's structure and code.
 * WebAssembly is designed to be implemented both [by web browsers](Web.md)
   and [completely different execution environments](NonWeb.md).
 * To ease the transition to WebAssembly while native support is still

diff --git a/Rationale.md b/Rationale.md
@@ -13,10 +13,18 @@ codebases, we'll revisit the alternatives listed below, reevaluate the tradeoffs
 and update the [design](AstSemantics.md) before the MVP is finalized.
 
 
-## Why AST?
-
-Why not a register- or SSA-based bytecode?
-* Trees allow a smaller binary encoding: [JSZap][], [Slim Binaries][].
+## Why a stack machine?
+
+Why not an AST, or a register- or SSA-based bytecode?
+
+* We started with an AST and generalized to a [structured stack machine](AstSemantics.md). ASTs allow a
+  dense encoding and efficient decoding, compilation, and interpretation.
+  The structured stack machine of WebAssembly generalizes the ASTs allowed in previous versions, and it adds
+  efficiency gains in interpretation and baseline compilation, as well as a straightforward
+  design for multi-return functions.
+* The stack machine allows a smaller binary encoding than registers or SSA (see [JSZap][], [Slim Binaries][]),
+  and structured control flow allows simpler and more efficient verification, including decoding directly
+  to a compiler's internal SSA form.
 * [Polyfill prototype][] shows simple and efficient translation to asm.js.
 
   [JSZap]: https://research.microsoft.com/en-us/projects/jszap/
@@ -26,15 +34,12 @@ Why not a register- or SSA-based bytecode?
 
 ## Why not a fully-general stack machine?
 
-Stack machines have all the code size advantages as expression trees represented
-in post-order. However, we wish to avoid requiring an explicit expression stack at
-runtime, because many implementations will want to use registers rather than an
-actual stack for evaluation. Consequently, while it's possible to think about
-wasm expression evaluation in terms of a conceptual stack machine, the stack
-machine would be constrained such that one can always statically know the types,
-definitions, and uses of all operands on the stack, so that an implementation can
-connect definitions with their uses through whatever mechanism they see fit.
-
+The WebAssembly stack machine is restricted to structured control flow and structured
+use of the stack. This greatly simplifies one-pass verification, avoiding a fixpoint computation
+like that of other stack machines such as the Java Virtual Machine (prior to [stack maps](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html)).
+This also simplifies compilation and manipulation of WebAssembly code by other tools.
+Further generalization of the WebAssembly stack machine is planned post-MVP, such as the
+addition of multiple return values from control flow constructs and function calls.
 
 ## Basic Types Only
 
@@ -44,7 +49,7 @@ WebAssembly only represents [a few types](AstSemantics.md#Types).
   language compiler to express its own types in terms of the basic machine
   types. This allows WebAssembly to present itself as a virtual ISA, and lets
   compilers target it as they would any other ISA.
-* These types are efficiently executed by all modern CPU architectures.
+* These types are directly representable on all modern CPU architectures.
 * Smaller types (such as `i8` and `i16`) are usually no more efficient and in
   languages like C/C++ are only semantically meaningful for memory accesses
   since arithmetic get widened to `i32` or `i64`. Avoiding them at least for MVP
@@ -177,7 +182,7 @@ See [#107](https://github.com/WebAssembly/spec/pull/107).
 ## Control Flow
 
 Structured control flow provides simple and size-efficient binary encoding and
-compilation. Any control flow—even irreducible—can be transformed into structured
+compilation. Any control flow--even irreducible--can be transformed into structured
 control flow with the
 [Relooper](https://github.com/kripken/emscripten/raw/master/docs/paper.pdf)
 [algorithm](http://dl.acm.org/citation.cfm?id=2048224&CFID=670868333&CFTOKEN=46181900),
@@ -280,17 +285,18 @@ segregating the table per signature to require only a bounds check could be cons
 in the future. Also, if tables are small enough, an engine can internally use per-signature
 tables filled with failure handlers to avoid one check.
 
-## Expressions with Control Flow
+## Control Flow Instructions with Values
 
-Expression trees offer significant size reduction by avoiding the need for
-`set_local`/`get_local` pairs in the common case of an expression with only one
-immediate use. Control flow "statements" are in fact expressions with result
-values, thus allowing even more opportunities to build bigger
-expression trees and further reduce `set_local`/`get_local` usage (which
-constitute 30-40% of total bytes in the
+Control flow instructions such as `br`, `br_if`, `br_table`, `if` and `if-else` can
+transfer stack values in WebAssembly. These primitives are useful building blocks for
+WebAssembly producers, e.g. in compiling expression languages. They offer significant
+size reduction by avoiding the need for `set_local`/`get_local` pairs in the common case
+of an expression with only one immediate use. Control flow instructions can then model
+expressions with result values, thus allowing even more opportunities to further reduce
+`set_local`/`get_local` usage (which constitutes 30-40% of total bytes in the
 [polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1)).
-Additionally, these primitives are useful building blocks for
-WebAssembly-generators (including the JavaScript polyfill prototype).
+`br`-with-value and `if` constructs that return values can also model the `phi` nodes that
+appear in SSA representations of programs.
 
 
 ## Limited Local Nondeterminism
@@ -324,12 +330,11 @@ and local manner. This prevents the entire program from being invalid, as would
 be the case with C++ undefined behavior.
 
 As WebAssembly gets implemented and tested with multiple languages on multiple
-architectures there may be a need to revisit some of the decisions:
+architectures we may revisit some of the design decisions:
 
-* When all relevant hardware implement features the same way then there's no
-  need to add nondeterminism to WebAssembly when realistically there's only one
-  mapping from WebAssembly expression to ISA-specific operators. One such
-  example is floating-point: at a high-level most basic instructions follow
+* When all relevant hardware implements an operation the same way, there's no
+  need for nondeterminism in WebAssembly semantics. One such
+  example is floating-point: at a high level, most operators follow
   IEEE-754 semantics, it is therefore not necessary to specify WebAssembly's
   floating-point operators differently from IEEE-754.
 * When different languages have different expectations then it's unfortunate if
@@ -470,20 +475,20 @@ Yes:
     [this demo](https://github.com/lukewagner/AngryBotsPacked), comparing
     *just* parsing in SpiderMonkey (no validation, IR generation) to *just*
     decoding in the polyfill (no asm.js code generation).
-* A binary format enables optimizations that reduce the memory usage of decoded
-  ASTs without increasing size or reducing decode speed.
+* A binary format allows many optimizations for code size and decoding speed that would
+  not be possible with a source format.
 
 
 ## Why a layered binary encoding?
-* We can do better than generic compression because we are aware of the AST
+* We can do better than generic compression because we are aware of the code
   structure and other details:
   * For example, macro compression that
     [deduplicates AST trees](https://github.com/WebAssembly/design/issues/58#issuecomment-101863032)
-    can focus on AST nodes + their children, thus having `O(nodes)` entities
+    can focus on ASTs + their children, thus having `O(nodes)` entities
     to worry about, compared to generic compression which in principle would
     need to look at `O(bytes*bytes)` entities.  Such macros would allow the
     logical equivalent of `#define ADD1(x) (x+1)`, i.e., to be
-    parametrized. Simpler macros (`#define ADDX1 (x+1)`) can implement useful
+    parameterized. Simpler macros (`#define ADDX1 (x+1)`) can implement useful
     features like constant pools.
   * Another example is reordering of functions and some internal nodes, which
     we know does not change semantics, but