WebAssembly · lukewagner · Aug 3, 2015 · Jul 17, 2015 · Jul 18, 2015 · Jul 18, 2015
diff --git a/FAQ.md b/FAQ.md
@@ -90,7 +90,7 @@ even before there is any native support.
 
 As explained in the [high-level goals](HighLevelGoals.md), to achieve a Minimum Viable Product, the
 initial focus is on [C/C++](CAndC++.md).
-However, by [integrating with JS at the ES6 Module interface](MVP.md#modules),
+However, by [integrating with JS at the ES6 Module interface](Modules.md#integration-with-es6-modules),
 web developers don't need to write C++ to take advantage of libraries that others have written; 
 reusing a modular C++ library can be as simple as [using a module from JS](http://jsmodules.io).
 

diff --git a/MVP.md b/MVP.md
@@ -11,84 +11,15 @@ even on mobile devices, which leads to roughly the same functionality as
 This document explains the contents of the MVP at a high-level. There are also
 separate docs with more precise descriptions of:
 
+ * [Modules](Modules.md);
  * [Polyfill to JavaScript](Polyfill.md);
  * [AST semantics](AstSemantics.md);
  * [Binary encoding](BinaryEncoding.md);
+ * [Text format](TextFormat.md);
  * Implementation [in the browser](Web.md) and [outside the browser](NonWeb.md).
 
 **Note**: This content is still in flux and open for discussion.
 
-## Modules
-
-* The primary unit of loadable, executable code is a **module**.
-* A module can declare a subset of its functions and global variables to be
-  **exports**. The meaning of exports (how and when they are called) is defined
-  by the host environment. For example, `_start` and `init` can be the only
-  meaningful exports.
-* A module can declare a set of **imports**. An import is a tuple containing a
-  module name, export name, and the type to use for the import within the
-  module. The host environment controls the mapping from module name to which
-  module is loaded.
-* The spec defines the semantics of loading and calling exports of a *single*
-  module. The meaning of a call to an import is defined by the host environment.
-  * In a minimal shell environment, imports could be limited to builtin modules
-    (implemented by the shell) and/or shell scripts.
-  * The [dynamic linking](FutureFeatures.md#dynamic-linking) post-MVP feature
-    would extend the semantics to include multiple modules and thus allow sharing 
-linear memory and pointers. Dynamic linking would be semantically distinct from
-    importing, however.
-* When compiling from C++, imports would be generated for unresolved `extern`
-  functions and calls to those `extern` functions would call the import.
-* Host environments can define builtin modules that are implemented natively but
-  can otherwise be imported like [other modules](MVP.md#modules). As examples:
-  * A WebAssembly shell might define a builtin `stdio` library with an export
-    `puts`.
-  * In the browser, the WebIDL support mentioned in
-    [future features](FutureFeatures.md).
-* Any [ABI](https://en.wikipedia.org/wiki/Application_binary_interface) for
-  statically linked libraries will be specific to your source language compiler.
-  In the future, [standard ABIs may be defined](FutureFeatures.md#dynamic-linking)
-  to allow for compatibility between compilers and versions of compilers.
-* **TODO**: there is more to discuss here concerning APIs.
-
-## Module structure
-
-* At the top level, a module is ELF-like: a sequence of sections which declare
-  their type and byte-length.
- * Sections with unknown types would be skipped without error. 
- * Standardized section types:
-  * module import section;
-  * globals section (constants, signatures, variables);
-  * code section;
-  * memory initialization section.
-
-## Code section
-
-* The code section begins with a table of functions containing the signatures
-   and offsets of each function followed by the list of function bodies. This
-   allows parallel and streaming decoding, validation and compilation.
- * A function body consists of a set of typed variable bindings and an AST
-   closed under these bindings.
-  * The [Abstract Syntax Tree](AstSemantics.md) is composed of two primary kinds
-    of nodes: statements and expressions.
- * [Control flow](AstSemantics.md#control-flow-structures) is structured (no
-   `goto`).
-
-## Binary format
-
-* A [binary format](BinaryEncoding.md) provides efficiency: it reduces download
-  size and accelerates decoding, thus enabling even very large codebases to have
-  quick startup times. Towards that goal, the binary format will be natively
-  decoded by browsers.
-* The binary format has an equivalent and isomorphic
-  [text format](MVP.md#text-format).  Conversion from one format to the other is
-  both straightforward and causes no loss of information in either direction.
-
-## Text format
-
-The [text format](TextFormat.md) provides readability to developers, and is
-isomorphic to the [binary format](BinaryEncoding.md).
-
 ## Linear Memory
 
 * In the MVP, when a WebAssembly module is loaded, it creates a new linear memory which
@@ -105,6 +36,21 @@ isomorphic to the [binary format](BinaryEncoding.md).
     detaches any existent `ArrayBuffer`.
 * See the [AST Semantics linear memory section](AstSemantics.md#linear-memory)
   for more details.
+
+## Binary format
+
+* A [binary format](BinaryEncoding.md) provides efficiency: it reduces download
+  size and accelerates decoding, thus enabling even very large codebases to have
+  quick startup times. Towards that goal, the binary format will be natively
+  decoded by browsers.
+* The binary format has an equivalent and isomorphic
+  [text format](MVP.md#text-format).  Conversion from one format to the other is
+  both straightforward and causes no loss of information in either direction.
+
+## Text format
+
+The [text format](TextFormat.md) provides readability to developers, and is
+isomorphic to the [binary format](BinaryEncoding.md).
 
 ## Security
 

diff --git a/Modules.md b/Modules.md
@@ -0,0 +1,127 @@
+# Modules
+
+The distributable, loadable, and executable unit of code in WebAssembly
+is called a **module**. A module contains:
+* a set of [imports and exports](Modules.md#imports-and-exports);
+* a section defining the [initial state of linear memory](Modules.md#initial-state-of-linear-memory);
+* a section containing [code](Modules.md#code-section);
+* after the MVP, sections containing [debugging/symbol information](Tooling.md) or
+  a reference to separate files containing them; and
+* possibly other sections in the future.
+Sections declare their type and byte-length. Sections with unknown types are
+silently ignored.
+
+While WebAssembly modules are designed to interoperate with ES6 modules
+in a Web environment (more details [below](Modules.md#integration-with-es6-modules)),
+WebAssembly modules are defined independently of JavaScript and do not require
+the host environment to include a JavaScript VM.
+
+## Imports and Exports
+
+A module defines a set of functions in its
+[code section](Modules.md#code-section) and can declare and name a subset of
+these functions to be **exports**. The meaning of exports (how and when they are
+called) is defined by the host environment. For example, a minimal shell
+environment might only probe for and call a `_start` export when given a module
+to execute.
+
+A module can declare a set of **imports**. An import is a tuple containing a
+module name, the name of an exported function to import from the named module,
+and the signature to use for that import within the importing module. Within a
+module, the import can be [directly called](AstSemantics.md#calls) like a
+function (according to the signature of the import). When the imported
+module is also WebAssembly, it would be an error if the signature of the import
+did not match the signature of the export.
+
+The WebAssembly spec does not define how imports are interpreted:
+* the host environment can interpret the module name as a file path, a URL,
+  a key in a fixed set of builtin modules, or the host environment may invoke a
+  user-defined hook to resolve the module name to one of these;
+* the module name does not need to resolve to a WebAssembly module; it
+  could resolve to a builtin module (implemented by the host environment) or a
+  module written in another, compatible language; and
+* the meaning of calling an imported function is host-defined.
+
+The open-ended nature of module imports allows them to be used to expose
+arbitrary host environment functionality to WebAssembly code, similar to a
+native `syscall`. For example, a shell environment could define a builtin
+`stdio` module with an export `puts`.
+
+In C/C++, an undefined `extern` declaration (perhaps only when given the
+magic `__attribute__` or declared in a separate list of imports) could be
+compiled to an import and C/C++ calls to this `extern` would then be compiled
+to calls to this import. This is one way low-level C/C++ libraries could call
+out of WebAssembly in order to implement portable source-level interfaces
+(e.g., POSIX, OpenGL or SDL) in terms of host-specific functionality.
+
+### Integration with ES6 modules
+
+While ES6 defines how to parse, link and execute a module, ES6 does not
+define when this parsing/linking/execution occurs. An additional extension
+to the HTML spec is required to say when a script is parsed as a module instead
+of normal global code. This work is [ongoing](http://TODO:link-to-loader-level-0-repo).
+Currently, the following entry points for modules are being considered:
+* `<script type="module">`;
+* an overload to the `Worker` constructor; and
+* an overload to the `importScripts` Worker API.
+
+Additionally, an ES6 module can recursively import other modules via `import`
+statements.
+
+For WebAssembly/ES6 module integration, the idea is that all the above module
+entry points could also load WebAssembly modules simply by passing the URL of a
+WebAssembly module. The distinction of whether the module was WebAssembly or ES6
+code could be made by namespacing or by content sniffing the first bytes of the
+fetched resource (which, for WebAssembly, would be a non-ASCII&mdash;and thus
+illegal as JavaScript&mdash;[magic number](https://en.wikipedia.org/wiki/Magic_number_%28programming%29)).
+Thus, the whole module-loading pipeline (resolving the name to a URL, fetching
+the URL, any other [loader hooks](http://whatwg.github.io/loader/)) would be
+shared and only the final stage would fork into either the JavaScript parser or
+the WebAssembly decoder.
+
+Any non-builtin imports from within a WebAssembly module would be treated as
+if they were `import` statements of an ES6 module. If an ES6 module `import`ed
+a WebAssembly module, the WebAssembly module's exports would be linked as if
+they were the exports of an ES6 module. Once parsing and linking phases
+were complete, a WebAssembly module would have its `_start` function called in
+place of executing the ES6 module top-level script. By default, multiple 
+loads of the same module URL (in the same realm) reuse the same singleton
+module instance. It may be worthwhile in the future to consider extensions to
+allow applications to load/compile/link a module once and instantiate multiple
+times (each with a separate heap and global state).
+
+This integration strategy should allow WebAssembly modules to be fairly
+interchangeable with ES6 modules (ignoring 
+[GC/Web API](FutureFeatures.md#gc/dom-integration) signature restrictions of the
+WebAssembly MVP) and thus it should be natural to compose a single application
+from both kinds of code. This goal motivates the 
+[semantic design](AstSemantics.md#linear-memory) of giving each WebAssembly
+module its own disjoint linear memory. Otherwise, if all modules shared a single
+linear memory (all modules with the same realm? origin? window?&mdash;even the
+scope of "all" is a nuanced question), a single app using multiple
+independent libraries would have to hope that all the WebAssembly modules
+transitively used by those libraries "played well" together (e.g., explicitly
+shared `malloc` and coordinated global address ranges). Instead, the
+[dynamic linking future feature](FutureFeatures.md#dynamic-linking) is intended
+to allow *explicitly* sharing linear memory between multiple modules.
+
+## Initial state of linear memory
+
+A module will contain a section declaring the linear memory size (initial and
+maximum size allowed by `sbrk`) and the initial contents of memory (analogous
+to `.data`, `.rodata`, `.bss` sections in native executables).
+
+## Code section
+
+The WebAssembly spec defines the code section of a module in terms of an
+[Abstract Syntax Tree](AstSemantics.md) (AST). Additionally, the spec defines
+two concrete representations of the AST: a [binary format](BinaryEncoding.md)
+which is natively decoded by the browser and a [text format](TextFormat.md)
+which is intended to be read and written by humans. A WebAssembly environment
+is only required to understand the binary format; the text format is defined so
+that WebAssembly modules can be written by hand (and then converted to binary
+with an offline tool) and so that developer tools have a well-defined text
+projection of a binary WebAssembly module. This design separates the concerns
+of specifying and reasoning about behavior, over-the-wire size and compilation
+speed, and ergonomic syntax.
+
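Review note on `Modules.md`: the rule that "sections declare their type and byte-length" and that "sections with unknown types are silently ignored" can be illustrated with a small sketch. The byte layout (1-byte type, 4-byte little-endian length), the numeric type codes, and the names `KNOWN_SECTIONS`, `read_sections` and `make_section` are all assumptions made up for this example; the design text does not define an encoding.

```python
import struct
from typing import Dict, List, Tuple

# Hypothetical type codes: the design text names the section kinds but
# assigns no numeric values to them.
KNOWN_SECTIONS: Dict[int, str] = {1: "imports_exports", 2: "memory", 3: "code"}

HEADER = struct.Struct("<BI")  # assumed: 1-byte type, 4-byte LE byte-length


def make_section(section_type: int, payload: bytes) -> bytes:
    """Encode one section in the assumed (type, length, payload) layout."""
    return HEADER.pack(section_type, len(payload)) + payload


def read_sections(data: bytes) -> List[Tuple[str, bytes]]:
    """Return the known sections in order, skipping unknown ones."""
    sections: List[Tuple[str, bytes]] = []
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            raise ValueError("truncated section header")
        section_type, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        if offset + length > len(data):
            raise ValueError("section payload runs past end of module")
        payload = data[offset:offset + length]
        offset += length
        name = KNOWN_SECTIONS.get(section_type)
        if name is None:
            # Unknown type: the declared byte-length is enough to skip it.
            continue
        sections.append((name, payload))
    return sections
```

Because every section carries its own length, a decoder written against today's section types can step over sections added later without understanding them, which appears to be the forward-compatibility property the text is after.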
diff --git a/NonWeb.md b/NonWeb.md
@@ -20,7 +20,8 @@ JavaScript VM present.
 The WebAssembly spec will not try to define any large portable libc-like
 library. However, certain features that are core to WebAssembly semantics that
 are found in native libc *would* be part of the core WebAssembly spec as either
-primitive opcodes or a special builtin module (e.g., `sbrk`, `dlopen`).
+primitive opcodes or a function exported by a
+[builtin module](Modules.md#imports-and-exports) (e.g., `sbrk`, `dlopen`).
 
 Where there is overlap between the Web and popular non-Web environments,
 shared specs could be proposed, but these would be separate from the WebAssembly
@@ -32,8 +33,9 @@ However, for most cases it is expected that, to achieve portability at the
 source code level, communities would build libraries that mapped from a 
 source-level interface to the host environment's builtin capabilities
 (either at build time or runtime).  WebAssembly would provide the raw building
-blocks (feature testing, dynamic loading) to make these libraries possible.
-Two early expected examples are POSIX and SDL.
+blocks (feature testing, [builtin modules](Modules.md#imports-and-exports) and
+dynamic loading) to make these libraries possible. Two early expected examples
+are POSIX and SDL.
 
 In general, by keeping the non-Web path such that it doesn't require
 Web APIs, WebAssembly could be used as a portable binary format on many

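Review note on builtin modules: the import model referenced here (a tuple of module name, export name and signature, resolved by the host) might look roughly like the following. This is only a sketch under stated assumptions: `Host`, `Import`, `Export` and the string-based signature encoding are hypothetical names chosen for illustration, and the sketch checks signatures for every import, whereas the text only requires the check when the imported module is itself WebAssembly.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

# Assumed signature encoding: (parameter types, return type).
Signature = Tuple[Tuple[str, ...], str]


class Export(NamedTuple):
    signature: Signature
    func: Callable


class Import(NamedTuple):
    module_name: str
    export_name: str
    signature: Signature


class Host:
    """Hypothetical host that maps module names to sets of exports."""

    def __init__(self) -> None:
        self.modules: Dict[str, Dict[str, Export]] = {}

    def register(self, module_name: str, exports: Dict[str, Export]) -> None:
        self.modules[module_name] = exports

    def resolve(self, imp: Import) -> Callable:
        module = self.modules.get(imp.module_name)
        if module is None:
            raise LookupError(f"unknown module: {imp.module_name}")
        export = module.get(imp.export_name)
        if export is None:
            raise LookupError(f"no export named {imp.export_name}")
        if export.signature != imp.signature:
            raise TypeError("import signature does not match export signature")
        return export.func


# A shell-style builtin `stdio` module with a `puts` export, as in the text.
output: List[str] = []


def builtin_puts(text: str) -> int:
    output.append(text)
    return 0


PUTS_SIG: Signature = (("string",), "i32")  # assumed types, for illustration
host = Host()
host.register("stdio", {"puts": Export(PUTS_SIG, builtin_puts)})
```

A module importing `("stdio", "puts", PUTS_SIG)` would then get a callable back from `host.resolve(...)`; what that call actually does stays host-defined, which is the point of the design.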
diff --git a/Web.md b/Web.md
@@ -9,36 +9,22 @@ the Web's security model, preserving the Web's portability, and designing in
 room for evolutionary development. Many of these goals are clearly
 reflected in WebAssembly's [high-level goals](HighLevelGoals.md).
 
-# Implementation Details
-
-We've identified interesting implementation approaches which help convince us
-that the design, especially that of the [MVP](MVP.md), are sensible:
+More concretely, the following is a list of points of contact between WebAssembly
+and the rest of the Web platform that have been considered:
 
+* WebAssembly's [modules](Modules.md) allow for natural [integration with
+  the ES6 module system](Modules.md#integration-with-es6-modules) and allow
+  synchronous calls to and from JavaScript.
 * WebAssembly's security model should depend on [CORS][] and
   [subresource integrity][] to enable distribution, especially through content
   distribution networks and to implement
   [dynamic linking](FutureFeatures.md#dynamic-linking).
-* A [module](MVP.md#modules) can be loaded in the same way as an ES6 module
-  (`import` statements, `Reflect` API, `Worker` constructor, etc) and the result
-  is reflected to JS as an ES6 module object.
-  - Exports are the ES6 module object exports.
-  - An import first passes the module name to the [module loader pipeline][] and
-    resulting ES6 module (which could be implemented in JS or WebAssembly) is
-    queried for the export name.
-  - There is no special case for when one WebAssembly module imports another:
-    they have separate [memory](MVP.md#linear-memory) and pointers cannot be passed
-    between the two. Module imports encapsulate the importer and
-    importee. [Dynamic linking](FutureFeatures.md#dynamic-linking) should be
-    used to share memory and pointers across modules.
-  - To synchronously call into JavaScript from C++, the C++ code would declare
-    and call an undefined `extern` function and the target JavaScript function
-    would be given the (mangled) name of the `extern` and put inside the
-    imported ES6 module.
 * Once [threads are supported](PostMVP.md#threads), a WebAssembly module would
-  initially be distributed between workers via `postMessage()`.
+  be shared (including its heap) between workers via `postMessage()`.
   - This also has the effect of explicitly sharing code so that engines don't
     perform N fetches and compile N copies.
-  - May later standardize a more direct way to create a thread from WebAssembly.
+  - WebAssembly may later standardize a more direct way to create a thread that
+    doesn't involve creating a new Worker.
 * Once [SIMD is supported](PostMVP.md#fixed-width-simd), a Web implementation of
   WebAssembly would:
   - Be statically typed analogous to [SIMD.js-in-asm.js][];
@@ -47,6 +33,5 @@ that the design, especially that of the [MVP](MVP.md), are sensible:
 
   [CORS]: https://www.w3.org/TR/cors/
   [subresource integrity]: https://www.w3.org/TR/SRI/
-  [module loader pipeline]: https://whatwg.github.io/loader
   [SIMD.js-in-asm.js]: http://discourse.specifiction.org/t/request-for-comments-simd-js-in-asm-js