Consolidate explanation of modules into a new Modules.md and improve explanation #270
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Modules | ||
|
||
The distributable, loadable, and executable unit of code in WebAssembly | ||
is called a **module**. A module contains: | ||
* a set of [imports and exports](Modules.md#imports-and-exports); | ||
* a section defining the [initial state of linear memory](Modules.md#initial-state-of-linear-memory); | ||
* a section containing [code](Modules.md#code-section); | ||
* after the MVP, sections containing [debugging/symbol information](Tooling.md) or | ||
a reference to separate files containing them; and | ||
* possibly other sections in the future. | ||
Sections declare their type and byte-length. Sections with unknown types are | ||
silently ignored. | ||
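
The skipping rule above can be sketched as follows. This is only an illustration: the byte layout (a 1-byte type id followed by a 4-byte little-endian length) and the type ids in `KNOWN_SECTIONS` are assumptions made for the example, not something this document defines.

```python
import struct

# Hypothetical section type ids; this document does not assign any numbers.
KNOWN_SECTIONS = {1: "imports_exports", 2: "memory", 3: "code"}


def make_section(type_id: int, payload: bytes) -> bytes:
    """Encode one section using the assumed layout: type id, length, payload."""
    return struct.pack("<BI", type_id, len(payload)) + payload


def read_sections(data: bytes) -> dict:
    """Walk the sections in `data`, keeping known ones and skipping the rest."""
    sections = {}
    offset = 0
    while offset < len(data):
        if offset + 5 > len(data):
            raise ValueError("truncated section header")
        type_id, length = struct.unpack_from("<BI", data, offset)
        offset += 5
        if offset + length > len(data):
            raise ValueError("section payload runs past the end of the module")
        payload = data[offset:offset + length]
        offset += length
        name = KNOWN_SECTIONS.get(type_id)
        if name is not None:
            sections[name] = payload
        # Unknown type: silently ignored. The declared byte-length is what
        # makes it possible to step over a section we cannot interpret.
    return sections
```

Because every section declares its byte-length, a decoder written this way keeps working when a later version adds section types it has never seen.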
|
||
While WebAssembly modules are designed to interoperate with ES6 modules | ||
in a Web environment (more details [below](Modules.md#integration-with-es6-modules)), | ||
WebAssembly modules are defined independently of JavaScript and do not require | ||
the host environment to include a JavaScript VM. | ||
|
||
## Imports and Exports | ||
|
||
A module defines a set of functions in its | ||
[code section](Modules.md#code-section) and can declare and name a subset of | ||
these functions to be **exports**. The meaning of exports (how and when they are | ||
called) is defined by the host environment. For example, a minimal shell | ||
environment might only probe for and call a `_start` export when given a module | ||
to execute. | ||
|
||
A module can declare a set of **imports**. An import is a tuple containing a | ||
module name, the name of an exported function to import from the named module, | ||
and the signature to use for that import within the importing module. Within a | ||
module, the import can be [directly called](AstSemantics.md#calls) like a | ||
function (according to the signature of the import). When the imported | ||
module is also WebAssembly, it would be an error if the signature of the import | ||
doesn't match the signature of the export. | ||
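
As a rough sketch of the import tuple and the signature check described above, assuming a toy representation of signatures (the `Signature` and `Import` types, the `check_import` helper and type names such as `"i32"` are all hypothetical):

```python
from typing import NamedTuple, Optional, Tuple


class Signature(NamedTuple):
    params: Tuple[str, ...]
    result: Optional[str]


class Import(NamedTuple):
    module_name: str       # which module to import from
    function_name: str     # which export of that module
    signature: Signature   # signature used inside the importing module


def check_import(imp: Import, exports: dict) -> None:
    """Raise if `imp` does not match an export of the module it names.

    `exports` maps export names to Signatures for the module `imp.module_name`.
    """
    if imp.function_name not in exports:
        raise LookupError(
            f"{imp.module_name} has no export named {imp.function_name}")
    if exports[imp.function_name] != imp.signature:
        raise TypeError(
            f"signature mismatch for {imp.module_name}.{imp.function_name}")
```

This check only applies when the imported module is itself WebAssembly; for host-provided modules the meaning of a call is up to the host.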
|
||
The WebAssembly spec does not define how imports are interpreted: | ||
* the host environment can interpret the module name as a file path, a URL, | ||
a key in a fixed set of builtin modules or the host environment may invoke a | ||
user-defined hook to resolve the module name to one of these; | ||
* the module name does not need to resolve to a WebAssembly module; it | ||
could resolve to a builtin module (implemented by the host environment) or a | ||
module written in another, compatible language; and | ||
* the meaning of calling an imported function is host-defined. | ||
|
||
The open-ended nature of module imports allows them to be used to expose | ||
arbitrary host environment functionality to WebAssembly code, similar to a | ||
native `syscall`. For example, a shell environment could define a builtin | ||
`stdio` module with an export `puts`. | ||
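
A minimal sketch of such a shell environment, assuming the host keeps builtin modules in a plain dictionary (the `Host` class, `make_stdio_module` and the return convention of `puts` are illustrative assumptions):

```python
import io


def make_stdio_module(out) -> dict:
    """Build a host-implemented `stdio` module with a single export, `puts`."""
    def puts(text: str) -> int:
        out.write(text + "\n")
        return 0  # assumed convention: 0 means success
    return {"puts": puts}


class Host:
    """A toy host that resolves module names against a fixed set of builtins."""

    def __init__(self, builtins: dict):
        self.builtins = builtins

    def resolve(self, module_name: str, function_name: str):
        module = self.builtins.get(module_name)
        if module is None:
            raise LookupError(f"no module named {module_name}")
        if function_name not in module:
            raise LookupError(f"{module_name} has no export {function_name}")
        return module[function_name]
```

A different host could resolve the same module name to a file path or URL instead; the importing module would not need to change.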
|
||
In C/C++, an undefined `extern` declaration (perhaps only when given the | ||
magic `__attribute__` or declared in a separate list of imports) could be | ||
compiled to an import and C/C++ calls to this `extern` would then be compiled | ||
to calls to this import. This is one way low-level C/C++ libraries could call | ||
out of WebAssembly in order to implement portable source-level interfaces | ||
(e.g., POSIX, OpenGL or SDL) in terms of host-specific functionality. | ||
|
||
### Integration with ES6 modules | ||
|
||
While ES6 defines how to parse, link and execute a module, ES6 does not | ||
define when this parsing/linking/execution occurs. An additional extension | ||
to the HTML spec is required to say when a script is parsed as a module instead | ||
of normal global code. This work is [ongoing](http://TODO:link-to-loader-level-0-repo). | ||
Currently, the following entry points for modules are being considered: | ||
* `<script type="module">`; | ||
* an overload to the `Worker` constructor; and | ||
* an overload to the `importScripts` Worker API. | ||
Are these what wasm would want? We kind of discussed this in #84, it would be good to update that issue and close it if we have a believable solution.

As I mention below, using an ES module loading mechanism to load wasm is something I don't see fitting into a level 0 Loader spec, so I think it still might be prudent for wasm to offer some more explicit hooks for loading from HTML (such as those discussed in #84). That really depends on the timelines for various parts of this: loading wasm code, importing ES from wasm, importing wasm from ES.

@jfbastien Ideally yes, in my mind. That would mean, e.g., it should be possible to swap out a JS module with a wasm module (or vice versa) by simply changing the contents of the file. If we agree on this, happy to comment on #84. @ajklein If we know what we want in "level 1", then it doesn't seem great to add a separate way to load a wasm module (unless we also wanted that separate mechanism to load ES6). My expectation here is that we can do "level 0" first (purely in terms of ES6 modules), get that shipped, and follow on fast with wasm.
||
|
||
Additionally, an ES6 module can recursively import other modules via `import` | ||
statements. | ||
|
||
For WebAssembly/ES6 module integration, the idea is that all the above module | ||
entry points could also load WebAssembly modules simply by passing the URL of a | ||
WebAssembly module. The distinction of whether the module was WebAssembly or ES6 | ||
code could be made by namespacing or by content sniffing the first bytes of the | ||
fetched resource (which, for WebAssembly, would be a non-ASCII—and thus | ||
illegal as JavaScript—[magic number](https://en.wikipedia.org/wiki/Magic_number_%28programming%29)). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's beautiful and hacky at the same time :-)
:-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As previously discussed offline, this needs some work on the ES Loader side to ensure that it can be specified in terms of whatever hooks end up in Loader "level 1", rather than requiring a dependency from the ES Loader on WebAssembly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IIUC, this is a spec-factoring problem, not a fundamental implementation problem, is that right? That is, if these were all the same spec, I don't think there would be an issue. Assuming that's right, then I expect we could define some host hooks where the ES6 spec says "if the first byte isn't ASCII, then an error is raised unless the host environment has a separate way to decode this file". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The point is that none of these are the same spec, and we want to keep it that way (so saying it's just a spec factoring problem seems overly dismissive). I'd like to make sure that ES proper need know nothing about wasm, and I'd also hope that the ES Loader wouldn't need to know about it either: whatever sits right above the ES Loader should be able to add the right hooks to the loader (conceptually, anyway). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't mean to be dismissive but if we can agree on an intended behavior, then it seems like it's then becomes a matter of spec engineering to decide how to cut up the spec so that we don't have unintended dependencies. 
E.g., there is definitely going to be an HTML5 portion of the whole specified loader pipeline so it seems like that is where we could mention both JS and wasm. |
||
Thus, the whole module-loading pipeline (resolving the name to a URL, fetching | ||
the URL, any other [loader hooks](http://whatwg.github.io/loader/)) would be | ||
shared and only the final stage would fork into either the JavaScript parser or | ||
the WebAssembly decoder. | ||
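
The shared pipeline with a final fork could look roughly like the sketch below. The magic number is a placeholder (the text only says it would be non-ASCII), and `fetch`, `decode_wasm` and `parse_es6` are hypothetical callables standing in for the real loader stages.

```python
# Placeholder only: no magic number is defined here, just that it would
# begin with a non-ASCII byte.
HYPOTHETICAL_MAGIC = b"\xffwasm"


def classify_module(resource: bytes) -> str:
    """Sniff the first bytes of a fetched resource to pick a decoder."""
    if resource.startswith(HYPOTHETICAL_MAGIC):
        return "wasm"
    return "es6"


def load_module(url: str, fetch, decode_wasm, parse_es6):
    """Run the shared stages, then fork into one of the two front ends."""
    resource = fetch(url)  # name resolution and fetching are shared
    if classify_module(resource) == "wasm":
        return decode_wasm(resource)
    return parse_es6(resource.decode("utf-8"))
```

Only `classify_module` knows about the two formats, so everything before the fork stays the same for either kind of module.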
|
||
Any non-builtin imports from within a WebAssembly module would be treated as | ||
if they were `import` statements of an ES6 module. If an ES6 module `import`ed | ||
a WebAssembly module, the WebAssembly module's exports would be linked as if | ||
they were the exports of an ES6 module. Once parsing and linking phases | ||
were complete, a WebAssembly module would have its `_start` function called in | ||
place of executing the ES6 module top-level script. By default, multiple | ||
loads of the same module URL (in the same realm) reuse the same singleton | ||
module instance. It may be worthwhile in the future to consider extensions to | ||
allow applications to load/compile/link a module once and instantiate multiple | ||
times (each with a separate heap and global state). | ||
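
The default singleton behavior amounts to a cache keyed by realm and URL. A minimal sketch, assuming a hypothetical `instantiate` callable that builds a fresh instance for a URL:

```python
class ModuleRegistry:
    """Caches one module instance per (realm, URL) pair."""

    def __init__(self, instantiate):
        self._instantiate = instantiate  # hypothetical: url -> new instance
        self._instances = {}

    def load(self, realm: str, url: str):
        key = (realm, url)
        if key not in self._instances:
            # First load in this realm: instantiate and remember the result.
            self._instances[key] = self._instantiate(url)
        return self._instances[key]
```

The possible future extension mentioned above would correspond to calling `instantiate` directly several times for one compiled module, bypassing this cache.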
|
||
This integration strategy should allow WebAssembly modules to be fairly | ||
interchangeable with ES6 modules (ignoring | ||
[GC/Web API](FutureFeatures.md#gc/dom-integration) signature restrictions of the | ||
WebAssembly MVP) and thus it should be natural to compose a single application | ||
from both kinds of code. This goal motivates the | ||
[semantic design](AstSemantics.md#linear-memory) of giving each WebAssembly | ||
module its own disjoint linear memory. Otherwise, if all modules shared a single | ||
linear memory (all modules with the same realm? origin? window?—even the | ||
scope of "all" is a nuanced question), a single app using multiple | ||
independent libraries would have to hope that all the WebAssembly modules | ||
transitively used by those libraries "played well" together (e.g., explicitly | ||
shared `malloc` and coordinated global address ranges). Instead, the | ||
[dynamic linking future feature](FutureFeatures.md#dynamic-linking) is intended | ||
to allow *explicitly* sharing linear memory between multiple modules. | ||
|
||
## Initial state of linear memory | ||
|
||
A module will contain a section declaring the linear memory size (initial and | ||
maximum size allowed by `sbrk`) and the initial contents of memory (analogous | ||
to `.data`, `.rodata`, `.bss` sections in native executables). | ||
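
As an illustration, the initial state could be materialized as below. Representing the initial contents as `(offset, bytes)` segments is an assumption made for this sketch; the text only says the section declares sizes and initial contents.

```python
def build_initial_memory(initial_size: int, max_size: int, segments) -> bytearray:
    """Create a module's linear memory from its declared initial state.

    `segments` is an iterable of (offset, data) pairs, similar in spirit to
    .data/.rodata; bytes not covered by a segment stay zero, like .bss.
    """
    if initial_size < 0 or initial_size > max_size:
        raise ValueError("initial size must be between 0 and the maximum size")
    memory = bytearray(initial_size)  # zero-filled
    for offset, data in segments:
        if offset < 0 or offset + len(data) > initial_size:
            raise ValueError("segment does not fit in the initial memory")
        memory[offset:offset + len(data)] = data
    return memory
```

The maximum size is only validated here; growing the memory up to that limit (the `sbrk` case) is outside this sketch.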
|
||
## Code section | ||
|
||
The WebAssembly spec defines the code section of a module in terms of an | ||
[Abstract Syntax Tree](AstSemantics.md) (AST). Additionally, the spec defines | ||
two concrete representations of the AST: a [binary format](BinaryEncoding.md) | ||
which is natively decoded by the browser and a [text format](TextFormat.md) | ||
which is intended to be read and written by humans. A WebAssembly environment | ||
is only required to understand the binary format; the text format is defined so | ||
that WebAssembly modules can be written by hand (and then converted to binary | ||
with an offline tool) and so that developer tools have a well-defined text | ||
projection of a binary WebAssembly module. This design separates the concerns | ||
of specifying and reasoning about behavior, over-the-wire size and compilation | ||
speed, and ergonomic syntax. | ||
|
I think we should keep this note for now.
I'm happy to. I was just thinking that this was originally added when V1.md had a bunch of text, so now it seems a bit ad hoc and asymmetric since most files don't have it. The important thing is that Readme.md has it in bold. Still want to keep?