Add JS Memory and Table API, support dynamic linking #682

Merged
merged 22 commits into from
Jun 28, 2016

170 changes: 118 additions & 52 deletions AstSemantics.md
@@ -80,29 +80,37 @@ operators.

## Linear Memory

The main storage of a WebAssembly instance, called the *linear memory*, is a
contiguous, byte-addressable range of memory spanning from offset `0` and
extending up to a varying *memory size*.
This size always is a multiple of the WebAssembly page size,
which is 64KiB on all engines (though large page support may be added in the
[future](FutureFeatures.md#large-page-support)).
The initial state of linear memory is specified by the
[module](Modules.md#linear-memory-section), and it can be dynamically grown by
the [`grow_memory`](AstSemantics.md#resizing) operator.

The linear memory can be considered
to be an untyped array of bytes, and it is unspecified how embedders map this
array into their process' own [virtual memory][]. The linear memory is
sandboxed; it does not alias the execution engine's internal data structures,
the execution stack, local variables, or other process memory.
A *linear memory* is a contiguous, byte-addressable range of memory spanning
from offset `0` and extending up to a varying *memory size*. This size is always
Copy link
Member

@jfbastien jfbastien Jun 8, 2016

We can split this into a follow-up: why not have a user-defined mapping for memory, mapping the numbers to strings? i.e. a module would say "I have a memory called HEAP which I map to 1, and a memory called SECRETS which I map to 2". That way different toolchains don't have to agree on magical numbers but just on strings. Two dynamic objects can use different numbering schemes if they want, and things will align fine if they agree on string names.

And for now, we'd only have HEAP which maps to 1 (or default or whatever), anything else would be invalid (and we'd add support for later).

Copy link
Member Author

In a sense, the PR already does this (if we're talking about imports): the string is the imported name, the index is the immutable global it is stored in. (But there is this special "default" memory which avoids the need to give every load/store an immediate index which was the objection practically everyone had to the previous iteration of the PR.)

a multiple of the WebAssembly page size, which is fixed to 64KiB (though large
page support may be added in an opt-in manner in the
[future](FutureFeatures.md#large-page-support)). The initial state of a linear
memory is defined by the module's [linear memory](Modules.md#linear-memory-section) and
[data](Modules.md#data-section) sections. The memory size can be dynamically
increased by the [`grow_memory`](AstSemantics.md#resizing) operator.

A linear memory can be considered to be an untyped array of bytes, and it is
unspecified how embedders map this array into their process' own [virtual
memory][]. Linear memory is sandboxed; it does not alias other linear memories,
the execution engine's internal data structures, the execution stack, local
variables, or other process memory.

[virtual memory]: https://en.wikipedia.org/wiki/Virtual_memory
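
The description above can be illustrated with a minimal sketch of a linear memory as an untyped, bounds-checked byte array. This is not engine code: the class `LinearMemory`, its method names, the little-endian byte order of the helpers, and the use of a `false` return value to stand in for a trap are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a single linear memory; all names are illustrative.
constexpr std::size_t kPageSize = 64 * 1024;  // fixed 64KiB page size

class LinearMemory {
 public:
  // The memory size is always a whole number of pages, starting at offset 0.
  explicit LinearMemory(std::size_t initial_pages)
      : bytes_(initial_pages * kPageSize, 0) {}

  std::size_t size_in_bytes() const { return bytes_.size(); }
  std::size_t size_in_pages() const { return bytes_.size() / kPageSize; }

  // Bounds-checked 32-bit little-endian store. Returning false stands in
  // for a trap (an assumption of this sketch).
  bool store_i32(std::uint64_t addr, std::uint32_t value) {
    if (!in_bounds(addr, 4)) return false;
    for (int i = 0; i < 4; ++i)
      bytes_[addr + i] = static_cast<std::uint8_t>(value >> (8 * i));
    return true;
  }

  // Bounds-checked 32-bit little-endian load into `out`.
  bool load_i32(std::uint64_t addr, std::uint32_t& out) const {
    if (!in_bounds(addr, 4)) return false;
    out = 0;
    for (int i = 0; i < 4; ++i)
      out |= static_cast<std::uint32_t>(bytes_[addr + i]) << (8 * i);
    return true;
  }

 private:
  // True when [addr, addr + width) lies entirely inside the memory.
  bool in_bounds(std::uint64_t addr, std::uint64_t width) const {
    return addr <= bytes_.size() && bytes_.size() - addr >= width;
  }

  std::vector<std::uint8_t> bytes_;  // private storage; nothing else aliases it
};
```

Because the byte vector is private to the object, the sketch also mirrors the sandboxing property: no other memory, stack, or local variable can be reached through an offset.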

In the MVP, linear memory is not shared between threads of execution. Separate
instances can execute in separate threads but have their own linear memory and can
only communicate through messaging, e.g. in browsers using `postMessage`. It
will be possible to share linear memory between threads of execution when
[threads](PostMVP.md#threads) are added.
Every WebAssembly [instance](Modules.md) has one specially-designated *default*
linear memory which is the linear memory accessed by all the
[memory operators below](#linear-memory-access). In the MVP, there are *only*
default linear memories but [new memory operators](FutureFeatures.md#multiple-tables-and-memories)
may be added after the MVP which can also access non-default memories.

Copy link

If these non-default linear memories are not visible in the MVP design then perhaps just leave out mention of them.

Copy link
Member Author

They are "left out", insofar as there is no extra impl burden. But it's useful to explain the target state, though, since it motivates other aspects of the design.

Linear memories (default or otherwise) can either be [imported](Modules.md#imports)
or [defined inside the module](Modules.md#linear-memory-section), with defaultness
indicated by a flag on the import or definition. After import or definition,
there is no difference when accessing a linear memory whether it was imported or
defined internally.

Copy link

So linear memories can be shared between instances in the MVP? E.g., created in one instance and then imported as the default in another.

Let's generalize this to allow a sub-range of one linear memory to be the linear memory of another child instance. Let's add an API to allow the wasm code to create this child instance and to pass it a sub-range of the parent's linear memory. This would address many use cases, including protecting the parent from the child instance accessing its memory outside the range allocated.

Copy link
Member Author

I'd rather consider that in a separate issue; there are non-trivial performance and implementation implications.

Copy link

@ghost ghost May 30, 2016

How about considering it first as it meets many of the use cases for multiple linear memories. What are the 'non-trivial performance and implementation implications'? From what I understand these would have better performance than multiple linear memories because there is just one base per instance?

So if the performance of this approach has 'non-trivial performance and implementation implications' then multiple linear memories is even more of a burden, plus the burden of managing and supporting these multiple linear memories. Please, let's look at the performance and implementation implications? Let's look at the plan.

Copy link
Member Author

Importing a subrange means that you can't rely on any guard pages being at the end since you could well be looking into the middle of a bigger linear memory. You could imagine a separate kind of import that says "this is an import of a subrange of a bigger linear memory" that then impeded guard-page-based optimizations, but that'd be an additional kind of import to be considered separately.

Copy link

Yes, I guess so. Fwiw, guard pages are not necessary for the index masking strategy, just a spill area, so this strategy could be viable using index masking.

In the MVP, linear memory cannot be shared between threads of execution.
The addition of [threads](PostMVP.md#threads) will allow this.

### Linear Memory Accesses

@@ -143,6 +151,8 @@ size in which case integer wrapping is implied.
In addition to storing to memory, store instructions produce a value which is their
`value` input operand before wrapping.

The above operators operate on the [default linear memory](#linear-memory).

### Addressing

Each linear memory access operator has an address operand and an unsigned
@@ -210,7 +220,7 @@ the [future](FutureFeatures.md#large-page-support)).
* `grow_memory` : grow linear memory by a given unsigned delta of pages.
Return the previous memory size in units of pages or -1 on failure.

When a maximum memory size is declared in the [memory section](Module.md#linear-memory-section),
When a linear memory has a declared [maximum memory size](Modules.md#linear-memory-section),
`grow_memory` must fail if it would grow past the maximum. However,
`grow_memory` may still fail before the maximum if it was not possible to
reserve the space up front or if enabling the reserved memory fails.
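
As a rough sketch of the `grow_memory` behavior described above: the operator returns the previous size in pages, fails with -1 when the declared maximum would be exceeded, and may also fail earlier if the space cannot be obtained. The `Memory` struct, the function name, and the 65536-page cap (a 32-bit address space) are assumptions of this example, not part of the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <vector>

constexpr std::uint32_t kPageSize = 64 * 1024;  // 64KiB pages

// Hypothetical representation of a linear memory with an optional maximum.
struct Memory {
  std::vector<std::uint8_t> bytes;
  bool has_max;
  std::uint32_t max_pages;
};

// Returns the previous size in pages, or -1 on failure.
std::int32_t grow_memory(Memory& mem, std::uint32_t delta_pages) {
  const std::uint64_t old_pages = mem.bytes.size() / kPageSize;
  const std::uint64_t new_pages = old_pages + delta_pages;

  // Must fail if the declared maximum would be exceeded.
  if (mem.has_max && new_pages > mem.max_pages) return -1;
  // Assumption: a 32-bit address space caps the memory at 65536 pages.
  if (new_pages > 65536) return -1;

  // May still fail before the maximum if the host cannot provide the space.
  try {
    mem.bytes.resize(static_cast<std::size_t>(new_pages) * kPageSize, 0);
  } catch (const std::exception&) {
    return -1;
  }
  return static_cast<std::int32_t>(old_pages);
}
```

A delta of zero is a convenient way to query the current size, since the call succeeds and returns the unchanged page count.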
Copy link

Where would the linear memory maximum be declared for an imported memory? In the module in which it was created? Seems that they would need to be allowed to be created elsewhere for many use cases.

Copy link
Member Author

See Modules.md#imports which describes that import statements specify an initial and maximum which constrain the imported memory.

@@ -232,12 +242,53 @@ operator may be added. However, due to normal fragmentation, applications are
instead expected to release unused physical pages from the working set using the
[`discard`](FutureFeatures.md#finer-grained-control-over-memory) future feature.

The above operators operate on the [default linear memory](#linear-memory).

## Table

A *table* is similar to a linear memory whose elements, instead of being bytes,
are opaque values of a particular *table element type*. This allows the table to
contain values—like GC references, raw OS handles, or native pointers—that are
accessed by WebAssembly code indirectly through an integer index. This feature
Copy link
Member

Same as above: should these integer indices be user-mapped using strings?

Copy link
Member Author

Same answer as above :)

bridges the gap between low-level, untrusted linear memory and high-level
opaque handles/references at the cost of a bounds-checked table indirection.

Copy link

Sounds like a good description of the use case and rationale.

The table's element type constrains the type of elements stored
in the table and allows engines to avoid some type checks on table use.
When a WebAssembly value is stored in a table, the value's type must precisely
match the element type. Depending on the operator/API used to store the value,
this check may be static or dynamic. Just like linear memory, updates to a table
are observed immediately by all instances that reference the table. Host
environments may also allow storing non-WebAssembly values in tables in which
case, as with [imports](Modules.md#imports), the meaning of using the value is
defined by the host environment.

Copy link

Sounds good, but would the coercions need to be spelled out. For example are the arguments and results of imported functions coerced; are floats coerced to integers; strings to numbers; missing arguments to zero?

Copy link
Member Author

This is spelled out in call_table, which requires a precise signature match. I realized that I forgot to spell out what happens if a non-wasm (viz., JS) function is assigned to a table element, though (it's host-defined and, in the case of JS, always succeeds). I'll add something for that.

Every WebAssembly [instance](Modules.md) has one specially-designated *default*
table which is indexed by [`call_indirect`](#calls) and other future
table operators. Tables can either be [imported](Modules.md#imports) or
[defined inside the module](Modules.md#table-section), with defaultness
indicated by a flag on the import or definition. After import or definition,
there is no difference when calling into a table whether it was imported or
defined internally.

Copy link

Tables of functions seems a use case worthy of special attention, and would warrant allowing multiple function tables in the MVP or well before tables of other objects.

Could this note that the table element type would be defined in the module instance? If this were not static then it would not be of use for optimizing AOT-compiled code.

Copy link
Member Author

There are new sections in FutureFeatures.md on supporting multiple tables and multiple element kinds.

In the MVP, the primary purpose of tables is to implement indirect function
calls in C/C++ using an integer index as the pointer-to-function and the table
to hold the array of indirectly-callable functions. Thus, in the MVP:
* tables may only be accessed from WebAssembly code via [`call_indirect`](#calls);
* the only allowed table element type is `anyfunc` (function with any signature);
* tables may not be directly mutated or resized from WebAssembly code;
this can only be done through the host environment (e.g., the
`WebAssembly` [JavaScript API](JS.md#webassemblytable-objects)).

These restrictions may be relaxed in the
[future](FutureFeatures.md#more-table-operators-and-types).
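
To make the table model concrete, here is a small sketch of a table shared by two instances, mutated only from the host side and read through a bounds-checked index. `AnyFunc`, `Table`, `Instance`, `host_set`, and `add_one` are hypothetical names; using `nullptr` for an unset element or an out-of-range read is an assumption standing in for a trap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for an `anyfunc` element: opaque to WebAssembly code.
struct AnyFunc {
  int (*code)(int);  // illustrative payload only
};

// A table: like a linear memory, but holding opaque elements instead of bytes.
class Table {
 public:
  explicit Table(std::size_t length) : elems_(length, nullptr) {}

  // Host-side mutation (in the MVP, WebAssembly code cannot do this itself).
  bool host_set(std::uint32_t index, const AnyFunc* f) {
    if (index >= elems_.size()) return false;
    elems_[index] = f;
    return true;
  }

  // Bounds-checked read; nullptr stands in for a trap or an unset element.
  const AnyFunc* get(std::uint32_t index) const {
    return index < elems_.size() ? elems_[index] : nullptr;
  }

 private:
  std::vector<const AnyFunc*> elems_;
};

// Each instance references one default table, whether imported or defined.
struct Instance {
  std::shared_ptr<Table> default_table;
};

// Example callee used to populate the table.
int add_one(int x) { return x + 1; }
```

When two `Instance` values hold the same `Table`, a `host_set` made through one is visible through the other immediately, which is the sharing behavior the text describes.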

## Local variables

Each function has a fixed, pre-declared number of local variables which occupy a single
Each function has a fixed, pre-declared number of *local variables* which occupy a single
index space local to the function. Parameters are addressed as local variables. Local
variables do not have addresses and are not aliased by linear memory. Local
variables have value types and are initialized to the appropriate zero value for their
variables have [value types](#types) and are initialized to the appropriate zero value for their
type at the beginning of the function, except parameters which are initialized to the values
of the arguments passed to the function.

@@ -248,6 +299,32 @@ The details of index space for local variables and their types will be further c
e.g. whether locals with type `i32` and `i64` must be contiguous and separate from
others, etc.

## Global variables

A *global variable* stores a single value of a fixed [value type](#types) and may be
declared either *mutable* or *immutable*. This provides WebAssembly with memory
locations that are disjoint from any [linear memory](#linear-memory) and thus
cannot be arbitrarily aliased as bits.
Copy link
Member

Is the intent that when we support multiple memories those can also be loaded as immutable? That seems very useful for a bunch of usecases, and orthogonal to mprotect.

Copy link
Member Author

Correctamundo


Copy link

Great, see also #333 These might also address some code path selection based on features.

But if only immutable globals are needed then perhaps just implement these and not mutable globals?

Do we know if the compiler really needs un-aliased locations, or is the linear memory just fine?

Problem with mutable globals is that they need a pointer outside of the linear memory, increasing register pressure. The 32-bit x86 might be an exception where the offset can be baked in, but with code sharing even that would not be possible.

Copy link

For these global variables to be dynamically importable, or 'instantiation-time immutable values', it would appear that they can not be baked into the code in AOT compilation. If so then it is not clear what utility they have, and they would not appear to really be necessary?

I see some potential for the compiler to optimize code knowing that they will not change, similar to a const var, but that does not appear to be a core part of this PR? Also let variables have this property too, so if these were added then these might help address some of these optimization opportunities.

I recall mention of patchable values in prior discussion, and are these global constant variables intended to meet this use case? This would appear to require each instance to have a separate copy of the code, but that would somewhat defeat the purpose of sharing code.

For immutable global variables defined in the module, could they be baked into the code AOT?

There might be some significant performance burdens for shared code, extra dereferences. Does the wasm model allow the web developer to make the choice on these trade offs. Should a module have a flag to indicate that it is likely to be shared code, so that only in this case the runtime might avoid re-compiling or copying the code for each instance.

Copy link
Member Author

@lukewagner lukewagner May 30, 2016

For immutable globals, a compiler can either perform instantiation-time patching of immediates or use indirection through the global table. This has the same tradeoffs (dirtying pages vs. GOT-indirection) as with normal dynamic linking.

Mutable global variables need to be supported for engines that compile asm.js to wasm anyway, so it's no big deal to support in the MVP and might actually be useful for a tool (e.g., asm.js2wasm) that needs to acquire a fixed-size region of memory without perturbing linear memory. Importing/Exporting mutable global variables is an extra burden, though, and that is explicitly ruled out in the MVP if you look at JS.md.

Copy link

Since no use case for mutable global variables has been made I do not support them being added back here. This matter had already been considered, and they were removed. Implementations are welcome to support them internally for asm.js but this need not be part of wasm. No new performance numbers have been presented after the decision to remove the global variables that suggest they were needed?

Global variables are accessed via an integer index into the module-defined
[global index space](Modules.md#global-index-space). Global variables can
either be [imported](Modules.md#imports) or [defined inside the module](Modules.md#global-section).
After import or definition, there is no difference when accessing a global.
Copy link
Member

Global imports also specify mutability?

Copy link
Member Author

That's right; this is currently mentioned in Modules.md#imports for "global variable imports".


* `get_global`: get the current value of a global variable
* `set_global`: set the current value of a global variable

It is a validation error for a `set_global` to index an immutable global variable.
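
The validation rule can be sketched as a static pass over a function body that rejects any `set_global` targeting an immutable global. The types `GlobalDecl`, `Op`, and `Instr` and the function name are hypothetical; treating an out-of-range global index as invalid is an additional assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical declaration of one entry in the global index space.
struct GlobalDecl {
  bool is_mutable;
};

enum class Op { GetGlobal, SetGlobal };

// Hypothetical instruction: an operator plus its global index immediate.
struct Instr {
  Op op;
  std::uint32_t global_index;
};

// Static check performed before execution: no run-time cost per access.
bool validate_global_accesses(const std::vector<GlobalDecl>& globals,
                              const std::vector<Instr>& body) {
  for (const Instr& in : body) {
    // Assumption: an index outside the global index space is also invalid.
    if (in.global_index >= globals.size()) return false;
    // A set_global that indexes an immutable global is a validation error.
    if (in.op == Op::SetGlobal && !globals[in.global_index].is_mutable)
      return false;
  }
  return true;
}
```

Because the check is static, an engine that accepts a module knows immutable globals never change after instantiation, which is what lets them serve as link-time constants.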

In the MVP, the primary use case of global variables is to represent
instantiation-time immutable values as a useful building block for
[dynamic linking](DynamicLinking.md).

After the MVP, when [reference types](GC.md) are added to the set of [value types](#types),
global variables will be necessary to allow sharing reference types between
[threads](PostMVP.md#threads) since shared linear memory cannot load or store
references.

## Control flow structures

WebAssembly offers basic structured control flow with the following constructs.
@@ -314,43 +391,32 @@ explicit accesses to linear memory.
In the MVP, the length of the return types sequence may only be 0 or 1. This
restriction may be lifted in the future.

Direct calls to a function specify the callee by index into a *main function table*.
Direct calls to a function specify the callee by an index into the
[function index space](Modules.md#function-index-space).

* `call`: call function directly

A direct call to a function with a mismatched signature is a module verification error.

Like direct calls, calls to [imports](Modules.md#imports-and-exports) specify
the callee by index into an *imported function table* defined by the sequence of import
declarations in the module import section. A direct call to an imported function with a
mismatched signature is a module verification error.

* `call_import` : call imported function directly

Indirect calls allow calling target functions that are unknown at compile time.
The target function is an expression of value type `i32` and is always the first
input into the indirect call.

A `call_indirect` specifies the *expected* signature of the target function with
an index into a *signature table* defined by the module. An indirect call to a
function with a mismatched signature causes a trap.
Indirect calls to a function indicate the callee with an `i32` index into
a [table](#table). The *expected* signature of the target function (specified
by its index in the [types section](BinaryEncoding.md#type-section)) is given as
a second immediate.

* `call_indirect`: call function indirectly

Functions from the main function table are made addressable by defining an
*indirect function table* that consists of a sequence of indices into the
module's main function table. A function from the main table may appear more
than once in the indirect function table. Functions not appearing in the
indirect function table cannot be called indirectly.

In the MVP, indices into the indirect function table are local to a single
module, so wasm modules may use `i32` constants to refer to entries in their own
indirect function table. The [dynamic linking](DynamicLinking.md) feature is
necessary for two modules to pass function pointers back and forth. This will
mean concatenating indirect function tables and adding an operator `address_of`
that computes the absolute index into the concatenated table from an index in a
module's local indirect table. JITing may also mean appending more functions to
the end of the indirect function table.
Unlike `call`, which checks that the caller and callee signatures match
statically as part of validation, `call_indirect` checks for signature match
*dynamically*, comparing the caller's expected signature with the callee function's
signature and trapping if there is a mismatch. Since the callee may be in a
different module which necessarily has a separate [types section](BinaryEncoding.md#type-section),
and thus index space of types, the signature match must compare the underlying
[`func_type`](https://github.com/WebAssembly/spec/blob/master/ml-proto/spec/types.ml#L5).
As noted [above](#table), table elements may also be host-environment-defined
values in which case the meaning of a call (and how the signature is checked)
is defined by the host-environment, much like calling an import.
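
The dynamic check performed by `call_indirect` might look roughly like the following sketch, which compares signatures structurally rather than by type-section index, since caller and callee may come from different modules. All names here (`FuncType`, `Function`, `CallStatus`, `twice`) are hypothetical, the callee's code pointer has a fixed shape for brevity, and treating a null element as a trap is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ValType { I32, I64, F32, F64 };

// Structural function type: compared by contents, not by per-module index.
struct FuncType {
  std::vector<ValType> params;
  std::vector<ValType> results;
  bool operator==(const FuncType& o) const {
    return params == o.params && results == o.results;
  }
};

// Hypothetical table element. Only `type` takes part in the signature check;
// the code pointer has one fixed shape to keep the sketch short.
struct Function {
  FuncType type;
  std::int32_t (*code)(std::int32_t);
};

enum class CallStatus { Ok, Trap };

CallStatus call_indirect(const std::vector<const Function*>& table,
                         const FuncType& expected, std::uint32_t index,
                         std::int32_t arg, std::int32_t& result) {
  // Bounds-checked table indirection (a null element is assumed to trap).
  if (index >= table.size() || table[index] == nullptr)
    return CallStatus::Trap;
  // Dynamic signature check against the caller's expected signature.
  if (!(table[index]->type == expected)) return CallStatus::Trap;
  result = table[index]->code(arg);
  return CallStatus::Ok;
}

// Example callee with signature (i32) -> i32.
std::int32_t twice(std::int32_t x) { return 2 * x; }
```

Two separately constructed `FuncType` values with the same parameter and result lists compare equal, which models two modules that each declare the same signature in their own types section.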

In the MVP, the single `call_indirect` operator accesses the [default table](#table).

Multiple return value calls will be possible, though possibly not in the
MVP. The details of multiple-return-value calls need clarification. Calling a
130 changes: 84 additions & 46 deletions DynamicLinking.md
@@ -1,49 +1,87 @@
# Dynamic linking

Dynamic loading of code is in [the MVP](MVP.md) in the form of
[modules](Modules.md), but all loaded modules have their own separate
[linear memory](AstSemantics.md#linear-memory) by default and cannot share
[function pointers](AstSemantics.md#calls). Limited collaboration between
modules is possible in the MVP by having two modules share the same linear
memory and invoke each other through the host environment.

True dynamic linking will allow developers to share memory, function pointers,
and future non-memory state such as global variables and thread-local variables
between WebAssembly dynamic libraries.

WebAssembly will support both load-time and run-time (`dlopen`) dynamic linking
of libraries.

One important requirement of dynamic linking is to allow the linked module
to have its own position-independent global data segment. This could be achieved
by specifying a new kind of link-time-initialized immutable global variable
which would be initialized with the address (in linear memory) of the modules'
global data segment. These immutable globals could also be used to provide
a linked module with the offsets of its function pointers in the instance's
function pointer tables. An important aspect of immutable globals is that they
could either be patched directly as constant values or implemented through a
[Global Offset Table](https://en.wikipedia.org/wiki/Position-independent_code)
in position-independent code.

Dynamic linking is especially useful when combined with a Content Distribution
Network (CDN) such as [hosted libraries][] because the library is only ever
downloaded and compiled once per user device. It can also allow for smaller
differential updates, which could be implemented in collaboration with
[service workers][].

We would like to standardize a single [ABI][] per source language, allowing for
WebAssembly libraries to interface with each other regardless of compiler. While
it is highly recommended for compilers targeting WebAssembly to adhere to the
specified ABI for interoperability, WebAssembly runtimes will be ABI agnostic,
so it will be possible to use a non-standard ABI for specialized purposes.

Although dynamic linking is not part of the MVP, it has significant implications
on many aspects of the design that do impact the MVP, such as the way linear
memory is managed, how module imports and exports are specified, and how globals
and function pointers work. Therefore we want to have some viable ideas to
ensure we don't standardize a design that unnecessarily complicates the design
or implementation of dynamic linking.

[hosted libraries]: https://developers.google.com/speed/libraries/
[service workers]: https://www.w3.org/TR/service-workers/
WebAssembly enables load-time and run-time (`dlopen`) dynamic linking in the
MVP by having multiple [instantiated modules](Modules.md)
share functions, [linear memories](AstSemantics.md#linear-memory),
[tables](AstSemantics.md#table) and [constants](AstSemantics.md#constants)
using module [imports](Modules.md#imports) and [exports](Modules.md#exports). In
particular, since all (non-local) state that a module can access can be imported
and exported and thus shared between separate modules' instances, toolchains
have the building blocks to implement dynamic loaders.

Since the manner in which modules are loaded and instantiated is defined by the
Copy link

Should we settle on the term embedding environment?

Copy link
Member Author

Grepping, we use "host environment" a bunch, "embedding environment" not at all, although we do refer to Web/non-Web/browser embeddings in 6 cases. Maybe we could normalize in a separate PR?

Copy link
Member

We used to have "embedder" in a few places, looks like only 2 are left now.

host environment (e.g., the [JavaScript API](JS.md)), dynamic linking requires
use of host-specific functionality to link two modules. At a minimum, the host
environment must provide a way to dynamically instantiate modules while
connecting exports to imports.

The simplest load-time dynamic linking scheme between modules A and B can be
achieved by having module A export functions, tables and memories that are
imported by B. A C++ toolchain can expose this functionality by using the
same function attributes currently used to export/import symbols from
native DSOs/DLLs:
```
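#ifdef _WIN32
# define EXPORT __declspec(dllexport)
# define IMPORT __declspec(dllimport)
#else
// Non-Windows toolchains use default visibility for both exports and imports.
# define EXPORT __attribute__ ((visibility ("default")))
# define IMPORT __attribute__ ((visibility ("default")))
#endif

// PF: pointer to a pointer-to-function taking no arguments and returning void.
typedef void (**PF)();

// Imported function that returns a PF.
IMPORT PF imp();
// Exported function: loads a function pointer through imp()'s result and calls it.
EXPORT void exp() { (*imp())(); }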
```
This code would, at a minimum, generate a WebAssembly module with imports for:
* the function `imp`
* the heap used to perform the load, when dereferencing the return value of `imp`
* the table used to perform the pointer-to-function call

and exports for:
* the function `exp`

A more realistic module using libc would have more imports including:
* an immutable `i32` global import for the offset in linear memory to place
global [data segments](Modules.md#data-section) and later use as a constant
base address when loading and storing from globals
* an immutable `i32` global import for the offset into the indirect function
table at which to place the module's indirectly called functions and later
compute their indices for address-of
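
The two immutable `i32` global imports listed above act as link-time base values. A minimal sketch of how generated code could use them follows; the struct `LinkBases` and the helper names are hypothetical, and a real toolchain would emit equivalent arithmetic inline rather than call helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values a loader supplies at instantiation time. They are
// immutable afterwards, mirroring immutable global imports.
struct LinkBases {
  const std::uint32_t memory_base;  // where this module's data segments start
  const std::uint32_t table_base;   // where this module's functions start
};

// Address in linear memory of a datum at `offset` inside the module's data.
inline std::uint32_t data_address(const LinkBases& m, std::uint32_t offset) {
  return m.memory_base + offset;
}

// Value of a C function pointer: the absolute table index of the module's
// function number `local_index` (the address-of computation).
inline std::uint32_t function_pointer(const LinkBases& m,
                                      std::uint32_t local_index) {
  return m.table_base + local_index;
}
```

Two modules compiled with identical local offsets end up with disjoint addresses and table indices as long as the loader hands them non-overlapping bases.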

One extra detail is what to use as the [module name](Modules.md#imports) for
imports (since WebAssembly has a two-level namespace). One option is to have a
Copy link
Member

Is there a reference for the two-level namespace stuff? I can't seem to find it, maybe just file a bug to add it later so the PR doesn't get bogged down?

Copy link
Member Author

Righto: #706.

single default module name for all C/C++ imports/exports (which then allows the
toolchain to put implementation-internal names in a separate namespace, avoiding
the need for `__`-prefix conventions).

To implement run-time dynamic linking (e.g., `dlopen` and `dlsym`):
* `dlopen` would compile and instantiate a new module, storing the compiled
instance in a host-environment table, returning the index to the caller.
* `dlsym` would be given this index, pull the instance out of the table,
search the instance's exports, append the found function to the function
table (using host-defined functionality in the MVP, but directly from
WebAssembly code in the
[future](FutureFeatures.md#more-table-operators-and-types)) and return the
table index of the appended element to the caller.
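
The two steps above can be sketched as follows. This is only a model: `Loader`, `Instance`, `dl_open`, and `dl_sym` are hypothetical names, the already-built `Instance` argument stands in for compiling and instantiating a module, and returning -1 for a missing handle or symbol is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Func = int (*)(int);

// Hypothetical instantiated module: just a map from export name to function.
struct Instance {
  std::map<std::string, Func> exports;
};

struct Loader {
  std::vector<Instance> instances;   // host-environment table of instances
  std::vector<Func> function_table;  // the shared function table

  // Stores the new instance and returns its index as the handle.
  std::uint32_t dl_open(Instance instantiated) {
    instances.push_back(std::move(instantiated));
    return static_cast<std::uint32_t>(instances.size() - 1);
  }

  // Looks up `name`, appends the function to the table, and returns the
  // index of the appended element, or -1 if the handle or name is unknown.
  std::int64_t dl_sym(std::uint32_t handle, const std::string& name) {
    if (handle >= instances.size()) return -1;
    const auto it = instances[handle].exports.find(name);
    if (it == instances[handle].exports.end()) return -1;
    function_table.push_back(it->second);
    return static_cast<std::int64_t>(function_table.size() - 1);
  }
};

// Example exported function.
int square(int x) { return x * x; }
```

The index returned by `dl_sym` is directly usable as a function pointer value, since indirect calls go through the same table.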

Copy link

Cool, and more than I expected for the MVP.

Note that the representation of a C function-pointer in WebAssembly is an index
into a function table, so the above scheme lines up perfectly with the
function-pointer return value of `dlsym`.

More complicated dynamic linking functionality (e.g., interposition, weak
symbols, etc.) can be simulated efficiently by assigning a function table
index to each weak/mutable symbol, calling the symbol via `call_indirect` on that
index, and mutating the underlying element as needed.
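
A sketch of that idea, under the assumption that each weak or interposable symbol owns one table slot: callers always go through the slot, so replacing the element redirects every later call. `SymbolTable`, `define_weak`, `interpose`, and the two example implementations are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Func = int (*)();

int default_impl() { return 1; }   // weak definition
int override_impl() { return 2; }  // strong or interposing definition

struct SymbolTable {
  std::vector<Func> table;  // stands in for the function table

  // Assigns a fresh table index to a weak/mutable symbol.
  std::uint32_t define_weak(Func f) {
    table.push_back(f);
    return static_cast<std::uint32_t>(table.size() - 1);
  }

  // Mutates the underlying element; callers keep using the same index.
  void interpose(std::uint32_t slot, Func f) { table.at(slot) = f; }

  // Models calling the symbol via call_indirect on its index.
  int call(std::uint32_t slot) const { return table.at(slot)(); }
};
```

The cost is one table indirection per call to such a symbol, which is the trade-off the paragraph above accepts in exchange for not patching call sites.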

After the MVP, we would like to standardize a single [ABI][] per source
language, allowing for WebAssembly libraries to interface with each other
regardless of compiler. Specifying an ABI requires that all ABI-related
future features (like SIMD, multiple return values and exception handling)
have been implemented. While it is highly recommended for compilers targeting
WebAssembly to adhere to the specified ABI for interoperability, WebAssembly
runtimes will be ABI agnostic, so it will be possible to use a non-standard ABI
for specialized purposes.

[ABI]: https://en.wikipedia.org/wiki/Application_binary_interface