Add JS Memory and Table API, support dynamic linking #682
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -80,29 +80,37 @@ operators. | |
|
||
## Linear Memory | ||
|
||
The main storage of a WebAssembly instance, called the *linear memory*, is a | ||
contiguous, byte-addressable range of memory spanning from offset `0` and | ||
extending up to a varying *memory size*. | ||
This size always is a multiple of the WebAssembly page size, | ||
which is 64KiB on all engines (though large page support may be added in the | ||
[future](FutureFeatures.md#large-page-support)). | ||
The initial state of linear memory is specified by the | ||
[module](Modules.md#linear-memory-section), and it can be dynamically grown by | ||
the [`grow_memory`](AstSemantics.md#resizing) operator. | ||
|
||
The linear memory can be considered | ||
to be an untyped array of bytes, and it is unspecified how embedders map this | ||
array into their process' own [virtual memory][]. The linear memory is | ||
sandboxed; it does not alias the execution engine's internal data structures, | ||
the execution stack, local variables, or other process memory. | ||
A *linear memory* is a contiguous, byte-addressable range of memory spanning | ||
from offset `0` and extending up to a varying *memory size*. This size is always | ||
a multiple of the WebAssembly page size, which is fixed to 64KiB (though large | ||
page support may be added in an opt-in manner in the | ||
[future](FutureFeatures.md#large-page-support)). The initial state of a linear | ||
memory is defined by the module's [linear memory](Modules.md#linear-memory-section) and | ||
[data](Modules.md#data-section) sections. The memory size can be dynamically | ||
increased by the [`grow_memory`](AstSemantics.md#resizing) operator. | ||
|
||
A linear memory can be considered to be an untyped array of bytes, and it is | ||
unspecified how embedders map this array into their process' own [virtual | ||
memory][]. Linear memory is sandboxed; it does not alias other linear memories, | ||
the execution engine's internal data structures, the execution stack, local | ||
variables, or other process memory. | ||
|
||
[virtual memory]: https://en.wikipedia.org/wiki/Virtual_memory | ||
|
||
In the MVP, linear memory is not shared between threads of execution. Separate | ||
instances can execute in separate threads but have their own linear memory and can | ||
only communicate through messaging, e.g. in browsers using `postMessage`. It | ||
will be possible to share linear memory between threads of execution when | ||
[threads](PostMVP.md#threads) are added. | ||
Every WebAssembly [instance](Modules.md) has one specially-designated *default* | ||
linear memory which is the linear memory accessed by all the | ||
[memory operators below](#linear-memory-access). In the MVP, there are *only* | ||
default linear memories but [new memory operators](FutureFeatures.md#multiple-tables-and-memories) | ||
may be added after the MVP which can also access non-default memories. | ||
|
||
If these non-default linear memories are not visible in the MVP design then perhaps just leave out mention of them.

They are "left out", insofar as there is no extra impl burden. But it's useful to explain the target state, though, since it motivates other aspects of the design. |
||
Linear memories (default or otherwise) can either be [imported](Modules.md#imports) | ||
or [defined inside the module](Modules.md#linear-memory-section), with defaultness | ||
indicated by a flag on the import or definition. After import or definition, | ||
there is no difference when accessing a linear memory whether it was imported or | ||
defined internally. | ||
|
||
So linear memories can be shared between instances in the MVP? E.g. created in one instance and then imported as the default in another. Let's generalize this to allow a sub-range of one linear memory to be the linear memory of another child instance. Let's add an API to allow the wasm code to create this child instance and to pass it a sub-range of the parent's linear memory. This would address many use cases, including protecting the parent from the child instance accessing its memory outside the range allocated.

I'd rather consider that in a separate issue; there are non-trivial performance and implementation implications.

How about considering it first as it meets many of the use cases for multiple linear memories. What are the 'non-trivial performance and implementation implications'? From what I understand these would have better performance than multiple linear memories because there is just one base per instance? So if the performance of this approach has 'non-trivial performance and implementation implications' then multiple linear memories is even more of a burden, plus the burden of managing and supporting these multiple linear memories. Please, let's look at the performance and implementation implications? Let's look at the plan.

Importing a subrange means that you can't rely on any guard pages being at the end since you could well be looking into the middle of a bigger linear memory. You could imagine a separate kind of import that says "this is an import of a subrange of a bigger linear memory" that then impeded guard-page-based optimizations, but that'd be an additional kind of import to be considered separately.

Yes, I guess so. Fwiw, guard pages are not necessary for the index masking strategy, just a spill area, so this strategy could be viable using index masking. |
||
In the MVP, linear memory cannot be shared between threads of execution. | ||
The addition of [threads](PostMVP.md#threads) will allow this. | ||
|
||
### Linear Memory Accesses | ||
|
||
|
@@ -143,6 +151,8 @@ size in which case integer wrapping is implied. | |
In addition to storing to memory, store instructions produce a value which is their | ||
`value` input operand before wrapping. | ||
|
||
The above operators operate on the [default linear memory](#linear-memory). | ||
|
||
### Addressing | ||
|
||
Each linear memory access operator has an address operand and an unsigned | ||
|
@@ -210,7 +220,7 @@ the [future](FutureFeatures.md#large-page-support)). | |
* `grow_memory` : grow linear memory by a given unsigned delta of pages. | ||
Return the previous memory size in units of pages or -1 on failure. | ||
|
||
When a maximum memory size is declared in the [memory section](Module.md#linear-memory-section), | ||
When a linear memory has a declared [maximum memory size](Modules.md#linear-memory-section), | ||
`grow_memory` must fail if it would grow past the maximum. However, | ||
`grow_memory` may still fail before the maximum if it was not possible to | ||
reserve the space up front or if enabling the reserved memory fails. | ||
Where would the linear memory maximum be declared for an imported memory? In the module in which it was created? Seems that they would need to be allowed to be created elsewhere for many use cases.

See Modules.md#imports which describes that import statements specify an initial and maximum which constrain the imported memory. |
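The `grow_memory` rules in this hunk (page-granular growth, a hard failure past the declared maximum, and a permitted failure before it) can be sketched from the embedder's side as follows. This is a minimal illustration, not how any engine implements it: the `LinearMemory` struct, `memory_init`, and the use of `realloc` are hypothetical, and zero-filling newly added pages is an assumption that the text above does not state.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WASM_PAGE_SIZE 65536u /* 64KiB, the fixed page size stated in the text */

/* Hypothetical embedder-side model of a single linear memory. */
typedef struct {
    uint8_t *bytes;     /* backing store, `pages * WASM_PAGE_SIZE` bytes */
    uint32_t pages;     /* current memory size, in pages */
    uint32_t max_pages; /* declared maximum memory size, in pages */
} LinearMemory;

/* Create a memory with the given initial and maximum sizes. Returns 0 on success. */
int memory_init(LinearMemory *m, uint32_t initial_pages, uint32_t max_pages) {
    if (initial_pages > max_pages) return -1;
    m->bytes = NULL;
    if (initial_pages != 0) {
        m->bytes = calloc((size_t)initial_pages * WASM_PAGE_SIZE, 1);
        if (m->bytes == NULL) return -1;
    }
    m->pages = initial_pages;
    m->max_pages = max_pages;
    return 0;
}

/* Grow by `delta_pages`; return the previous size in pages, or -1 on failure. */
int32_t grow_memory(LinearMemory *m, uint32_t delta_pages) {
    uint32_t old_pages = m->pages;
    /* Must fail if the result would exceed the declared maximum. */
    if (delta_pages > m->max_pages - old_pages) return -1;
    if (delta_pages == 0) return (int32_t)old_pages;

    size_t old_size = (size_t)old_pages * WASM_PAGE_SIZE;
    size_t new_size = old_size + (size_t)delta_pages * WASM_PAGE_SIZE;
    uint8_t *grown = realloc(m->bytes, new_size);
    /* May still fail before the maximum if the space cannot be obtained. */
    if (grown == NULL) return -1;

    /* Assumption: new pages start out zeroed. */
    memset(grown + old_size, 0, new_size - old_size);
    m->bytes = grown;
    m->pages = old_pages + delta_pages;
    return (int32_t)old_pages;
}
```

A caller would treat a `-1` result as "memory unchanged", since the sketch leaves `bytes` and `pages` untouched on both failure paths.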
||
|
@@ -232,12 +242,53 @@ operator may be added. However, due to normal fragmentation, applications are | |
instead expected to release unused physical pages from the working set using the | ||
[`discard`](FutureFeatures.md#finer-grained-control-over-memory) future feature. | ||
|
||
The above operators operate on the [default linear memory](#linear-memory). | ||
|
||
## Table | ||
|
||
A *table* is similar to a linear memory whose elements, instead of being bytes, | ||
are opaque values of a particular *table element type*. This allows the table to | ||
contain values—like GC references, raw OS handles, or native pointers—that are | ||
accessed by WebAssembly code indirectly through an integer index. This feature | ||
Same as above: should these integer indices be user-mapped using strings?

Same answer as above :) |
||
bridges the gap between low-level, untrusted linear memory and high-level | ||
opaque handles/references at the cost of a bounds-checked table indirection. | ||
|
||
Sounds like a good description of the use case and rationale. |
||
The table's element type constrains the type of elements stored | ||
in the table and allows engines to avoid some type checks on table use. | ||
When a WebAssembly value is stored in a table, the value's type must precisely | ||
match the element type. Depending on the operator/API used to store the value, | ||
this check may be static or dynamic. Just like linear memory, updates to a table | ||
are observed immediately by all instances that reference the table. Host | ||
environments may also allow storing non-WebAssembly values in tables in which | ||
case, as with [imports](Modules.md#imports), the meaning of using the value is | ||
defined by the host environment. | ||
|
||
Sounds good, but would the coercions need to be spelled out. For example are the arguments and results of imported functions coerced; are floats coerced to integers; strings to numbers; missing arguments to zero?

This is spelled out in |
||
Every WebAssembly [instance](Modules.md) has one specially-designated *default* | ||
table which is indexed by [`call_indirect`](#calls) and other future | ||
table operators. Tables can either be [imported](Modules.md#imports) or | ||
[defined inside the module](Modules.md#table-section), with defaultness | ||
indicated by a flag on the import or definition. After import or definition, | ||
there is no difference when calling into a table whether it was imported or | ||
defined internally. | ||
|
||
Tables of functions seems a use case worthy of special attention, and would warrant allowing multiple function tables in the MVP or well before tables of other objects. Could this note that the table element type would be defined in the module instance. If this were not static then it would not be of user for optimize of AOT compiled code.

There are new sections in FutureFeatures.md on supporting multiple tables and multiple element kinds. |
||
In the MVP, the primary purpose of tables is to implement indirect function | ||
calls in C/C++ using an integer index as the pointer-to-function and the table | ||
to hold the array of indirectly-callable functions. Thus, in the MVP: | ||
* tables may only be accessed from WebAssembly code via [`call_indirect`](#calls); | ||
* the only allowed table element type is `anyfunc` (function with any signature); | ||
* tables may not be directly mutated or resized from WebAssembly code; | ||
this can only be done through the host environment (e.g., the | ||
`WebAssembly` [JavaScript API](JS.md#webassemblytable-objects)). | ||
|
||
These restrictions may be relaxed in the | ||
[future](FutureFeatures.md#more-table-operators-and-types). | ||
|
||
## Local variables | ||
|
||
Each function has a fixed, pre-declared number of local variables which occupy a single | ||
Each function has a fixed, pre-declared number of *local variables* which occupy a single | ||
index space local to the function. Parameters are addressed as local variables. Local | ||
variables do not have addresses and are not aliased by linear memory. Local | ||
variables have value types and are initialized to the appropriate zero value for their | ||
variables have [value types](#types) and are initialized to the appropriate zero value for their | ||
type at the beginning of the function, except parameters which are initialized to the values | ||
of the arguments passed to the function. | ||
|
||
|
@@ -248,6 +299,32 @@ The details of index space for local variables and their types will be further c | |
e.g. whether locals with type `i32` and `i64` must be contiguous and separate from | ||
others, etc. | ||
|
||
## Global variables | ||
|
||
A *global variable* stores a single value of a fixed [value type](#types) and may be | ||
declared either *mutable* or *immutable*. This provides WebAssembly with memory | ||
locations that are disjoint from any [linear memory](#linear-memory) and thus | ||
cannot be arbitrarily aliased as bits. | ||
Is the intent that when we support multiple memories those can also be loaded as immutable? That seems very useful for a bunch of usecases, and orthogonal to

Correctamundo |
||
|
||
Great, see also #333 These might also address some code path selection based on features. But if only immutable globals are need then perhaps just implements these and not mutable globals? Do we know if the compiler really need un-aliased locations, or is the linear memory just fine? Problem with mutable globals is that they need a pointer outside of the linear memory, increasing register pressure. The 32-bit x86 might be an exception where the offset can be baked in, but with code sharing even that would not be possible.

For these global variables to be dynamically importable, or 'instantiation-time immutable values', it would appear that they can not be baked into the code in AOT compilation. If so then it is not clear what utility they have, and they would not appear to really be necessary? I see some potential for the compiler to optimize code knowing that they will not change, similar to a I recall mention of patchable values in prior discussion, and are these global constant variables intended to meet this use case? This would appear to require each instance to have a separate copy of the code, but that would somewhat defeat the purpose of sharing code. For immutable global variables defined in the module, could they be baked into the code AOT? There might be some significant performance burdens for shared code, extra dereferences. Does the wasm model allow the web developer to make the choice on these trade offs. Should a module have a flag to indicate that it is likely to be shared code, so that only in this case the runtime might avoid re-compiling or copying the code for each instance.

For immutable globals, a compiler can either perform instantiation-time patching of immediates or use indirect through the global table. This has the same tradeoffs (dirtying pages vs. GOT-indirection) as with normal dynamic linking. Mutable global variables need to be supported for engines that compile asm.js to wasm anyway, so it's no big deal to support in the MVP and might actually be useful for a tool (e.g., asm.js2wasm) that needs to acquire a fixed-size region memory without perturbing linear memory. Importing/Exporting mutable global variables is an extra burden, though, and that is explicitly ruled out in the MVP if you look at JS.md.

Since no use case for mutable global variables has been made I do not support them being added back here. This matter had already been considered, and they were removed. Implementations are welcome to support them internally for asm.js but this need not be part of wasm. No new performance number have been presented after the decision to remove the global variables that suggests they were needed? |
||
Global variables are accessed via an integer index into the module-defined | ||
[global index space](Modules.md#global-index-space). Global variables can | ||
either be [imported](Modules.md#imports) or [defined inside the module](Modules.md#global-section). | ||
After import or definition, there is no difference when accessing a global. | ||
Global imports also specify mutability?

That's right; this is currently mentioned in Modules.md#imports for "global variable imports". |
||
|
||
* `get_global`: get the current value of a global variable | ||
* `set_global`: set the current value of a global variable | ||
|
||
It is a validation error for a `set_global` to index an immutable global variable. | ||
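The validation rule above is static: a module containing a `set_global` on an immutable global is rejected before any code runs. A minimal sketch of such a check is shown below. The `Global`, `Instr`, and `validate_global_accesses` names are hypothetical, globals are limited to `i32` for brevity, and rejecting an out-of-range index is an assumption added for safety rather than something this section specifies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a global variable (only i32 values, for brevity). */
typedef struct {
    int32_t value;
    bool is_mutable; /* declared mutable or immutable */
} Global;

typedef enum { OP_GET_GLOBAL, OP_SET_GLOBAL } Opcode;

/* A global access: the operator plus its index into the global index space. */
typedef struct {
    Opcode op;
    uint32_t index;
} Instr;

/* Return true if every global access in `code` passes validation. */
bool validate_global_accesses(const Instr *code, size_t code_len,
                              const Global *globals, size_t global_count) {
    for (size_t i = 0; i < code_len; i++) {
        /* Assumption: an index outside the global index space is invalid. */
        if (code[i].index >= global_count) return false;
        /* `set_global` must not target an immutable global. */
        if (code[i].op == OP_SET_GLOBAL && !globals[code[i].index].is_mutable)
            return false;
    }
    return true;
}
```

Because the check happens at validation time, an engine is free to treat immutable globals as constants when compiling the function bodies.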
|
||
In the MVP, the primary use case of global variables is to represent | ||
instantiation-time immutable values as a useful building block for | ||
[dynamic linking](DynamicLinking.md). | ||
|
||
After the MVP, when [reference types](GC.md) are added to the set of [value types](#types), | ||
global variables will be necessary to allow sharing reference types between | ||
[threads](PostMVP.md#threads) since shared linear memory cannot load or store | ||
references. | ||
|
||
## Control flow structures | ||
|
||
WebAssembly offers basic structured control flow with the following constructs. | ||
|
@@ -314,43 +391,32 @@ explicit accesses to linear memory. | |
In the MVP, the length of the return types sequence may only be 0 or 1. This | ||
restriction may be lifted in the future. | ||
|
||
Direct calls to a function specify the callee by index into a *main function table*. | ||
Direct calls to a function specify the callee by an index into the | ||
[function index space](Modules.md#function-index-space). | ||
|
||
* `call`: call function directly | ||
|
||
A direct call to a function with a mismatched signature is a module verification error. | ||
|
||
Like direct calls, calls to [imports](Modules.md#imports-and-exports) specify | ||
the callee by index into an *imported function table* defined by the sequence of import | ||
declarations in the module import section. A direct call to an imported function with a | ||
mismatched signature is a module verification error. | ||
|
||
* `call_import` : call imported function directly | ||
|
||
Indirect calls allow calling target functions that are unknown at compile time. | ||
The target function is an expression of value type `i32` and is always the first | ||
input into the indirect call. | ||
|
||
A `call_indirect` specifies the *expected* signature of the target function with | ||
an index into a *signature table* defined by the module. An indirect call to a | ||
function with a mismatched signature causes a trap. | ||
Indirect calls to a function indicate the callee with an `i32` index into | ||
a [table](#table). The *expected* signature of the target function (specified | ||
by its index in the [types section](BinaryEncoding.md#type-section)) is given as | ||
a second immediate. | ||
|
||
* `call_indirect`: call function indirectly | ||
|
||
Functions from the main function table are made addressable by defining an | ||
*indirect function table* that consists of a sequence of indices into the | ||
module's main function table. A function from the main table may appear more | ||
than once in the indirect function table. Functions not appearing in the | ||
indirect function table cannot be called indirectly. | ||
|
||
In the MVP, indices into the indirect function table are local to a single | ||
module, so wasm modules may use `i32` constants to refer to entries in their own | ||
indirect function table. The [dynamic linking](DynamicLinking.md) feature is | ||
necessary for two modules to pass function pointers back and forth. This will | ||
mean concatenating indirect function tables and adding an operator `address_of` | ||
that computes the absolute index into the concatenated table from an index in a | ||
module's local indirect table. JITing may also mean appending more functions to | ||
the end of the indirect function table. | ||
Unlike `call`, which checks that the caller and callee signatures match | ||
statically as part of validation, `call_indirect` checks for signature match | ||
*dynamically*, comparing the caller's expected signature with the callee function's | ||
signature and trapping if there is a mismatch. Since the callee may be in a | ||
different module which necessarily has a separate [types section](BinaryEncoding.md#type-section), | ||
and thus index space of types, the signature match must compare the underlying | ||
[`func_type`](https://github.com/WebAssembly/spec/blob/master/ml-proto/spec/types.ml#L5). | ||
As noted [above](#table), table elements may also be host-environment-defined | ||
values in which case the meaning of a call (and how the signature is checked) | ||
is defined by the host-environment, much like calling an import. | ||
|
||
In the MVP, the single `call_indirect` operator accesses the [default table](#table). | ||
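The dynamic check performed by `call_indirect` can be sketched as below. The point of the illustration is that the comparison is structural (on the function type itself) rather than on type-section indices, because caller and callee may come from different modules. All names here (`FuncType`, `TableElem`, `check_call_indirect`) are hypothetical, the four-parameter limit is arbitrary, and treating an empty table slot as a trap is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { T_I32, T_I64, T_F32, T_F64 } ValType;

/* Hypothetical structural function type; the MVP allows 0 or 1 results. */
typedef struct {
    size_t param_count;
    ValType params[4];
    bool has_result;
    ValType result; /* ignored when has_result is false */
} FuncType;

/* A table element: here just the callee's type (NULL means an empty slot). */
typedef struct {
    const FuncType *type;
} TableElem;

typedef enum {
    CALL_OK,
    TRAP_OUT_OF_BOUNDS,
    TRAP_NULL_ELEMENT, /* assumption: empty slots trap */
    TRAP_SIGNATURE_MISMATCH
} CallStatus;

/* Compare two function types structurally, not by type-section index. */
bool func_type_equal(const FuncType *a, const FuncType *b) {
    if (a->param_count != b->param_count || a->has_result != b->has_result)
        return false;
    for (size_t i = 0; i < a->param_count; i++)
        if (a->params[i] != b->params[i]) return false;
    return !a->has_result || a->result == b->result;
}

/* The checks a call_indirect performs before transferring control. */
CallStatus check_call_indirect(const TableElem *table, uint32_t table_size,
                               uint32_t index, const FuncType *expected) {
    if (index >= table_size) return TRAP_OUT_OF_BOUNDS;
    if (table[index].type == NULL) return TRAP_NULL_ELEMENT;
    if (!func_type_equal(table[index].type, expected))
        return TRAP_SIGNATURE_MISMATCH;
    return CALL_OK;
}
```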
|
||
Multiple return value calls will be possible, though possibly not in the | ||
MVP. The details of multiple-return-value calls need clarification. Calling a | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1,87 @@ | ||
# Dynamic linking | ||
|
||
Dynamic loading of code is in [the MVP](MVP.md) in the form of | ||
[modules](Modules.md), but all loaded modules have their own separate | ||
[linear memory](AstSemantics.md#linear-memory) by default and cannot share | ||
[function pointers](AstSemantics.md#calls). Limited collaboration between | ||
modules is possible in the MVP by having two modules share the same linear | ||
memory and invoke each other through the host environment. | ||
|
||
True dynamic linking will allow developers to share memory, function pointers, | ||
and future non-memory state such as global variables and thread-local variables | ||
between WebAssembly dynamic libraries. | ||
|
||
WebAssembly will support both load-time and run-time (`dlopen`) dynamic linking | ||
of libraries. | ||
|
||
One important requirement of dynamic linking is to allow the linked module | ||
to have its own position-independent global data segment. This could be achieved | ||
by specifying a new kind of link-time-initialized immutable global variable | ||
which would be initialized with the address (in linear memory) of the modules' | ||
global data segment. These immutable globals could also be used to provide | ||
a linked module with the offsets of its function pointers in the instance's | ||
function pointer tables. An important aspect of immutable globals is that they | ||
could either be patched directly as constant values or implemented through a | ||
[Global Offset Table](https://en.wikipedia.org/wiki/Position-independent_code) | ||
in position-independent code. | ||
|
||
Dynamic linking is especially useful when combined with a Content Distribution | ||
Network (CDN) such as [hosted libraries][] because the library is only ever | ||
downloaded and compiled once per user device. It can also allow for smaller | ||
differential updates, which could be implemented in collaboration with | ||
[service workers][]. | ||
|
||
We would like to standardize a single [ABI][] per source language, allowing for | ||
WebAssembly libraries to interface with each other regardless of compiler. While | ||
it is highly recommended for compilers targeting WebAssembly to adhere to the | ||
specified ABI for interoperability, WebAssembly runtimes will be ABI agnostic, | ||
so it will be possible to use a non-standard ABI for specialized purposes. | ||
|
||
Although dynamic linking is not part of the MVP, it has significant implications | ||
on many aspects of the design that do impact the MVP, such as the way linear | ||
memory is managed, how module imports and exports are specified, and how globals | ||
and function pointers work. Therefore we want to have some viable ideas to | ||
ensure we don't standardize a design that unnecessarily complicates the design | ||
or implementation of dynamic linking. | ||
|
||
[hosted libraries]: https://developers.google.com/speed/libraries/ | ||
[service workers]: https://www.w3.org/TR/service-workers/ | ||
WebAssembly enables load-time and run-time (`dlopen`) dynamic linking in the | ||
MVP by having multiple [instantiated modules](Modules.md) | ||
share functions, [linear memories](AstSemantics.md#linear-memory), | ||
[tables](AstSemantics.md#table) and [constants](AstSemantics.md#constants) | ||
using module [imports](Modules.md#imports) and [exports](Modules.md#exports). In | ||
particular, since all (non-local) state that a module can access can be imported | ||
and exported and thus shared between separate modules' instances, toolchains | ||
have the building blocks to implement dynamic loaders. | ||
|
||
Since the manner in which modules are loaded and instantiated is defined by the | ||
Should we settle on the term embedding environment?

Grepping, we use "host environment" a bunch, "embedding environment" not at all, although we do refer to Web/non-Web/browser embeddings in 6 cases. Maybe we could normalize in a separate PR?

We used to have "embedder" in a few places, looks like only 2 are left now. |
||
host environment (e.g., the [JavaScript API](JS.md)), dynamic linking requires | ||
use of host-specific functionality to link two modules. At a minimum, the host | ||
environment must provide a way to dynamically instantiate modules while | ||
connecting exports to imports. | ||
|
||
The simplest load-time dynamic linking scheme between modules A and B can be | ||
achieved by having module A export functions, tables and memories that are | ||
imported by B. A C++ toolchain can expose this functionality by using the | ||
same function attributes currently used to export/import symbols from | ||
native DSOs/DLLs: | ||
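```
// Reuse the native DSO/DLL visibility attributes for wasm imports/exports.
#ifdef _WIN32
# define EXPORT __declspec(dllexport)
# define IMPORT __declspec(dllimport)
#else
# define EXPORT __attribute__ ((visibility ("default")))
# define IMPORT __attribute__ ((visibility ("default")))
#endif

// PF is a pointer to a pointer-to-function.
typedef void (**PF)();

// imp() is imported; its result points at a function pointer in memory.
IMPORT PF imp();
// exp() is exported; it loads that function pointer and calls through it.
EXPORT void exp() { (*imp())(); }
```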
This code would, at a minimum, generate a WebAssembly module with imports for: | ||
* the function `imp` | ||
* the heap used to perform the load, when dereferencing the return value of `imp` | ||
* the table used to perform the pointer-to-function call | ||
|
||
and exports for: | ||
* the function `exp` | ||
|
||
A more realistic module using libc would have more imports including: | ||
* an immutable `i32` global import for the offset in linear memory to place | ||
global [data segments](Modules.md#data-section) and later use as a constant | ||
base address when loading and storing from globals | ||
* an immutable `i32` global import for the offset into the indirect function | ||
table at which to place the module's indirectly called functions and later | ||
compute their indices for address-of | ||
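The two immutable `i32` imports in the list above act as link-time base values. A rough sketch of how generated code might use them follows; the `LinkBases` struct and both helper names are hypothetical, and a real toolchain would emit the additions inline rather than call functions.

```c
#include <stdint.h>

/* Hypothetical view of the two immutable i32 global imports described above. */
typedef struct {
    uint32_t memory_base; /* offset in linear memory of this module's data */
    uint32_t table_base;  /* offset in the function table of its functions */
} LinkBases;

/* Address of a module-level variable located `offset` bytes into the
   module's data segment. */
uint32_t data_address(const LinkBases *bases, uint32_t offset) {
    return bases->memory_base + offset;
}

/* Value of `&f` for the module's `local_index`-th indirectly callable
   function: an index into the shared function table. */
uint32_t function_pointer(const LinkBases *bases, uint32_t local_index) {
    return bases->table_base + local_index;
}
```

Since both bases are immutable after instantiation, an engine could patch them in as constants or read them through an indirection, whichever it prefers.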
|
||
One extra detail is what to use as the [module name](Modules.md#imports) for | ||
imports (since WebAssembly has a two-level namespace). One option is to have a | ||
Is there a reference for the two-level namespace stuff? I can't seem to find it, maybe just file a bug to add it later so the PR doesn't get bogged down?

Righto: #706. |
||
single default module name for all C/C++ imports/exports (which then allows the | ||
toolchain to put implementation-internal names in a separate namespace, avoiding | ||
the need for `__`-prefix conventions). | ||
|
||
To implement run-time dynamic linking (e.g., `dlopen` and `dlsym`): | ||
* `dlopen` would compile and instantiate a new module, storing the compiled | ||
instance in a host-environment table, returning the index to the caller. | ||
* `dlsym` would be given this index, pull the instance out of the table, | ||
search the instance's exports, append the found function to the function | ||
table (using host-defined functionality in the MVP, but directly from | ||
WebAssembly code in the | ||
[future](FutureFeatures.md#more-table-operators-and-types)) and return the | ||
table index of the appended element to the caller. | ||
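The `dlopen`/`dlsym` scheme in the list above can be modelled with two host-side tables. The sketch below is only an illustration under stated assumptions: compilation and instantiation are elided (the caller passes an already-built `Instance`), the fixed table capacities are arbitrary, and every name (`sketch_dlopen`, `sketch_dlsym`, `Export`, `forty_two`) is hypothetical rather than part of any real API.

```c
#include <stddef.h>
#include <string.h>

typedef int (*Func)(void);

/* Hypothetical instance: just a list of named function exports. */
typedef struct { const char *name; Func func; } Export;
typedef struct { const Export *exports; size_t export_count; } Instance;

#define MAX_INSTANCES 8
#define MAX_TABLE 64

static const Instance *instance_table[MAX_INSTANCES]; /* host-environment table */
static size_t instance_count;
static Func function_table[MAX_TABLE];                /* the function table */
static size_t function_count;

/* Store an instance in the host table and return its index, or -1 if full. */
int sketch_dlopen(const Instance *instance) {
    if (instance_count == MAX_INSTANCES) return -1;
    instance_table[instance_count] = instance;
    return (int)instance_count++;
}

/* Look up `name` in the instance's exports, append the function to the
   function table, and return the new table index (or -1 on failure). */
int sketch_dlsym(int handle, const char *name) {
    if (handle < 0 || (size_t)handle >= instance_count) return -1;
    const Instance *instance = instance_table[handle];
    for (size_t i = 0; i < instance->export_count; i++) {
        if (strcmp(instance->exports[i].name, name) == 0) {
            if (function_count == MAX_TABLE) return -1;
            function_table[function_count] = instance->exports[i].func;
            return (int)function_count++;
        }
    }
    return -1;
}

/* Sample export used to exercise the sketch. */
int forty_two(void) { return 42; }
```

The index returned by `sketch_dlsym` plays the role of the C function pointer: calling through it is a lookup in `function_table`, which is what `call_indirect` does.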
|
||
Cool, and more than I expected for the MVP. |
||
Note that the representation of a C function-pointer in WebAssembly is an index | ||
into a function table, so the above scheme lines up perfectly with the | ||
function-pointer return value of `dlsym`. | ||
|
||
More complicated dynamic linking functionality (e.g., interposition and weak | ||
symbols) can be simulated efficiently by assigning a function table | ||
index to each weak/mutable symbol, calling the symbol via `call_indirect` on that | ||
index, and mutating the underlying element as needed. | ||
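As a rough illustration of the interposition idea above: the symbol is bound to a fixed table index, callers always go through that index, and rebinding the symbol is a single table write. The names below (`call_through_table`, `interpose`, `default_impl`, `override_impl`) are hypothetical, and aborting on an out-of-range index stands in for a trap.

```c
#include <stdlib.h>

typedef int (*Func)(void);

#define TABLE_SIZE 4
static Func function_table[TABLE_SIZE];

int default_impl(void) { return 1; }  /* e.g. a weak definition */
int override_impl(void) { return 2; } /* e.g. a strong definition loaded later */

/* Stand-in for `call_indirect`: bounds-check, then call the current element. */
int call_through_table(unsigned index) {
    if (index >= TABLE_SIZE || function_table[index] == NULL) abort();
    return function_table[index]();
}

/* Rebind the symbol at `index`; existing call sites need no changes. */
void interpose(unsigned index, Func replacement) {
    if (index >= TABLE_SIZE) abort();
    function_table[index] = replacement;
}
```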
|
||
After the MVP, we would like to standardize a single [ABI][] per source | ||
language, allowing for WebAssembly libraries to interface with each other | ||
regardless of compiler. Specifying an ABI requires that all ABI-related | ||
future features (like SIMD, multiple return values and exception handling) | ||
have been implemented. While it is highly recommended for compilers targeting | ||
WebAssembly to adhere to the specified ABI for interoperability, WebAssembly | ||
runtimes will be ABI agnostic, so it will be possible to use a non-standard ABI | ||
for specialized purposes. | ||
|
||
[ABI]: https://en.wikipedia.org/wiki/Application_binary_interface |
We can split this into a follow-up: why not have a user-defined mapping for memory, mapping the numbers to strings? i.e. a module would say "I have a memory called `HEAP` which I map to `1`, and a memory called `SECRETS` which I map to `2`". That way different toolchains don't have to agree on magical numbers but just on strings. Two dynamic objects can use different numbering schemes if they want, and things will align fine if they agree on string names.

And for now, we'd only have `HEAP` which maps to `1` (or `default` or whatever), anything else would be invalid (and we'd add support for later).

In a sense, the PR already does this (if we're talking about imports): the string is the imported name, the index is the immutable global it is stored in. (But there is this special "default" memory which avoids the need to give every load/store an immediate index which was the objection practically everyone had to the previous iteration of the PR.)