Adds a working (post-order) encoder & decoder, plus several additions to make them useful. For example, you can now invoke the interpreter like
```
wasm module.wast -o module.wasm
```
to convert from text to binary. In a while, the same will also be possible in the inverse direction.
I also extended the script language with (input <file>) and (output <file>) commands. Both are supposed to be able to handle both wast and wasm eventually. The former allows including other scripts or binary modules, the latter allows conversion as part of a script.
Finally, the command line now supports an -e <script> option, which allows giving script commands directly. This is useful, for example, when intermingled with binary module arguments, e.g., to invoke exports:
```
wasm module.wasm -e '(invoke "foo")'
```
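When driving this from a script rather than a shell, the quoting around the `-e` argument is easy to get wrong. A minimal sketch of building the same command line as an argument list; the helper names `invoke_command` and `run_invoke` are hypothetical and not part of this PR, and export names that would need string escaping are simply rejected:

```python
import subprocess

def invoke_command(module_path, export_name, wasm="wasm"):
    """Build the argv for: wasm <module> -e '(invoke "<name>")'."""
    # Assumption: names containing quotes or backslashes would need
    # escaping inside the script string; this sketch does not attempt it.
    if '"' in export_name or "\\" in export_name:
        raise ValueError("export name needs escaping, not handled here")
    return [wasm, module_path, "-e", '(invoke "%s")' % export_name]

def run_invoke(module_path, export_name, wasm="wasm"):
    # Passing a list (no shell) keeps the script argument intact.
    return subprocess.call(invoke_command(module_path, export_name, wasm))
```

For example, `invoke_command("module.wasm", "foo")` yields the same arguments as the command line above.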
Furthermore, this extends the test runner with encoding & decoding of all test files, via wast->wasm transcoding and loading of the resulting modules. This works for all current tests.
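Roughly, the transcoding check for a single test file looks like the following sketch. It is an illustration under assumptions, not the actual runner code: the function names and the output directory are hypothetical, and it relies only on the `-d` (dry mode) and `-o` options described in the README changes.

```python
import os
import subprocess

def transcode_commands(wast_path, out_dir, wasm="wasm"):
    """Return the two command lines for one test file:
    encode the .wast to .wasm, then load (decode) the result."""
    base = os.path.splitext(os.path.basename(wast_path))[0]
    wasm_path = os.path.join(out_dir, base + ".wasm")
    return [
        # -d: dry mode, so the module is converted but not run
        [wasm, "-d", wast_path, "-o", wasm_path],
        # load the produced binary again to exercise the decoder
        [wasm, "-d", wasm_path],
    ]

def check_transcoding(wast_path, out_dir, wasm="wasm"):
    # The check passes only if both steps exit with status 0.
    for argv in transcode_commands(wast_path, out_dir, wasm):
        if subprocess.call(argv) != 0:
            return False
    return True
```

Keeping command construction separate from execution makes the sketch easy to inspect without having the interpreter built.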
There are still a number of smaller TODOs left, to be addressed in follow-ups. Other follow-up work: wasm->wast reverse transcoding, more aggressive tests and testing capabilities. We'll probably also need to revise the use of different integer types throughout the spec, which is not always consistent with what the binary format supports.
ml-proto/README.md
39 additions & 11 deletions
@@ -5,10 +5,13 @@ This repository implements a prototypical reference interpreter for WebAssembly.
 Currently, it can
 
 * *parse* a simple S-expression format,
+* *decode* the binary format (work in progress),
 * *validate* modules defined in it,
-* *execute* invocations to functions exported by a module.
+* *execute* invocations to functions exported by a module,
+* *encode* the binary format,
+* *prettyprint* the S-expression format (work in progress).
 
-The file format is a (very dumb) form of *script* that cannot just define a module, but also batch a sequence of invocations.
+The S-expression format is a (very dumb) form of *script* that cannot just define a module, but in fact a sequence of them, and a batch of invocations, assertions, and conversions to each one. As such it is different from the binary format, with the additional functionality purely intended as testing infrastructure. (See [below](#scripts) for details.)
 
 The interpreter can also be run as a REPL, allowing to enter pieces of scripts interactively.
 
@@ -61,17 +64,34 @@ Either way, in order to run the test suite you'll need to have Python installed.
 You can call the executable with
 
 ```
-wasm [option] [file ...]
+wasm [option | file ...]
 ```
 
-where `file` is a script file (see below) to be run. If no file is given, you'll get into the REPL and can enter script commands interactively. You can also get into the REPL by explicitly passing `-` as a file name. You can do that in combination to giving a module file, so that you can then invoke its exports interactively, e.g.:
+where `file`, depending on its extension, either should be an S-expression script file (see below) to be run, or a binary module file to be loaded.
+
+A file prefixed by `-o` is taken to be an output file. Depending on its extension, this will write out the preceding module definition in either S-expression or binary format. This option can be used to convert between the two in both directions, e.g.:
 
 ```
-./wasm module.wast -
+wasm -d module.wast -o module.wasm
+wasm -d module.wasm -o module.wast
 ```
-Note however that the REPL currently is too dumb to allow multi-line input. :)
 
-See `wasm -h` for (the few) options.
+The `-d` option selects "dry mode" and ensures that the input module is not run, even if it has a start section.
+In the second case, the produced script contains exactly one module definition (work in progress).
+
+Finally, the option `-e` allows to provide arbitrary script commands directly on the command line. For example:
+
+```
+wasm module.wasm -e '(invoke "foo")'
+```
+
+If neither a file nor any of the previous options is given, you'll land in the REPL and can enter script commands interactively. You can also get into the REPL by explicitly passing `-` as a file name. You can do that in combination to giving a module file, so that you can then invoke its exports interactively, e.g.:
+
+```
+wasm module.wast -
+```
+
+See `wasm -h` for (the few) additional options.
 
 
 ## S-Expression Syntax
@@ -168,9 +188,13 @@ cmd:
 ( assert_return_nan (invoke <name> <expr>* )) ;; assert return with floating point nan result of invocation
 ( assert_trap (invoke <name> <expr>* ) <failure> ) ;; assert invocation traps with given failure string
 ( assert_invalid <module> <failure> ) ;; assert invalid module with given failure string
+( input <string> ) ;; read script or module from file
+( output <string> ) ;; output module to file
 ```
 
-Invocation is only possible after a module has been defined.
+Commands are executed in sequence. Invocation, assertions, and output apply to the most recently defined module (the _current_ module), and are only possible after a module has been defined. Note that there only ever is one current module, the different module definitions cannot interact.
+
+The input and output commands determine the requested file format from the file name extension. They can handle both `.wast` and `.wasm` files. In the case of input, a `.wast` script will be recursively executed.
 
 Again, this is only a meta-level for testing, and not a part of the language proper.
 
@@ -202,11 +226,15 @@ The implementation consists of the following parts:
 
 * *Parser* (`lexer.mll`, `parser.mly`, `desguar.ml[i]`). Generated with ocamllex and ocamlyacc. The lexer does the opcode encoding (non-trivial tokens carry e.g. type information as semantic values, as declared in `parser.mly`), the parser the actual S-expression parsing. The parser generates a full AST that is desugared into the kernel AST in a separate pass.
 
+* *Pretty Printer* (`prettyprint.ml[i]`). Turns a module AST back into the textual S-expression format. (Work in progress)
+
+* *Decoder*/*Encoder* (`decode.ml[i]`, `encode.ml[i]`). The former (work in progress) parses the binary format and turns it into an AST, the latter does the inverse.
+
 * *Validator* (`check.ml[i]`). Does a recursive walk of the kernel AST, passing down the *expected* type for expressions, and checking each expression against that. An expected empty type can be matched by any result, corresponding to implicit dropping of unused values (e.g. in a block).
 
 * *Evaluator* (`eval.ml[i]`, `values.ml`, `arithmetic.ml[i]`, `int.ml`, `float.ml`, `memory.ml[i]`, and a few more). Evaluation of control transfer (`br` and `return`) is implemented using local exceptions as "labels". While these are allocated dynamically in the code and addressed via a stack, that is merely to simplify the code. In reality, these would be static jumps.
 
-* *Driver* (`main.ml`, `script.ml[i]`, `error.ml`, `print.ml[i]`, `flags.ml`). Executes scripts, reports results or errors, etc.
+* *Driver* (`main.ml`, `run.ml[i]`, `script.ml[i]`, `error.ml`, `print.ml[i]`, `flags.ml`). Executes scripts, reports results or errors, etc.
 
 The most relevant pieces are probably the validator (`check.ml`) and the evaluator (`eval.ml`). They are written to look as much like a "specification" as possible. Hopefully, the code is fairly self-explanatory, at least for those with a passing familiarity with functional programming.
 
@@ -215,6 +243,6 @@ In typical FP convention (and for better readability), the code tends to use sin