Stack machine semantics #323

rossberg · 2016-08-24T15:15:47Z

This implements the stack machine semantics for Wasm:

Extends syntax to allow raw instruction sequences. Prior expression forms remain as syntax macros. See README for syntax summary.
Changes AST to consist of instruction sequences. Eliminates the kernel/AST distinction and corresponding desugaring.
Reimplements type checking in terms of that. Skips dead code (for now).
Reimplements evaluation as a small-step rewriting semantics over instruction sequences plus administrative forms.
Adjusts & extends tests (though we should add more).
Various clean-ups along the way.

I apologise for the monolithic PR. Breaking the move over to a stack machine semantics into meaningful smaller steps is kind of impossible.

The main downside of this change is that evaluation got much slower, because it is now implemented by term rewriting, literally modelling a small-step reduction semantics as on paper. The upside is that this makes it suitable for explicitly modelling threads later on.
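To make the trade-off concrete, here is a minimal sketch of what "evaluation by term rewriting" can look like. It is not the ml-proto implementation: the tuple encoding, the `step` and `run` helpers, and the three-instruction subset are all assumptions chosen for illustration. Each step rescans the sequence and rebuilds it, which hints at why this style is slow compared to a direct interpreter.

```python
# Illustrative sketch only: a tiny small-step rewriting evaluator.
# The instruction encoding (tuples) and helper names are hypothetical.
MASK32 = 0xFFFFFFFF


def is_value(instr):
    # In a rewriting semantics, constants play the role of values.
    return instr[0] == "i32.const"


def step(seq):
    """Rewrite the first non-value instruction in seq.

    Returns the new sequence, or None if seq consists only of values.
    """
    for i, instr in enumerate(seq):
        if is_value(instr):
            continue
        op = instr[0]
        if op == "nop":
            return seq[:i] + seq[i + 1:]
        if op in ("i32.add", "i32.mul"):
            if i < 2:
                raise ValueError(f"{op}: needs two operands")
            # Everything before position i is a value, so these are constants.
            a, b = seq[i - 2][1], seq[i - 1][1]
            result = (a + b if op == "i32.add" else a * b) & MASK32
            return seq[:i - 2] + [("i32.const", result)] + seq[i + 1:]
        raise ValueError(f"unknown instruction {op}")
    return None


def run(seq):
    """Reduce seq to values, returning every intermediate sequence."""
    trace = [seq]
    while True:
        nxt = step(seq)
        if nxt is None:
            return trace
        seq = nxt
        trace.append(seq)


program = [("i32.const", 3), ("i32.const", 4), ("i32.add",)]
for state in run(program):
    print(state)
```

The trace is the sequence of terms one would write down when reducing by hand, which is the property that makes this style a candidate for modelling interleaved threads later.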

drom · 2016-09-02T05:44:53Z

@rossberg-chromium with a stack machine, parentheses no longer carry any function. Requiring them around every instruction seems like unnecessary decoration. The new stack text format would be better off without parentheses at all. Mixing two styles in the same file format confuses people.

rossberg · 2016-09-02T06:35:06Z

@drom, mixing two styles in one format is exactly what is happening currently, what people have complained about, and which I'm suggesting to get rid of.

AndrewScheidecker · 2016-09-02T12:24:30Z

The goal for the text format in the MVP seems to be a "linear opcode" format to mirror the binary format. This PR makes fundamental changes to the expression syntax, but still includes some form of it. Is this change an incremental step toward removing expressions, or will expressions remain in WAST?

ghost · 2016-09-02T12:49:38Z

@AndrewScheidecker Mozilla appear to have committed to continuing to explore a structured text format, see WebAssembly/design#704 (comment)

There is a difference between not standardizing the text format for the MVP, which I can understand, and ignoring the use case of a structured text format, which I hope will not pass.

This PR has a format for testing purposes, so does it really need to be representative?

kripken · 2016-09-02T17:23:44Z

@AndrewScheidecker: the decision as mentioned in your link is to ship the wasm MVP with a linear list. The spec repo's s-expression language will be a superset of that, but not part of the wasm spec, and not shown in browsers. However, after the MVP, experimentation might lead to further developments and possible spec additions.

AndrewScheidecker · 2016-09-02T17:37:14Z

It makes sense for the ml-proto text format to include non-standard functionality for the test suite. But some subset of a future state of the ml-proto text format will be the MVP standard text format, right? Will that subset include expression trees in addition to linear operator sequences, or will expression trees be a non-standard part of the ml-proto syntax?

eholk · 2016-09-02T17:41:21Z

I'm in favor of requiring bracketing. In the example @rossberg-chromium gave, the bracketed version seems more readable to me (although some more newlines might make the non-bracketed version look better). The parentheses make it clear which things are meant to be taken as a unit, which I think will be especially helpful for instructions that take a number of optional arguments.

kripken · 2016-09-02T17:49:45Z

@AndrewScheidecker: The MVP will only standardize the linear part. In other words, expression trees will be a non-standard thing used in the spec repo.

AndrewScheidecker · 2016-09-02T18:01:12Z

Thank you, that's what I was trying to figure out.

A big part of the value of ml-proto to other implementations right now is its test suite. If everybody implements the non-standard expression tree syntax to access the test suite, it will be de facto standard. So IMO ml-proto should either remove expression trees (doesn't need to happen all at once), or the standard should include expression trees.

sunfishcode · 2016-09-02T18:05:50Z

It seems with the recent change, code like this:

    i32.const 3
    (i32.add (i32.const 4))

is now accepted. I understand why this makes the text format simpler, but it's also surprising. I wouldn't be comfortable with a language that visually looks like it has one structure, but actually has a very different structure, spreading, which is plausible here.
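A small sketch may help show why this is accepted. It assumes that a parenthesized form is flattened by emitting its parenthesized operands first and the operator last, and that bare instructions are emitted in place. The `IMMEDIATES` table, `tokenize`, and `unfold` are hypothetical names for illustration, not the ml-proto parser.

```python
# Illustrative sketch only: flatten a mix of bare and parenthesized
# instructions into one linear sequence. Names and tables are hypothetical.
import re

# Number of immediate operands each opcode takes (tiny assumed subset).
IMMEDIATES = {"i32.const": 1, "i32.add": 0, "i32.eqz": 0, "nop": 0}


def tokenize(text):
    return re.findall(r"[()]|[^\s()]+", text)


def unfold(text):
    tokens = tokenize(text)
    out = []

    def plain(pos):
        # Read an opcode plus its immediates starting at tokens[pos].
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        op = tokens[pos]
        if op not in IMMEDIATES:
            raise SyntaxError(f"unknown instruction {op!r}")
        n = IMMEDIATES[op]
        imms = tokens[pos + 1:pos + 1 + n]
        if len(imms) != n or any(t in ("(", ")") for t in imms):
            raise SyntaxError(f"{op}: expected {n} immediate(s)")
        return (op, *imms), pos + 1 + n

    def folded(pos):
        # tokens[pos] is "(": emit nested operands first, then the operator.
        instr, pos = plain(pos + 1)
        while pos < len(tokens) and tokens[pos] == "(":
            pos = folded(pos)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        out.append(instr)
        return pos + 1

    pos = 0
    while pos < len(tokens):
        if tokens[pos] == "(":
            pos = folded(pos)
        elif tokens[pos] == ")":
            raise SyntaxError("unexpected ')'")
        else:
            instr, pos = plain(pos)
            out.append(instr)
    return out


mixed = unfold("i32.const 3 (i32.add (i32.const 4))")
linear = unfold("i32.const 3 i32.const 4 i32.add")
print(mixed == linear)
```

Under these assumptions both spellings flatten to the same three instructions, so the `i32.add` written with one visible operand takes its other operand from the stack.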

ghost · 2016-09-02T20:22:12Z

@kripken I did not see a decision to 'standardize the linear part' in the MVP, but rather a decision not to have a standard in this area for the MVP. I would not object to also having a linear presentation of the code.

The core issue is surely keeping the design consistent with the use case of a structured presentation.

I don't think it matters if a linear code or some macros are used for testing; even if this becomes a de facto standard format, this is a small matter.

AndrewScheidecker · 2016-09-08T14:39:35Z

Since it sounds like the expression tree syntax will stay in ml-proto, it would be nice if there was a stronger separation between it and the stack machine syntax. For example:

An operator expression is always parenthesized, and expects a fixed number of subexpressions of a type defined by its context (as ml-proto has been to this point).

A sequence expression is a semi-colon separated sequence of subexpressions:

    ((nop); (i32.add (i32.const 3) (i32.const 4)))

It's a little bit more flexible than the existing implicit sequences, since it allows the result of the sequence to be pushed by any subexpression within it, not just the final subexpression:

    ((nop); (i32.add (i32.const 3) (i32.const 4)); (nop))

A stack expression is a non-parenthesized opcode along with only its immediate operands. It may only occur as part of a sequence expression, and must only pop operands that are pushed within the same sequence. @sunfishcode's example above would be:

    (i32.const 3; i32.const 4; i32.add)

A sequence expression can embed a sequence anywhere as a subexpression:

    (i32.eqz (i32.const 3; i32.const 4; i32.add))

Function bodies, loop bodies, block bodies, and if clauses are implicit sequences:

    (loop $break $loop; (i32.eqz (get_local $num)); br_if $break; ...; br $loop)

I think a syntax like that would be a nice incremental step that exposes the additional flexibility of the stack machine, but without crossing wires in our intuition about the expression syntax: implicit stack operands don't cross parentheses.
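The rule that implicit stack operands don't cross parentheses can be sketched as a small checker. This is only one reading of the proposal: the encoding (strings for stack expressions, tuples for operator expressions, lists for sequences), the `ARITY` table, and `values_left` are hypothetical, and every operand is assumed to produce exactly one value.

```python
# Illustrative sketch only: check that stack expressions pop nothing
# pushed outside their own sequence. Encoding and names are hypothetical.

# (values popped, values pushed) for a tiny assumed subset of opcodes.
ARITY = {
    "i32.const": (0, 1),
    "i32.add": (2, 1),
    "i32.eqz": (1, 1),
    "nop": (0, 0),
}


def values_left(seq):
    """Return how many values seq leaves behind, or raise ValueError."""
    depth = 0  # values pushed so far within this sequence only
    for item in seq:
        if isinstance(item, tuple):
            # Operator expression: (head, operand_sequence, ...).
            head, *operands = item
            pops, pushes = ARITY[head.split()[0]]
            if len(operands) != pops:
                raise ValueError(f"({head}): expected {pops} subexpression(s)")
            for operand in operands:
                if values_left(operand) != 1:
                    raise ValueError(f"({head}): operand must yield one value")
            depth += pushes
        else:
            # Stack expression: opcode plus immediates, e.g. "i32.const 3".
            pops, pushes = ARITY[item.split()[0]]
            if pops > depth:
                raise ValueError(f"{item}: pops outside its sequence")
            depth += pushes - pops
    return depth


# (i32.eqz (i32.const 3; i32.const 4; i32.add))
ok = [("i32.eqz", ["i32.const 3", "i32.const 4", "i32.add"])]
print(values_left(ok))

# i32.const 3 (i32.add (i32.const 4)) is rejected under this reading.
mixed = ["i32.const 3", ("i32.add", ["i32.const 4"])]
try:
    values_left(mixed)
except ValueError as err:
    print("rejected:", err)
```

With this separation the mixed form from earlier in the thread fails because the parenthesized `i32.add` has only one subexpression, while the purely linear and purely nested spellings both check.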

ghost · 2016-09-09T12:10:09Z

@AndrewScheidecker A semi-colon is not generally used like that in an s-exp, does not fit the representation. For a stronger separation a stack operator could be added to the s-exp format (not an opcode). For example:

(i32.eqz (stack (i32.const 3) (i32.const 4) (i32.add)))

It raises the problem of what to do if there are more or fewer values than expected by the consumer of the stack operator; probably just a syntax error.

When pick is added it will help to be able to name the stack values, and that is not pretty in the linear stack code. I'd just leave it to @rossberg-chromium for now.

AndrewScheidecker · 2016-09-09T13:56:17Z

> A semi-colon is not generally used like that in an s-exp, does not fit the representation.

I agree, but the format already departs from a standard S-expression syntax. Previously the departure was pretty small: e.g. align=<x>, but as of the merge of this PR the syntax now allows unparenthesized "instructions":

    i32.const 3 i32.const 4 i32.add

> For a stronger separation a stack operator could be added to the s-exp format (not an opcode). For example:
> (i32.eqz (stack (i32.const 3) (i32.const 4) (i32.add)))

That would still not distinguish between an operator expression and a "linear operator". (i32.add) could be a malformed expression or a linear operator that consumes its operands from the results of preceding operators, and you have to look at the context to figure out how to interpret it. I don't care if it's done exactly as I showed (with parentheses), but I do think there should be a clearer distinction than whether it's missing operands.

ghost · 2016-09-09T14:37:21Z

@AndrewScheidecker

> That would still not distinguish between an operator expression and a "linear operator".

We could add another operator expression to flip back, or have operators with only immediate arguments be interpreted as popping all their arguments, but what's the point? It's never going to be a nice format to use except for testing.

The names are still pre WebAssembly#297; once that's finalized I can fix this up (after the sync WebAssembly#323). With this change, test/core/run.py passes on all test cases in simd/.

rossberg added 30 commits July 6, 2016 11:28

Allow nullary blocks

5623731

Allow n-ary loop

c25d4ac

Allow n-ary if

c062222

Allow n-ary func bodies

97c2443

Make return primitive

27f4e1b

Make memory operators primitive

d23a769

Make loop's break label primitive

0a6d707

Stack kernel

b155302

Adapt AST

a706df5

Supprt raw stack syntax

f692d30

Tiny test of stack input

a227c7d

Merge branch 'binary-0xc' into stack

7200155

Adjust negative tests

c7ed3f6

Remove break label from loops

72352ac

Make If block semantics primitive

64d4132

Reunify ASTs

96f233a

Clean up naming conventions

8ca2e77

Remove some code duplication

d8f7bd1

Adapt encoder

2647de7

Adjust text conversion

4dd20ef

Sketch of formal spec

67718f7

Formal rules for calls, returns, locals

21ceedf

Convert calls to small-step

7353bc6

New tests for stack machine

dd5dd72

Dead code is dead to the spec

90e7a47

Don't type unreachable operators; simplify typing

8a04907

Clean up arity checking

a0d777c

Merge branch 'binary-0xc' into stack

1b364d9

Merge branch 'binary-0xc' into stack

da5f917

Tweak S-expr grammar

d7f4d02

rossberg added 3 commits September 2, 2016 11:52

Numeric section names (PR 740)

0acc8b0

Move element section before code section (PR 779)

2482e44

Require END opcode for functions; simplify streams

fb3be97

rossberg added 4 commits September 5, 2016 15:01

Check length & value of var(u)ints

ab65e4d

Abstract length limit

025fa0b

Make variables i32

cddb36b

Remove kernel.ml

9e4b6a9

rossberg added 3 commits September 8, 2016 18:01

Eliminate administrative expressions, in preparation of import/export merge

5440555

Merge branch 'binary-0xc' into stack; resolve many conflicts with imports/exports

de0af62

Fix remaining merge fall-out

f99e5ac

rossberg added 2 commits September 9, 2016 14:20

Merge branch 'mo-0xc' into stack

47420c8

Merge branch 'binary-0xc' into stack

a7a5fef

rossberg merged commit a7a5fef into binary-0xc Sep 9, 2016

rossberg deleted the stack branch May 18, 2017 11:16
