Skip to content

Move to post-order encoding of syntax trees #611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

naturaltransformation
Copy link
Contributor

I would like to work through the rationale and explanation of a post-order encoding of syntax trees that would simplify robust decoding. The goal is to avoid having to lean on recursive descent or shift-reduce parsers.


####Examples:

* Given a simple AST node: I32Add(left: AstNode, right: AstNode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use code quotes?

@ghost
Copy link

ghost commented Mar 16, 2016

Thank you for opening a discussion. Here's a show-stopper I see for post-order encoding. Humble apologies in advance if I lack technical understanding of the proposal.

Let an operator or function be defined that accepts one value and returns none, let it be named $pop. Consider its use in the following AST:

(func
  (call $pop (i32.const 1))
  (call $pop (i32.const 2))
  (call $pop (i32.const 3)))

Let the following be its post-order encoding:

i32.const 1
call pop
i32.const 2
call pop
i32.const 3
call pop

Now how are you going to validate post-order that the following is invalid?

i32.const 3
i32.const 2
i32.const 1
call pop
call pop
call pop

Pre-order has some nice properties for the wasm AST design. Each sub-expression can be validated on it's own. It might cost a shift-reduce algorithm, but can you propose a post-order validation algorithm that is simpler?

@binji
Copy link
Member

binji commented Mar 16, 2016

Seems straightforward if you keep a type stack and push a void type in that case:

valid:
op            type stack
i32.const 1   i32
call pop      void
i32.const 2   void i32
call pop      void void
i32.const 3   void void i32
call pop      void void void

invalid:
i32.const 3   i32
i32.const 2   i32 i32
i32.const 1   i32 i32 i32
call pop      i32 i32 void
call pop      error, pop needs an i32 param, not void
call pop

Maybe there's a nicer way, but this seems reasonable to me.

@sunfishcode
Copy link
Member

A possible variant of that rule: In the MVP, where return value counts are limited to 0 or 1 and there're no projection/destructure operators: after seeing an expression with 0 return types, discard all values from the top of the stack that are defined in the current block (or block-like thing, etc.).

Discussion: If those i32.const nodes were instead nodes with side effects, evaluating the sequence as a stack machine would observe the side effects in a different order than evaluating it as an AST. For example, consider code like this:

  foo(red())
  bar(green())

The post-order encoding might look like this:

  push red()
  foo(pop)
  push green()
  foo(pop)

However, what if someone encoded this?

  push red()
  push green()
  foo(pop)
  foo(pop)

A stack machine would evaluate red() first, then green(). And AST approach would build a tree, connecting green() to the first foo call and red() to the second foo call, so it would evaluate green() first, then red().

The rule above eliminates those cases where the two evaluation strategies would differ, because it requires that if a node has no result value, it doesn't appear between a node that does have a result value and that node's user.

Looking ahead, there are more questions, but with a rule like what's described here, we wouldn't need to answer all of them right away.

@ghost
Copy link

ghost commented Mar 16, 2016

@binji Yes, thank you. So the entries on the parse stack are an object representing the AST, even one with no result values, and it seems natural that there would also be only one entry for multiple values - that's one matter cleared up, thanks.

@sunfishcode I don't think that block-zero-value optimization would work as some operators consume zero value ast nodes, or might? For example

(block (i32.add (i32.const 1) (br 5 (nop)))

block-start
  i32.const 1
  nop   <<< can't discard the stack above??
  br 5
  i32.add
block-end

I have another query about the top level expressions of a block or block-like operator. The AST defines the results of each expression to be discarded, except the last which is returned from the block. With a post-order encoding it would not know the end of each of each of these expressions while it is parsing, rather it seems it would only know the end of each of these top-level expressions at the end of the block when it would check the AST refs remaining on the stack between the start of the block and the end and assign each to one block top level expression?

That seems to explain why a block needs and end marker and can not use a pre-order count of the number of top level block expressions - because it can not count them while parsing them?

I guess the block start marker is needed to support the one pass SSA conversion, otherwise the block could be post-order encoded with an immediate count of the number of expressions to consume from the top of the stack?

With all the block top-level expression being consumed only at the end of the block, it might be a property of this design that the expression stack could grow as large as the block. Might the stack be larger than for a pre-order encoding which could discard values earlier. Might it lead to producers using dead operators to consume values, or to limiting block sizes and breaking them up.

- Recursively write the first operand of the outer Call: 0
- Recursively write the second operand: CALL_START 1 2 I32CONST 3 I32CONST CALL_END
- Recursively the write third operand: 4 I32CONST
- Write the opcode CALL_END
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could some rationale be given for the need to have a call_start marker? I can see the need to know the start of control flow forks, for the single-pass SSA conversion, and for blocks to know the break depths and perhaps other reasons, but no need comes to mind for calls and it seems adequate to just post-order encode them?

@rossberg
Copy link
Member

On 16 March 2016 at 03:30, JSStats [email protected] wrote:

Thank you for opening a discussion. Here's a show-stopper I see for
post-order encoding. Humble apologies in advance if I lack technical
understanding of the proposal.

Let an operator or function be defined that accepts one value and returns
none, let it be named $pop. Consider its use in the following AST:

(func
(call $pop (i32.const 1))
(call $pop (i32.const 2))
(call $pop (i32.const 2)))

Let the following be its post-order encoding:

i32.const 1
call pop
i32.const 2
call pop
i32.const 3
call pop

Now how are you going to validate post-order that the following is invalid?

i32.const 3
i32.const 2
i32.const 1
call pop
call pop
call pop

Because it does not parse. There is no problem. It's just a variant of RPN.
Any old parser can handle your examples with a grammar like

expr ::= i32.const num | expr call id
func ::= expr*

There is no ambiguity, and there is no problem syntactically detecting the
latter example as malformed.

Moreover, there already is a prototype decoder for V8 (
https://codereview.chromium.org/1694303002/) and a WIP one for the spec (
https://github.com/WebAssembly/spec/blob/binary/ml-proto/spec/binary.ml).
Both are currently out of sync, but after GDC we will finalise them.

@AndrewScheidecker
Copy link

Is that example invalid? If the deserializer is just a stack machine then it should be equivalent to the other ordering.

There are a few things that are harder to validate with post-order, though. For example, a branch needs to know the argument type of a target, which is only known from the use of the node introducing the target:

(iXX.add A... (block $break
    (br $break B...)))

In post-order:

A...
begin_block $break
    B...
    br $break
end_block
iXX.add

Bottom up validation here would have to derive a type for the block from its tail expression, or branches that target it. You could make it work, but maybe it would be easier to just explicitly declare the type in the begin_block.

@rossberg
Copy link
Member

On 16 March 2016 at 12:31, Andrew Scheidecker [email protected]
wrote:

Is that example invalid? If the deserializer is just a stack machine then
it should be equivalent to the other ordering.

We are specifying a binary format, not a deserializer. The stack machine is
merely a detail of one possible implementation. So yes, that example is
malformed.

There are a few things that are harder to validate with post-order, though.

For example, a branch needs to know the argument type of a target, which is
only known from the use of the node introducing the target:

(iXX.add A... (block $break
(br $break B...)))

In post-order:

A...
begin_block $break
B...
br $break
end_block
iXX.add

Bottom up validation here would have to derive a type for the block from
its tail expression, or branches that target it. You could make it work,
but maybe it would be easier to just explicitly declare the type in the
begin_block.

Yes, if you do decoding and validation in a single pass, then post-order
forces you to derive types bottom-up, whereas pre-order allows top-down.
It's a slight complication for the implementation, but doable. The cost of
requiring type annotations on every block might be pretty substantial.

@rossberg
Copy link
Member

@AndrewScheidecker, ah, pardon me, now I see what you mean. Yes you are right, the latter example is not actually invalid. Instead, it is the encoding of a different program, namely

(func
  (i32.const 3)
  (i32.const 2)
  (call $pop (call $pop (call $pop (i32.const 1))))
)

@AndrewScheidecker
Copy link

You were right the first time: I meant that you could define the deserializer to make both orderings equivalent. @sunfishcode's comments on evaluation order apply if the i32.const had side-effects, but I was actually thinking that the serialization order would be the same as evaluation order. It implies you have an additional degree of control over evaluation order that you don't have in an AST, but as you observed, it would require changing evaluation semantics rather than just the binary format.

@ghost
Copy link

ghost commented Mar 16, 2016

@rossberg-chromium Thank you. I think your answer is basically the same as @binji and I hope I understand it now. The key for me was that it is not a stack of values (not like the JVM) rather a stack of AST node references, and there is always a single result AST node even if it has zero result values. I also understand the v8 implementation a little better: it records the stack_depth at the start of a block and at the end PopUpTo pops all values back to the initial stack_depth returning the top node and discarding the rest.

@titzer
Copy link

titzer commented Mar 16, 2016

The postorder prototype that I developed for V8 works exactly as Andreas
described; it maintains a stack of AST nodes and does bottom-up
verification. The stack contains types for each AST node. Even nodes that
have type "void" are pushed onto the stack as well. Blocks require a
bracketed representation, where a pre-order "Block" opcode introduces a
block and an "End" opcode ends a block. That ensures that "Br" opcodes can
always be associated with their blocks in a single pass. An in-order
encoded of if-then and if-then-else also works out quite well, being
efficient to decode and can still support single-pass SSA construction.

Postorder is also better for interpretation, since the serialization order
exactly matches the evaluation order. An efficient in-place interpreter is
possible with the help of an auxilliary map of br/if/else opcodes to
bytecode offsets.

I plan to officially propose a switch to postorder along the lines of my V8
prototype soon.

-B

On Wed, Mar 16, 2016 at 1:08 PM, JSStats [email protected] wrote:

@rossberg-chromium https://github.com/rossberg-chromium Thank you. I
think your answer is basically the same as @binji
https://github.com/binji and I hope I understand it now. The key for me
was that it is not a stack of values (not like the JVM) rather a stack of
AST node references, and there is always a single result AST node even if
it has zero result values. I also understand the v8 implementation a little
better: it records the stack_depth at the start of a block and at the end
PopUpTo pops all values back to the initial stack_depth returning the top
node and discarding the rest.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#611 (comment)

@ghost
Copy link

ghost commented Mar 16, 2016

@titzer Thank you. One point that is not yet clear to me in the v8 implementation is if it catches an empty stack wrt the stack_depth at the start of the block. E.g. does it detect the following invalid code:

i32.cont 1
block_start
  i32.eqz
block_end

@titzer
Copy link

titzer commented Mar 16, 2016

On Wed, Mar 16, 2016 at 1:30 PM, JSStats [email protected] wrote:

@titzer https://github.com/titzer Thank you. One point that is not yet
clear to me in the v8 implementation is if it catches an empty stack wrt
the stack_depth at the start of the block. E.g. does it detect the
following invalid code:

i32.cont 1
block_start
i32.eqz
block_end

Ah ha! Thanks for your example. No we don't actually detect that case.
Have to think about that.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#611 (comment)

@titzer
Copy link

titzer commented Mar 24, 2016

The solution to the "expressions crossing block boundaries" problem is to disallow popping expressions across a block boundary. I.e. every block begins with an "empty" stack.

@ghost
Copy link

ghost commented Mar 24, 2016

The latest v8 patch https://codereview.chromium.org/1830663002/

* Recursively write the first subnode: `BLOCK_START 2 I32CONST 3 I32CONST BLOCK_END`
* Recursively the write second subnode: `4 I32CONST`
* Write the opcode `BLOCK_END`


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an example of how high-level if is encoded:

I propose it be encoded as follows:

(expr)
if
(true-exprs)
else
(false-exprs)
end

The else bytecode and clause is optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly, I'll get right on it.

@titzer
Copy link

titzer commented Mar 29, 2016

I was going to start a PR that postorder in its full glory, but I think we can mold this one into the the right shape, or we can land it as is and iterate.

I have a separate PR that proposes bytecodes for the encoding.

@naturaltransformation
Copy link
Contributor Author

@titzer I'm happy and ready to mold this to the consensus design.

* Then write the opcode for `i32.add`, which is sufficient since `i32.add` is a binary opcode.

* Given an if-expression: `if_else(expr: AstNode, true-exprs: AstNode, false_exprs: AstNode)`
* Recursively encode expr
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

Can you also give an example for an if with no else?

@sunfishcode
Copy link
Member

@titzer's solution works for @titzer's proposal; the solution I mentioned above works in the proposal I am preparing.

@sunfishcode
Copy link
Member

With #675 now merged, this is done.

@sunfishcode
Copy link
Member

Or actually, it looks like BinaryEncoding.md still talks about a "Pre-order encoding". Reopening this.

@sunfishcode sunfishcode reopened this Apr 29, 2016
@naturaltransformation
Copy link
Contributor Author

Is there further feedback on this one? Is there perhaps value to adding some additional examples to summarize the above discussion or perhaps just keep the current set of examples?

I would like to work through the rationale and explanation of a post-order encoding of syntax trees that would simplify robust decoding. The goal is to avoid having to lean on recursive descent or shift-reduce parsers.
@naturaltransformation
Copy link
Contributor Author

The BinaryEncoding doc's current explanation of post-order encoding is a good start. This PR does round it out with a coverage of all the exceptions to post-order along with more examples.

@sunfishcode sunfishcode modified the milestone: MVP Jul 8, 2016
@naturaltransformation
Copy link
Contributor Author

It looks like this is a moot point now with all the implementations and other documentation out there, plus the convergence on stack machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants