-
Notifications
You must be signed in to change notification settings - Fork 696
Move to post-order encoding of syntax trees #611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -57,19 +57,48 @@ A single-byte unsigned integer indicating a [value type](AstSemantics.md#types). | |
|
||
# Definitions | ||
|
||
### Post-order encoding | ||
### "Post-order" syntax tree encoding | ||
|
||
Refers to an approach for encoding syntax trees, where each node begins with an identifying binary | ||
sequence, then followed recursively by any child nodes. | ||
|
||
* Examples | ||
* Given a simple AST node: `i32.add(left: AstNode, right: AstNode)` | ||
* First recursively write the left and right child nodes. | ||
* Then write the opcode for `i32.add` (uint8) | ||
WASM syntax trees are encoded in a variation of post-order traversal. Nodes with fixed arity will be encoded in post-order. | ||
Nodes with variadic control-flow opcodes, such as but not limited to `block`s, `loop`s, `if`, and `if_else`, will be bracketed with start and end markers (in the `if` and `if_else` case, this applies to the branches). Other variadic opcodes such as `call` will be post-order without brackets since the arity immediate is sufficient. Thus, for fixed arity and non-control flow variadic opcodes, the binary sequence begins with child subtrees followed by the opcode. | ||
For variadic control-flow opcodes, the binary sequence begins with an opcode-specific start marker, followed by the the child subtrees (encoded in this same variant of post-order), then the opcode, which serves as an end marker. | ||
This encoding is immediately amenable to one-pass decoding with an explicit stack without need for shift-reduce parsing. | ||
|
||
####Examples: | ||
|
||
* Given a simple AST node: `i32.add(left: AstNode, right: AstNode)` | ||
* First recursively write the left and right child nodes. | ||
* Then write the opcode for `i32.add` (uint8) | ||
|
||
* Given a call AST node: `call(args: AstNode[], callee_index: varuint32)` | ||
* First recursively write each argument node. | ||
* Then write the (variable-length) integer `callee_index` (varuint32) | ||
* Finally write the opcode of `Call` (uint8) | ||
|
||
* Given an if_else-expression: `if_else(expr: AstNode, thenExpr: AstNode, elseExpr: AstNode)` | ||
* Recursively encode expr | ||
* Write the opcode for `if` | ||
* Recursively encode thenExpr | ||
* Write the opcode for `else` | ||
* Recursively encode elseExpr | ||
* Write the opcode for `end` | ||
|
||
* Given an if-expression: `if(expr: AstNode, thenExpr: AstNode)` | ||
* Recursively encode expr | ||
* Write the opcode for `if` | ||
* Recursively encode thenExpr | ||
* Write the opcode for `end` | ||
|
||
* Given nested block AST nodes: `block(2, [Block(2, [I32.Const 2, I32.Const 3]), I32.Const 4])` | ||
* First, write the opcode marker `block` | ||
* Write the the arity immediate of the outer block: `2` | ||
* Recursively write the first subnode: `block 2 i32const 3 i32const end` | ||
* Recursively the write second subnode: `4 i32const` | ||
* Write the opcode for `end` | ||
|
||
* Given a call AST node: `call(args: AstNode[], callee_index: varuint32)` | ||
* First recursively write each argument node. | ||
* Then write the (variable-length) integer `callee_index` (varuint32) | ||
* Finally write the opcode of `Call` (uint8) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add an example of how high-level if is encoded: I propose it be encoded as follows: (expr) The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Certainly, I'll get right on it. |
||
# Module structure | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it!
Can you also give an example for an if with no else?