Streamline tableswitch notation #443

Closed

rossberg
Member

@rossberg rossberg commented Nov 3, 2015

As discussed in #427, this streamlines the notation of tableswitch such that it is written with just one table instead of two and avoids auxiliary labels. It does not change what can be expressed. Advantages of this notation are:

  • Easier to write & read.
  • More amenable to specifying semantics.
  • Does not require a new form of label with special scoping rules.
  • Avoids questions like the one raised in #439 ("tableswitch case labels?").

This matches what is currently implemented in WebAssembly/spec#153.

@rossberg rossberg mentioned this pull request Nov 3, 2015
@sunfishcode
Member

This has the same bug I found in the earlier version here: case and default can apparently fall through to case_br and default_br, which the current tableswitch semantics does not allow.

Also, what is the intended relationship between the s-expression language and the binary format? The current tableswitch description can map fairly directly to a simple and efficient binary encoding. The structure proposed here in direct translation seems like it would require special opcode bytes for differentiating between the different kinds of targets inside the body. If the idea is that the binary encoding would rearrange the parts into a different structure to simplify the encoding, then we have different assumptions.

My assumption is that the place for streamlined syntax is either the official text syntax or a macro-assembler layer, with the purpose of the s-expression syntax being to make the binary format's structure and concepts readable to humans working at that level. This is because the binary format is the thing that most producers and consumers will be communicating through in practice, so its structure and concepts are things that most producers and consumers will need to be aware of, so humans working at this level will need to be aware of them too.

@rossberg
Member Author

rossberg commented Nov 3, 2015

Right, a fallthrough to a _br target would just continue to fall through until it hits a regular target. But indeed, the desugaring in my spec PR does not implement that correctly. Another option would be to require all _br targets to come first.
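
The intended fallthrough behaviour described above can be sketched as a small model. This is only an illustration under assumptions: the `Regular` and `Br` classes, the `breaks` flag, and the `run` and `br_targets_first` helpers are hypothetical names invented here, not part of the spec PR or the design document.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Regular:
    """Hypothetical stand-in for a target with a body (case / default)."""
    name: str
    breaks: bool = False  # assumption: True if the body branches out of the switch


@dataclass
class Br:
    """Hypothetical stand-in for a branch-only target (case_br / default_br)."""
    label: str


Target = Union[Regular, Br]


def run(targets: List[Target], selected: int) -> List[str]:
    """Return a trace of what executes when entry `selected` is dispatched to."""
    first = targets[selected]
    if isinstance(first, Br):
        # Selected directly by the switch: the branch is taken.
        return [f"br {first.label}"]
    trace = []
    for target in targets[selected:]:
        if isinstance(target, Br):
            # Reached only by fallthrough: skipped, control keeps falling through.
            continue
        trace.append(f"body {target.name}")
        if target.breaks:
            break
    return trace


def br_targets_first(targets: List[Target]) -> bool:
    """Check the alternative restriction: all _br targets precede all regular ones."""
    seen_regular = False
    for target in targets:
        if isinstance(target, Regular):
            seen_regular = True
        elif seen_regular:
            return False
    return True
```

With the restriction checked by `br_targets_first`, a regular body can never fall through into a _br target, so the skipping rule in `run` would never be exercised.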

Text syntax or a macro assembler do not solve the main problem with the current form, namely the significant (and unnecessary) semantic complication it introduces at the AST level. The prose glosses over that entirely, but the way the internal case labels would work is entirely different from other labels. In fact, it violates the very statement (at least as written) in the following paragraph, about labels having to be bound by enclosing constructs. Pretending they are regular labels requires partitioning the label namespace into two kinds. The seeming notational liberty also raises questions like #439. Wouldn't you agree it's preferable to avoid all these issues by construction?

@sunfishcode
Member

I do agree. In fact, all these issues are avoided by construction in the binary format that I expect this design to translate into. I initially overlooked the issue in #439 because I was mainly thinking in terms of the binary format and how it will be decoded, and this problem doesn't arise there (unless we deliberately design it in, but I don't think we need to).

I do agree that the current wording uses "label" in a confusing way, and can be generally simplified. I've now submitted #444 to propose a simpler, clearer wording which also avoids #439 by construction. It does so at the cost of introducing a new concept, but it's a simple concept that actually has an even closer affinity to what I'm envisioning in the binary encoding.

What is the intended relationship between the s-expression language and the binary format?

@rossberg
Member Author

rossberg commented Nov 4, 2015

The way I view it, there are 3 relevant levels of code format:

  1. The AST. Literally, an abstract tree format. Should be designed to match the semantic structure best. That is, optimised towards making the semantics easy to express.
  2. The binary format. A concrete linear encoding of the AST. As such, equivalent to the AST only up to non-trivial isomorphisms. Should be designed to be compact and support deserialisation and code generation best.
  3. Generated code. An execution format, normally generated from the binary. Optimised for fast execution, obviously. Will again have (vastly) different structure.

S-expressions are merely a direct textual representation of the AST, so should mirror its structure.
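
To make the first two levels concrete, here is a minimal sketch with a toy AST, an s-expression printer that mirrors the tree directly, and a linear encoding of the same tree. Everything in it is an assumption for illustration: the `Const` and `Add` nodes, the opcode numbers, and the pre-order layout are made up and are not WebAssembly's actual operators or binary format.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Const:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Add]

# Made-up opcodes for this sketch only.
OP_CONST, OP_ADD = 0, 1


def to_sexpr(e: Expr) -> str:
    """Level 1 as text: the s-expression has the same shape as the tree."""
    if isinstance(e, Const):
        return f"(const {e.value})"
    return f"(add {to_sexpr(e.left)} {to_sexpr(e.right)})"


def encode(e: Expr) -> List[int]:
    """Level 2: a linear (pre-order) encoding of the tree."""
    if isinstance(e, Const):
        return [OP_CONST, e.value]
    return [OP_ADD] + encode(e.left) + encode(e.right)


def decode(code: List[int], pos: int = 0) -> Tuple[Expr, int]:
    """Rebuild the tree; returns the node and the position after it."""
    op = code[pos]
    if op == OP_CONST:
        return Const(code[pos + 1]), pos + 2
    if op == OP_ADD:
        left, pos = decode(code, pos + 1)
        right, pos = decode(code, pos)
        return Add(left, right), pos
    raise ValueError(f"unknown opcode {op}")
```

This toy encoding happens to be node-for-node. The point made in the thread is that a real binary format is free to rearrange parts, as long as decoding still recovers an equivalent AST.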

With respect to switch, the presence of a concrete jump table is primarily relevant for (3) -- and of course, its semantics is designed to enable that. A (more abstract form of) table probably also is the best choice for (2) and even (1). But that doesn't mean that the isomorphism between them has to be pointwise.

@sunfishcode
Member

> The AST. Literally, an abstract tree format. Should be designed to match the semantic structure best. That is, optimised towards making the semantics easy to express.

One way to measure the match of the semantic structure is the degree to which it achieves its results by construction, which is mentioned above. An explicit table structure is dense and zero-based by construction. Specifying the default target as an attribute of the tableswitch makes the requirement that there be exactly one default target correct by construction. The patch here implements these requirements as separate rules, so by that measure, the patch is a step backwards in terms of matching the semantic structure, particularly relative to the revised wording in #444.
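
The "by construction" argument can be illustrated with two hypothetical data models. `TableForm` holds an explicit table plus a single default attribute, so no validation is needed. The flat entry list checked by `validate` is only an assumed stand-in for a form whose density and single-default requirements must be stated as separate rules; neither shape is taken from the patch or from #444.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TableForm:
    """Explicit table: position i is the target for index i, so the
    table is dense and zero-based without any extra rule."""
    table: List[str]
    default: str  # a single attribute: exactly one default by construction

    def target(self, index: int) -> str:
        if 0 <= index < len(self.table):
            return self.table[index]
        return self.default


# Assumed flat form: (kind, key, target) entries, kind is "case" or "default".
Entry = Tuple[str, Optional[int], str]


def validate(entries: List[Entry]) -> List[str]:
    """Separate rules that the flat form needs and the table form does not."""
    errors = []
    keys = sorted(key for kind, key, _ in entries if kind == "case")
    if keys != list(range(len(keys))):
        errors.append("case keys are not dense and zero-based")
    defaults = sum(1 for kind, _, _ in entries if kind == "default")
    if defaults != 1:
        errors.append("expected exactly one default target")
    return errors
```

A `TableForm` value cannot express a gap, a duplicate key, or a missing default, whereas the flat list can express all three and relies on `validate` to reject them.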

@rossberg
Member Author

Closing this, since there is no consensus and I found a way to spec tableswitch with only desugaring and no complication to the kernel.

@rossberg rossberg closed this Nov 13, 2015
@lukewagner lukewagner deleted the streamline-tableswitch branch April 29, 2016 13:41