Change alias sugar, remove local indices, give module/instance types fresh index spaces #26
I added a second commit to this PR which addresses comments in #21. I had actually already accidentally started making this change in the code samples of the first commit, using |
Looks good to me! I've got one question on the binary encoding of aliases but otherwise everything here seems pretty reasonable to me.
One slight gotcha to the new alias syntax is that we have both:
    (alias (func $funcname $instancename "exportname"))
    (alias (func $instancename "exportname"))
so that when parsing this you can't eagerly take the first identifier as the name of the function. You need to look ahead to see if there's another identifier (or |
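A minimal sketch of the lookahead described above, written in Python for illustration only. The tokenizer, the function name `parse_func_alias`, and the restriction to `$`-identifiers (no numeric indices) are assumptions made for this example, not part of any existing parser.

```python
import re

# Hypothetical tokenizer: parens, double-quoted strings, and bare atoms.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

def parse_func_alias(src):
    """Parse `(alias (func [$name] $instance "export"))`.

    Returns (name, instance, export); name is None when omitted.
    Assumption: only `$`-identifiers are handled, not numeric indices.
    """
    toks = TOKEN.findall(src)
    if toks[:4] != ["(", "alias", "(", "func"]:
        raise SyntaxError("expected `(alias (func`")
    rest = toks[4:]
    # The first identifier cannot be taken as the function name eagerly:
    # it is the name only if a second identifier follows it.
    if len(rest) >= 2 and rest[0].startswith("$") and rest[1].startswith("$"):
        name, rest = rest[0], rest[1:]
    else:
        name = None
    if (len(rest) != 4
            or not rest[0].startswith("$")
            or not rest[1].startswith('"')
            or rest[2:] != [")", ")"]):
        raise SyntaxError("expected `$instance \"export\"))`")
    return name, rest[0], rest[1][1:-1]
```

Both forms share the `(alias (func` prefix, so the decision is made by peeking one token past the first identifier.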
Hmm, yes, I guess that was a nice thing about the |
I think I'd have a slight preference for |
This commit implements the sugar syntax proposed in WebAssembly/module-linking#26. The change here was to introduce a new `ItemRef` construct into the `wast` crate which represents a parenthesized reference to an item, either in the current module or a different module (e.g. optional `outer` and optional export names). Additionally an `IndexOrRef` type was added for parsing references which are either a bare index or an explicit reference to an item. All existing references to items were updated to use `ItemRef` where necessary. Additionally all locations which previously took an `Index` now store `ItemRef` and are parsed as `IndexOrRef`.
This continues to sync this implementation with WebAssembly/module-linking#26 by implementing the ability for outer aliases to have an arbitrary depth listed on them.
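To illustrate the idea of a reference that is either a bare index or a parenthesized item reference, here is a rough Python sketch. It is not the `wast` crate's API: the class fields, the function name `parse_index_or_ref`, and the tokenizer are hypothetical, and the accepted forms are limited to the ones that appear in this thread (`$f`, `(func $instance "export")`, `(type outer $PARENT $Type)`).

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical tokenizer: parens, double-quoted strings, and bare atoms.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

@dataclass
class ItemRef:
    kind: Optional[str]           # e.g. "func" or "type"; None for a bare index
    idx: str                      # identifier or numeric index, kept as text
    outer: Optional[str] = None   # enclosing module for `outer` references
    exports: List[str] = field(default_factory=list)

def parse_index_or_ref(src):
    """Accept a bare index or a parenthesized reference (illustrative only)."""
    toks = TOKEN.findall(src)
    if not toks:
        raise SyntaxError("empty input")
    if toks[0] != "(":
        # Bare index form, e.g. `$f` or `0`.
        if len(toks) != 1:
            raise SyntaxError("unexpected tokens after index")
        return ItemRef(kind=None, idx=toks[0])
    if len(toks) < 4 or toks[-1] != ")":
        raise SyntaxError("malformed parenthesized reference")
    kind, rest = toks[1], toks[2:-1]
    if rest[0] == "outer":
        # `(kind outer <module> <index>)`
        if len(rest) != 3:
            raise SyntaxError("expected `outer <module> <index>`")
        return ItemRef(kind=kind, idx=rest[2], outer=rest[1])
    # `(kind <index> "export"*)`
    exports = []
    for tok in rest[1:]:
        if not tok.startswith('"'):
            raise SyntaxError("expected a quoted export name")
        exports.append(tok[1:-1])
    return ItemRef(kind=kind, idx=rest[0], exports=exports)
```

Storing one reference type everywhere and accepting the bare index as a degenerate case keeps call sites uniform, which is the design the commit message describes.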
Hmm, yes, on second thought, I guess that seems like the best. I'll switch back and update the original comment to match. |
(FWIW, I'll be out next week.) |
* Implement new alias shorthand sugar for module linking

This commit implements the sugar syntax proposed in WebAssembly/module-linking#26. The change here was to introduce a new `ItemRef` construct into the `wast` crate which represents a parenthesized reference to an item, either in the current module or a different module (e.g. optional `outer` and optional export names). Additionally an `IndexOrRef` type was added for parsing references which are either a bare index or an explicit reference to an item. All existing references to items were updated to use `ItemRef` where necessary. Additionally all locations which previously took an `Index` now store `ItemRef` and are parsed as `IndexOrRef`.

* Implement relative-depth aliases

This continues to sync this implementation with WebAssembly/module-linking#26 by implementing the ability for outer aliases to have an arbitrary depth listed on them.

* More rename of parent => outer
Latest commit from WebAssembly/module-linking#26
Don't have an optional field name and additionally remove the surrounding `(arg ..)` since it's not present in WebAssembly/module-linking#26
This commit updates the various tooling used by wasmtime which has new updates to the module linking proposal. This is done primarily to sync with WebAssembly/module-linking#26. The main change implemented here is that wasmtime now supports creating instances from a set of values, not just from instantiating a module. Additionally, subtyping of modules with respect to imports is now properly handled by desugaring two-level imports to imports of instances. A number of small refactorings are included here as well, but most of them are in accordance with the changes to `wasmparser` and the updated binary format for module linking.
Thanks, looks good, up to some syntactic nits:
WDYT? |
Heh, for the alias syntax, I also proposed switching to exactly that at some point, but then thought maybe it looked weird and talked myself out of it. And for the instantiate syntax, I was on the fence whether the extra verbosity was justified, so it's good to have a second opinion. SGTM on both changes. @alexcrichton ? |
Seems reasonable to me! |
Ultra-nit: is |
Btw, with this syntax for aliases, we could also allow the analogous "inverted shorthand" for aliases that we already have for imports, e.g., writing
Yeah, |
Ah right, the "inverted shorthand" syntax makes sense. And fair point about "arg" corresponding to a "param" vs. "import". What about "export"? |
Hm, an export is something a module provides to the outside. But here it's the inverse direction, the outside client supplying something to a module. The difference comes from Wasm's model requiring an intermediary to connect imports and exports, and that imports can be provided through values that aren't exports of another module. So export as opposite end of import does not seem right. |
Fair enough, |
Following the final updates on WebAssembly/module-linking#26
Getting around to implementing this, FWIW inline aliases are particularly tricky on modules I think. For example I think this should parse:
    (module
      (module (alias outer 0 0))
    )
but when parsing the inner module after seeing For inline imports you simply need to see It's not really the end of the world but I think I'll probably avoid implementing inline |
Ooh, good point. Just to confirm, the ambiguity is between the "inverted shorthand" recently added and a normal alias definition. This seems like a complete ambiguity, not just an issue of lookahead. (For example, validity aside, is your example a module-containing-a-module-containing-an-alias or a module-containing-an-alias?) @rossberg Should we just remove the inverted shorthand? (On a side note, setting aside the inverted shorthand ambiguity, I think the |
Hm, is there really an ambiguity? Let me know if I'm overlooking something, but one form is
while the other would be
This is disambiguated at the first token after the indices. Combining it with the sugar for inline aliases shouldn't change that, I believe? Wrt parsing, this difference is LALR(1), since neither form is a prefix of the other, and their common prefix has the exact same structure (unless I'm missing something). So no problem for parser generators. And even a recursive descent parser can just parse the common prefix and then disambiguate, which is sort of common. Would that be difficult here? |
The ambiguity I'm worried about is not the (module (alias $instance "x")) where this can be nested arbitrarily deep as: (module (alias (instance $other_instance "other_export") "x")) and it can keep repeating. From a LALR perspective I think it's fine but at least so far the rest of the grammar is quite easy to parse with a recursive descent parser where you typically figure out what to parse next by peeking at most 3 or so tokens ahead. This would involve, however, parsing perhaps an arbitrary number of tokens ahead. |
I realize though that this shorthand is always deterministically an invalid wasm module if it was actually an I mostly wanted to just bring this up though b/c it's sort of a weird thing about parsing. It's not necessarily the weirdest thing though. |
Oh, sorry, I totally missed the point×2. Could it make sense to only allow an |
Yeah I think that'd fix it as well, albeit it's a bit weird in the grammar but as a shorthand that's not really the end of the world |
Yes, but why does this matter? The recursion is still the same for both productions, so you can safely perform it before deciding which construct it is. In general, recursive descent has no issues with overlapping productions of the form
even if A, B, C are arbitrary constructs, as long as D and E can be distinguished by their initial token. Am I missing something that makes that impossible here? |
To be clear I'm not saying anything is impossible, just some parts are harder than before. I am also not aware of a precise definition of a recursive descent parser, so to clarify what I mean is that the parser in the I'm not talking about an ambiguity in parsing, I believe you're right in that there's zero ambiguity. My point is that this is making parsing more difficult in one strategy of implementing a parser relative to how difficult it was before. |
Recursive descent roughly means that you need an LL(1) grammar -- or one that you can transform into LL(1) easily enough. In practice, very few grammars implemented in recursive descent parsers are LL(1) as given, but they are implicitly transformed into one. Taking my earlier example,
this can be turned into LL(1) via
which corresponds to how a recursive descent parser would handle X without needing special look-ahead: the function to parse X first parses A,B,C and then makes a decision to parse either D or E -- essentially inlining the call to the parsing function for the auxiliary X_tail. The Wasm spec could make these transformations explicit, but they are routine, so it shouldn't be necessary to obfuscate the spec for it. As you say, a similar kind of transformation is already needed in other places -- the only difference here is that you are parsing not just terminals but also non-terminals in the prefix, but usually that shouldn't be any harder. |
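The left-factoring described above can be sketched in Python, assuming overlapping productions of the shape `X ::= A B C D | A B C E` rewritten as `X ::= A B C X_tail` with `X_tail ::= D | E`. The concrete toy grammar here is an assumption chosen for illustration: `A` is a non-terminal of unbounded length (`"a"` wrapped in any number of parentheses), and `B`, `C`, `D`, `E` are the single tokens `b`, `c`, `d`, `e`.

```python
class Parser:
    """Recursive descent over a list of tokens (toy grammar, illustrative only)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        # One token of lookahead; None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_a(self):
        # A ::= "a" | "(" A ")"   (a non-terminal of arbitrary length)
        if self.peek() == "(":
            self.expect("(")
            depth = 1 + self.parse_a()
            self.expect(")")
            return depth
        self.expect("a")
        return 0

    def parse_x(self):
        # Common prefix A B C is parsed once, before choosing a production.
        depth = self.parse_a()
        self.expect("b")
        self.expect("c")
        # X_tail ::= D | E, inlined: a single token decides which form X was.
        tail = self.peek()
        if tail == "d":
            kind = "D"
        elif tail == "e":
            kind = "E"
        else:
            raise SyntaxError(f"expected 'd' or 'e', found {tail!r}")
        self.pos += 1
        if self.peek() is not None:
            raise SyntaxError("trailing tokens")
        return kind, depth


def parse_x(src):
    # Tokens are whitespace-separated in this toy example.
    return Parser(src.split()).parse_x()
```

For example, `parse_x("( ( a ) ) b c d")` consumes the nested `A` first and only then inspects the next token, so no unbounded lookahead is needed even though the prefix contains a non-terminal.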
This PR addresses issues #19 and #20 by:
`alias` and `instantiate`; instead, `<name>`s are used: `(alias $f (func $instance "export"))` with inline sugar `(func $instance "export")`
`(instance $i (instantiate $M "foo" (instance $foo) "bar" (func $bar)))`
`<name> <instance-arg>` because I couldn't think of a concrete reason to introduce a more verbose syntax that explicitly grouped them (e.g., `(arg <name> <instance-arg>)`), but I'm open to hearing other thoughts.
`(alias $t (type outer $PARENT $Type))` with inline sugar `(type outer $PARENT $Type)`
Although it isn't explicitly called out in the current explainer, these changes remove the need to add `$identifier` names to module/instance types. Updating all the code examples, I found this a nice change, removing what always had felt like a superfluous name for each imported module's/instance's export.