Change alias sugar, remove local indices, give module/instance types fresh index spaces #26

lukewagner · 2020-12-21T18:33:27Z

This PR addresses issues #19 and #20 by:

removing the use of local indices (into local module/instance typedefs) in alias and instantiate; instead, <name>s are used:
- (alias $f (func $instance "export")) with inline sugar (func $instance "export")
- (instance $i (instantiate $M "foo" (instance $foo) "bar" (func $bar)))
  - Note: I used a flat list of <name> <instance-arg> because I couldn't think of a concrete reason to introduce a more verbose syntax that explicitly grouped them (e.g., (arg <name> <instance-arg>)), but I'm open to hearing other thoughts.
renaming "parent" to "outer" alias
adding an explicit outer module to outer aliases that can be any enclosing module:
- (alias $t (type outer $PARENT $Type)) with inline sugar (type outer $PARENT $Type)

Although it isn't explicitly called out in the current explainer, these changes remove the need to add $identifier names to module/instance types. Updating all the code examples, I found this a nice change, removing what always had felt like a superfluous name for each imported module's/instance's export.

lukewagner · 2020-12-21T23:23:56Z

I added a second commit to this PR which addresses comments in #21. I had actually already accidentally started making this change in the code samples of the first commit, using outer aliases unnecessarily in module/instance imports. The second commit establishes the rules and, most significantly, changes the encoding of module/instance types in Binary.md.

alexcrichton

Looks good to me! I've got one question on the binary encoding of aliases but otherwise everything here seems pretty reasonable to me.

proposals/module-linking/Binary.md

proposals/module-linking/Explainer.md

alexcrichton · 2021-01-07T21:50:28Z

One slight gotcha to the new alias syntax is that we have both:

(alias (func $funcname $instancename "exportname"))
(alias (func $instancename "exportname"))

so that when parsing this you can't eagerly take the first identifier as the name of the function. You need to look forward ahead to see if there's another identifier (or (instance ...)) to figure out whether the first identifier is the name of the function rather than the instance specifier. Not really the end of the world one way or another, but wanted to be sure to note it.

lukewagner · 2021-01-07T23:05:49Z

Hmm, yes, I guess that was a nice thing about the (alias $f (func $instance "export")) syntax; it kept things quite separate. At the risk of more verbosity, we could have: (alias (func $f export $instance "export"))), using export|outer as a discriminant for what comes next? I don't have a strong opinion here.

alexcrichton · 2021-01-08T15:23:32Z

I think I'd have a slight preference for (alias $f (...)) to keep things separate, although not particularly strongly. One small bonus is that it mirrors the sugar elsewhere to inject aliases.

This commit Implements the sugar syntax proposed in WebAssembly/module-linking#26. The change here was to introduce a new `ItemRef` construct into the `wast` crate which represents a parenthesized reference to an item, either in the current module or a different module (e.g. optional `outer` and optional export names). Additionally an `IndexOrRef` type was added for parsing references which are either a bare index or an explicit reference to an item. All existing references to items were updated to use `ItemRef` where necessary. Additionally all locations which previously took an `Index` now store `ItemRef` and are parsed as `IndexOrRef`.

This continues to sync this implementation with WebAssembly/module-linking#26 by implementing the ability for outer aliases to have an arbitrary depth listed on them.

lukewagner · 2021-01-08T22:49:17Z

Hmm, yes, on second thought, I guess that seems like the best. I'll switch back and update the original comment to match.

lukewagner · 2021-01-08T22:51:08Z

(FWIW, I'll be out next week.)

This commit Implements the sugar syntax proposed in WebAssembly/module-linking#26. The change here was to introduce a new `ItemRef` construct into the `wast` crate which represents a parenthesized reference to an item, either in the current module or a different module (e.g. optional `outer` and optional export names). Additionally an `IndexOrRef` type was added for parsing references which are either a bare index or an explicit reference to an item. All existing references to items were updated to use `ItemRef` where necessary. Additionally all locations which previously took an `Index` now store `ItemRef` and are parsed as `IndexOrRef`.

This continues to sync this implementation with WebAssembly/module-linking#26 by implementing the ability for outer aliases to have an arbitrary depth listed on them.

* Implement new alias shorthand sugar for module linking This commit Implements the sugar syntax proposed in WebAssembly/module-linking#26. The change here was to introduce a new `ItemRef` construct into the `wast` crate which represents a parenthesized reference to an item, either in the current module or a different module (e.g. optional `outer` and optional export names). Additionally an `IndexOrRef` type was added for parsing references which are either a bare index or an explicit reference to an item. All existing references to items were updated to use `ItemRef` where necessary. Additionally all locations which previously took an `Index` now store `ItemRef` and are parsed as `IndexOrRef`. * Implement relative-depth aliases This continues to sync this implementation with WebAssembly/module-linking#26 by implementing the ability for outer aliases to have an arbitrary depth listed on them. * More rename of parent => outer

Latest commit from WebAssembly/module-linking#26

Don't have an optional field name and additionally remove the surrounding `(arg ..)` since it's not present in WebAssembly/module-linking#26

This commit updates the various tooling used by wasmtime which has new updates to the module linking proposal. This is done primarily to sync with WebAssembly/module-linking#26. The main change implemented here is that wasmtime now supports creating instances from a set of values, nott just from instantiating a module. Additionally subtyping handling of modules with respect to imports is now properly handled by desugaring two-level imports to imports of instances. A number of small refactorings are included here as well, but most of them are in accordance with the changes to `wasmparser` and the updated binary format for module linking.

rossberg · 2021-02-10T08:20:37Z

Thanks, looks good, up to some syntactic nits:

For aliases, I suggest (alias $instance "func" (func $f)) and (alias outer $t (type $t)), respectively. That would be more consistent with import syntax and its binding position. In particular, we currently have that a binding occurrence for a symbolic id is always directly preceded by a keyword denoting its namespace, and it would be nice to not break that.
For instantiate, I'd prefer something like (instantiate $M (import "foo" (instance $foo)) (import "bar" (func $bar))) -- the wat form does not have flattened-list-of-tuples anywhere else, and IMO they are somewhat out of place, given that the core wat format is meant to reflect the structure of the AST.

WDYT?

lukewagner · 2021-02-10T17:07:51Z

Heh, for the alias syntax, I also proposed switching to exactly that at some point, but then thought maybe it looked weird and talked myself out of it. And for the instantiate syntax, I was on the fence whether the extra verbosity was justified, so it's good to have a second opinion. SGTM on both changes. @alexcrichton ?

alexcrichton · 2021-02-10T17:27:28Z

Seems reasonable to me!

lukewagner · 2021-02-10T18:03:36Z

Ultra-nit: is import the best token for (import "foo" (instance $foo))? How about arg? (It's shorter :)

rossberg · 2021-02-11T08:07:15Z

Btw, with this syntax for aliases, we could also allow the analogous "inverted shorthand" for aliases that we already have for imports, e.g., writing (func $f (alias $m "f")).

Ultra-nit: is import the best token for (import "foo" (instance $foo))? How about arg? (It's shorter :)

Yeah, import isn't ideal. FWIW, I keep struggling with there being no good dual term to "import" in our model (supply? provide?). I would avoid "arg", though, since we do not call imports "params" and generally don't invoke the modules-are-functions notion elsewhere.

lukewagner · 2021-02-11T17:23:52Z

Ah right, the "inverted shorthand" syntax makes sense.

And fair point about "arg" corresponding to a "param" vs. "import". What about "export"?

rossberg · 2021-02-16T10:31:38Z

Hm, an export is something a module provides to the outside. But here it's the inverse direction, the outside client supplying something to a module.

The difference comes from Wasm's model requiring an intermediary to connect imports and exports, and that imports can be provided through values that aren't exports of another module. So export as opposite end of import does not seem right.

lukewagner · 2021-02-16T16:00:22Z

Fair enough, import it is.

Following the final updates on WebAssembly/module-linking#26

alexcrichton · 2021-02-17T20:20:23Z

Getting around to implement this, FWIW inline aliases are particularly tricky on modules I think. For example I think this should parse:

(module
  (module (alias outer 0 0))
)

but when parsing the inner module after seeing (alias at that point the parser needs to figure out if that's an alias item or whether it's an inline alias annotation. This is the same problem with inline import statements as well (I think I opened up an issue on this awhile back), but the difference is that import could be solved with simple lookahead whereas alias can't.

For inline imports you simply need to see (import "string" "string"?) and you know it's an inline import, but for alias annotations you could see any (alias (instance $i "x") "y") which can also be arbitrarily nested. This means that to parse an inline alias annotation on a module you'd need infinite lookahead in theory I think?

It's not really the end of the world but I think I'll probably avoid implementing inline alias annotations for modules for now because of this.

lukewagner · 2021-02-18T03:49:20Z

Ooh, good point. Just to confirm, the ambiguity is between the "inverted shorthand" recently added and a normal alias definition. This seems like a complete ambiguity, not just an issue of lookahead. (For example, validity aside, is your example a module-containing-a-module-containing-an-alias or a module-containing-an-alias?)

@rossberg Should we just remove the inverted shorthand?

(On a side note, setting aside the inverted shorthand ambiguity, I think the (alias outer 0 0) in the example would fail validation b/c it would be a self reference: outer aliases can only refer to preceding modules. (module (module) (module (alias outer 0 0))) would be fine.)

rossberg · 2021-02-18T08:44:46Z

Hm, is there really an ambiguity? Let me know if I'm overlooking something, but one form is

(module (alias outer $i $j))

while the other would be

(module
  (alias outer $i $j (func))
)

This is disambiguated at the first token after the indices. Combining it with the sugar for inline aliases shouldn't change that, I believe?

Wrt parsing, this difference is LALR(1), since neither form is a prefix of the other, and their common prefix has the exact same structure (unless I'm missing something). So no problem for parser generators. And even a recursive descent parser can just parse the common prefix and then disambiguate, which is sort of common. Would that be difficult here?

alexcrichton · 2021-02-18T15:32:26Z

The ambiguity I'm worried about is not the outer form which has a deterministically small number of tokens, but rather the alternate form of:

(module (alias $instance "x"))

where this can be nested arbitrarily deep as:

(module (alias (instance $other_instance "other_export") "x"))

and it can keep repeating.

From a LALR perspective I think it's fine but at least so far the rest of the grammar is quite easy to parse with a recursive descent parser where you typically figure out what to parse next by peeking at most 3 or so tokens ahead. This would involve, however, parsing perhaps an arbitrary number of tokens ahead.

alexcrichton · 2021-02-18T15:34:55Z

I realize though that this shorthand is always deterministically an invalid wasm module if it was actually an alias item inside a nested module since there can't be any instances defined, so a fix could be that if alias ( is seen it's deduced as inline annotation. With alias outer ... it would see if there was any token after the indices to deduce whether it's an inline annotation or an inline module.

I mostly wanted to just bring this up though b/c it's sort of a weird thing about parsing. It's not necessarily the weirdest thing though.

lukewagner · 2021-02-18T20:52:29Z

Oh, sorry, I totally missed the point×2. Could it make sense to only allow an $instance identifier in the parse rule for the inverted abbreviation (so not an inline instance alias)? (Which, I expect, is what @alexcrichton has now implemented in practice?)

alexcrichton · 2021-02-18T21:07:47Z

Yeah I think that'd fix it as well, albeit it's a bit weird in the grammar but as a shorthand that's not really the end of the world

rossberg · 2021-02-18T22:05:51Z

The ambiguity I'm worried about is not the outer form which has a deterministically small number of tokens, but rather the alternate form of:
[...]
where this can be nested arbitrarily deep as:
(module (alias (instance $other_instance "other_export") "x"))

Yes, but why does this matter? The recursion is still the same for both productions, so you can safely perform it before deciding which construct it is. In general, recursive descent has no issues with overlapping productions of the form

X ::= A B C D
X ::= A B C E

even if A, B, C are arbitrary constructs, as long as D and E can be distinguished by their initial token. Am I missing something that makes that impossible here?

alexcrichton · 2021-02-19T20:23:20Z

To be clear I'm not saying anything is impossible, just some parts are harder than before. I am also not aware of a precise definition of a recursive descent parser, so to clarify what I mean is that the parser in the wast Rust crate works by basically looking at the next token and committing to what to parse at that point. There's a few cases it needs to look maybe 2-3 tokens ahead. This is the first case in the parser where it would have to look ahead an arbitrary number of tokens.

I'm not talking about an ambiguity in parsing, I believe you're right in that there's zero ambiguity. My point is that this is making parsing more difficult in one strategy of implementing a parser relative to how difficult it was before.

rossberg · 2021-02-23T10:42:57Z

Recursive descent roughly means that you need an LL(1) grammar -- or one that you can transform into LL(1) easily enough.

In practice, very few grammars implemented in recursive descent parsers are LL(1) as given, but they are implicitly transformed into one. Taking my earlier example,

X ::= A B C D
X ::= A B C E

this can be turned into LL(1) via

X ::= A B C X_tail
X_tail ::= D | E

which corresponds to how a recursive descent parser would handle X without needing special look-ahead: the function to parse X first parses A,B,C and then makes a decision to parse either D or E -- essentially inlining the call to the parsing function for the auxiliary X_tail.

The Wasm spec could make these transformations explicit, but they are routine, so it shouldn't be necessary to obfuscate the spec for it. As you say, a similar kind of transformation is already needed in other places -- the only difference here is that you are parsing not just terminals but also non-terminals in the prefix, but usually that shouldn't be any harder.

Change alias sugar, remove indices from aliases/instantiate (#19 #20)

ff45ea4

lukewagner requested a review from alexcrichton December 21, 2020 18:33

Put module and instance types into their own index spaces (#21)

0bd68f7

lukewagner changed the title ~~Change alias sugar, remove local indices from aliases/instantiate~~ Change alias sugar, remove local indices, give module/instance types fresh index spaces Dec 21, 2020

alexcrichton approved these changes Jan 4, 2021

View reviewed changes

proposals/module-linking/Binary.md Outdated Show resolved Hide resolved

Address Alex's comments

3df2aa7

lukewagner requested a review from rossberg January 4, 2021 22:30

alexcrichton mentioned this pull request Jan 5, 2021

Update the module linking proposal encoding bytecodealliance/wasm-tools#176

Merged

alexcrichton reviewed Jan 6, 2021

View reviewed changes

proposals/module-linking/Explainer.md Outdated Show resolved Hide resolved

Fix typo Alex pointed out

5f95c25

alexcrichton mentioned this pull request Jan 8, 2021

Sync syntax/binary of aliases with upstream spec bytecodealliance/wasm-tools#191

Merged

Switch back to (alias (func ...))

a08211d

alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Jan 11, 2021

Switch-up alias identifier location

c299421

Latest commit from WebAssembly/module-linking#26

alexcrichton mentioned this pull request Jan 11, 2021

Switch-up alias identifier location bytecodealliance/wasm-tools#192

Merged

This was referenced Jan 11, 2021

Sync syntax and binary for instantiate arguments bytecodealliance/wasm-tools#193

Merged

Aliasing outer imported modules forces modules to have per-instance runtime state #28

Closed

alexcrichton mentioned this pull request Jan 14, 2021

Update support for the module linking proposal bytecodealliance/wasmtime#2585

Merged

lukewagner mentioned this pull request Jan 20, 2021

Introduce arbitrary import/export depth #29

Closed

Address Rossberg's feedback, tweaking alias/instantiate syntax

84060f9

Add inverted alias definition form

3365dc8

lukewagner merged commit 63cd6c0 into master Feb 16, 2021

lukewagner deleted the fix-issue-19 branch February 16, 2021 16:00

alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Feb 17, 2021

Update text format for aliases/instantiate args

048113f

Following the final updates on WebAssembly/module-linking#26

alexcrichton mentioned this pull request Feb 17, 2021

Update text format for aliases/instantiate args bytecodealliance/wasm-tools#224

Merged

This was referenced Feb 19, 2021

Alias shorthand syntax #19

Closed

Nested parent aliases #20

Closed

Module types need their own index spaces #21

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change alias sugar, remove local indices, give module/instance types fresh index spaces #26

Change alias sugar, remove local indices, give module/instance types fresh index spaces #26

lukewagner commented Dec 21, 2020 •

edited

Loading

lukewagner commented Dec 21, 2020

alexcrichton left a comment

alexcrichton commented Jan 7, 2021

lukewagner commented Jan 7, 2021

alexcrichton commented Jan 8, 2021

lukewagner commented Jan 8, 2021

lukewagner commented Jan 8, 2021

rossberg commented Feb 10, 2021

lukewagner commented Feb 10, 2021

alexcrichton commented Feb 10, 2021

lukewagner commented Feb 10, 2021 •

edited

Loading

rossberg commented Feb 11, 2021

lukewagner commented Feb 11, 2021

rossberg commented Feb 16, 2021

lukewagner commented Feb 16, 2021

alexcrichton commented Feb 17, 2021

lukewagner commented Feb 18, 2021

rossberg commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

lukewagner commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

rossberg commented Feb 18, 2021

alexcrichton commented Feb 19, 2021

rossberg commented Feb 23, 2021

Change alias sugar, remove local indices, give module/instance types fresh index spaces #26

Change alias sugar, remove local indices, give module/instance types fresh index spaces #26

Conversation

lukewagner commented Dec 21, 2020 • edited Loading

lukewagner commented Dec 21, 2020

alexcrichton left a comment

Choose a reason for hiding this comment

alexcrichton commented Jan 7, 2021

lukewagner commented Jan 7, 2021

alexcrichton commented Jan 8, 2021

lukewagner commented Jan 8, 2021

lukewagner commented Jan 8, 2021

rossberg commented Feb 10, 2021

lukewagner commented Feb 10, 2021

alexcrichton commented Feb 10, 2021

lukewagner commented Feb 10, 2021 • edited Loading

rossberg commented Feb 11, 2021

lukewagner commented Feb 11, 2021

rossberg commented Feb 16, 2021

lukewagner commented Feb 16, 2021

alexcrichton commented Feb 17, 2021

lukewagner commented Feb 18, 2021

rossberg commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

lukewagner commented Feb 18, 2021

alexcrichton commented Feb 18, 2021

rossberg commented Feb 18, 2021

alexcrichton commented Feb 19, 2021

rossberg commented Feb 23, 2021

lukewagner commented Dec 21, 2020 •

edited

Loading

lukewagner commented Feb 10, 2021 •

edited

Loading