-
Notifications
You must be signed in to change notification settings - Fork 13
Order for testing subtyping between modules #7
Comments
Ouch, yeah, that's a bit nasty. The canonical semantics would be treating multiple imports from the same name as a form of intersection type. That implies the rule that for each import in the subtype, there must exist a suitable import in the supertype. This is independent of any order. It may relate multiple sub-imports to the same super-import (or vice versa), but that's actually what you want. What's annoying about this, though, is that for each sub-import you may have to try multiple super ones, i.e., it can be quadratic in the number of overlapping imports. That may be okay in this case, but still. |
Interesting. So as currently written in the PR, an import (import "" (module $A
(import "foo" (func))
(import "foo" (func))
(import "bar" (func))
)) would accept (module $B
(func (import "foo"))
(func (import "foo"))
(func (import "bar"))
) b/c it's the same type (no subtyping rule needed). But would (module $C
(func (import "foo"))
(func (import "foo"))
) which requires subtyping? I suppose it could be made to if there was a prefix-compatibility rule added to subtyping. Presumably (module $D
(func (import "foo"))
(func (import "bar"))
(func (import "foo"))
)) Perhaps we should disallow duplicates in the imports of module imports since it leads to all this wackiness? |
@lukewagner your examples have |
Oops! Somehow I forgot that we already disallowed export duplicates and tried to "simplify" my examples :P I'll edit the comment in-place. |
@lukewagner, when interpreting same-named imports as intersection types then the order does not matter. For example,
morally would be
And The subyping rules for intersection types are as follows:
(where T and U themselves cannot be intersections). Taken together, these allow Although attractive, I doubt that we can disallow duplicates in imports, for at least two reasons:
What we could try to restrict, though, would be the subtyping rules, in order to avoid the "or". But we have to be careful to maintain crucial properties, such as reflexivity (i.e., the equivalent of One possibility avoiding the "or" would be to weaken the above rules to
where T and U themselves can not be intersections anymore. The effect would be that matching multiple imports in a supertype becomes positional per name. That is, the only subtype of a module with multiple imports for "foo" is one with no "foo" imports at all (corresponding to top), or one with the exact same number and in a compatible order. Interleaving with imports of other name does not matter. Concretely, that would allow (assume
but not
This may be somewhat restrictive, though. Ruling out (*) makes it impossible to use a module with multi-imports in a context where one sufficiently general import is actually provided. We could remedy that by reintroducing an equivalent of rule (2):
This would allow (*), that is, a module with exactly one import for "foo" may still be matched by one with multiple ones. That's still fine, because there always is a uniquely determined import in the supertype to match against, so it remains linear. I kind of suspect that we'd need this to handle multi-level imports in an actual module definition sufficiently well (and we should disallow that sugar in module types!). |
@rossberg Let me start by trying to digest the first bit with intersection types :) When you say that (module $B
(import "foo" (func) /\ (func))
(import "bar" (func))
) Given that function closures have identity, I think this would only work if the type of (import "" (module $A
(import "foo" (func) /\ (func))
(import "bar" (func))
)) With the critical implication that |
@lukewagner, in contexts identifying imports by name (such as with the JS API's instantiate function) we already "merge" imports with the same name. So nothing changes there. As for positional contexts, we don't currently have them anywhere in the standard (only in the C API), so I don't think it'd be a breaking change. Just like you proposed that subtyping can reorder imports, it could also merge imports, and However, it is true that the "or" rule would make this "coercion" more tricky (or even ambiguous?), so perhaps another reason to avoid it. |
Yes, that was my question; we can't allow: (module (import "foo" T')) <: (module (import "foo" T) (import "foo" T)) because then there would be an ambiguity when two distinct An alternative to weakening the rules to avoid "or" is to change the abstract syntax of module types to be an unordered map of names (with duplicates in the text/binary parsed into intersection types which, btw, nicely ensures the "no nested intersection types" property). Then there is never an ambiguity because there is always exactly 1 function value. The abstract syntax of |
But that's incompatible with a positional interpretation of module imports in the |
@rossberg What I was perhaps-too-briefly suggesting in my last comment was that |
@lukewagner, I see, but how would that express the type of an existing module with multiple "foo" imports? You have to represent the intersection inherent in that somehow. Whether that's by multiple entries for "foo" or by one with multiple types is just a syntactic detail. If we want to avoid the "or" then this intersection must be ordered, no matter how it is represented. |
@rossberg As long as duplicates are always collapsed into a single import of an intersection type (by necessity, if types are unordered maps), then I think there should never be any ambiguity (for any import, there will be exactly one value supplied by |
@lukewagner, oh, okay, so you are just worrying about the value ambiguity, not the "or" per se. But you'd need to introduce intersection types as an explicit concept then. You could equivalently say that in a positional interpretation of imports, duplicate entries are omitted. That has the same net effect without introducing a new construct. Btw, an unordered interpretation of imports won't work for other reasons: it is incompatible with having dependencies between imports, as with types. That is, the same reason why this proposal requires that static declarations within a module are ordered applies within module types as well. |
@rossberg Ah hah, that's a good reason for maintaining positional order. Could you explain a bit more of what you mean by saying "duplicate entries are omitted"? In all module types or just for the purposes of subtyping? |
@lukewagner, in contexts where a module type is interpreted positionally to determine argument values (e.g., an instantiate instruction), only the first occurrence of each top-level) import name would be taken into account. Of course, the value would still be checked against all the other types. Sounds a bit more hacky perhaps, but I would have the exact same effect as introducing and rewriting to intersection types. |
@rossberg Ah, ok, cool. So to check my understanding, in this module: (module $A
(import "B" (module $B
(import "a" (func (param (ref (func))))
(import "a" (func (param funcref)))
))
(func $f1 (param funcref))
(func $f2 (param (ref (func))))
(instance $i (instantiate $B (func $f1)))) ;; this would validate,
(instance $i (instantiate $B (func $f2)))) ;; and this would fail, yes?
) Then, for the instantiation-time check of module import type against the actual imported module, would the subtyping rule just say that So then, for the above example, I think all these modules would match, yes? (module $A)
(module $B
(import "a" (func (param funcref))))
(module $C
(import "a" (func (param funcref)))
(import "a" (func (param funcref))))
(module $D
(import "a" (func (param (ref (func))))
(import "a" (func (param (ref (func)))))
(module $E
(import "a" (func (param funcref)))
(import "a" (func (param (ref (func))))) |
Yes (well, except But to back up a little, neither of this does anything about the other problem, that "some import" implies that the subtype check can be quadratic in the number of duplicates, because of the "or" rule required. That's independent of whether the intersection is represented as multiple imports, or one import with multiple types. My other ramblings above were primarily concerned with that problem. |
@rossberg For For the quadratic case: yeah, I saw that too and thought we could address it with some "maximum number of duplicate imports" implementation limit. Thinking back to our "reinterpret two-level imports as single-level imports" rule, instead of having the parse/decode rules do the de-duplication, we could say that each individual two-level import is directly turned into a single-level import of an instance with a single export, and then the intersection type checking does the rest. So, e.g., a module: (module
(import "a" "x" (func))
(import "a" "y" (func))) is reinterpreted as: (module
(import "a" (instance (export "x" (func))))
(import "a" (instance (export "y" (func))))) and then it is subtyping that makes this type isomorphic to: (module
(import "a" (instance
(export "x" (func))
(export "y" (func))))) I don't know if that's actually better, but it seemed like a neat symmetry. However, writing that made me realize a snag with the previously-mentioned scheme of recasting two-level imports as single-level imports: today I can write: (module
(import "console" "log" (func (param i32)))
(import "console" "log" (func (param f64)))) but interpreting that as a single-level import of an instance: (module
(import "console" (instance
(export "log" (func (param i32)))
(export "log" (func (param f64)))))) isn't valid today because duplicate exports aren't allowed. I was thinking maybe we should allow duplicate exports, but that would seem to amount to adding wasm function overloading which seems to have some pretty ugly JS API and |
Well, I suppose that depends on what exactly you meant by "would match", in particular wrt direction. But I think we're on the same page.
Yes, that's what I was assuming we'd do. Especially since that does not require any particular rule for dealing with imports with different depths, but where one name list is a prefix of the other. But such an expansion would of course clash badly with a maximum duplicate limit -- which I think would generally cause composition issues in other ways as well.
Ouch, yeah, I hadn't considered that. We could hack around that and only allow duplicate exports in negative position, but that's ever more monster barring. Sigh, I increasingly feel like we should just ban duplicate imports altogether. I wonder if it's worth doing some use counter statistics telemetry experiment to figure out whether removing duplicate imports from the language would break any web site in the wild. |
@rossberg Hah, I've been wondering the same. I asked @kripken and he can't think of any case where duplicate imports are intentionally generated, so there is actually a chance (although still, I'm guessing, only a 50% one). I filed bug 1647791 to investigate. |
FWIW, one significant downside of removing duplicate imports would be that merging modules would become significantly more complicated, because you'd need to compute and insert glb's. |
Reading over this again today, is it right to say that the general impression here is that it'd be best to figure out how to disallow duplicate imports? But this is also seemingly in a climate where we also can't do that due to backwards compatibility-constraints? One thought I had is that in type declarations within a module (e.g. a module declaring a module type) we could disallow duplicate names of imports. That should be backwards-compatible and I was curious if it could buy us enough runway to not have to worry about duplicate imports before-module-linking (it doesn't look like the investigation has progressed much in the past few months). The idea I had is that when you instantiate a module you'd provide the same number of arguments as import statements in the declared module type, and that'd provide a clear 1:1 mapping of "here's what to call that imported value". Given those names you can then use them to satisfy the input module's own imports. This also means that you basically just wouldn't be able to import (or bundle) an MVP module using duplicate import names, but given the expected rarity of that ocurring it may not be the end of the world? The consequence of this is that we can always have a module type for all nested modules, but the top-level module may not have an expressible type. Engines would have to deal with the top-level case specially in that situation. Or would this perhaps be too onerous at the spec-level because it'd have to have a special case in a bunch of rules otherwise? For example we could do the desugaring of two-level imports to single-level import of an instance but the spec would have to say that for top-level modules the desugaring doesn't work when two imports have the same name but different types. |
Oops, right, I let this ball drop. I'll try to get the telemetry patch written and landed. In the meantime, maybe your implementation work can just operate under the assumption that we have successfully disallowed duplicates? Thinking again of the alternatives (duplicate exports; not being able to desugar import paths into single-level instance import chains), this really seems like the best option and worth doing a gradual deprecation for (in which the spec officially disallows duplicate imports, as do engines that can without breaking existing content, with engines that must support existing content unofficially allowing duplicates at top-level for a period of time, but disallowing in all new positions). |
Makes sense to me! I'll see if I can write up a PR to this repository which tries to summarize the conversation here into the desired end state of disallowing duplicate import strings, but also supporting desugaring. |
This commit has a few high-level changes: * The `Binary.md` is sync'd a bit with `Explainer.md` in terms of content and links. * The textual syntax for `alias` directives was tweaked in `Explainer.md` to match `Binary.md` * Actual section numbers were chosen in the binary format. The exception handling proposal uses 13 for an event section so the four new sections here are getting 14-17. * Handling of subtyping between instances and modules is now codified in `Subtyping.md`. This goes into more detail about how imports are elaborated, how instantiation is expected to work, and how to test for subtypes. This is hoped to describe the intended status quo given the discussion on WebAssembly#7.
Good news: the telemetry probes have now been in Firefox Beta 86 for a week and the initial results show that duplicate imports are used by 0.00014% of page loads (concretely, 2.52k page loads), so I think we're good. I was thinking of adding an agenda item to the next CG meeting to get a vote on making this breaking change. |
It seems to me like retroactively banning multiple imports of the same named object isn't something that has to be done for all modules, only modules that specifically want to use this module linking system. Could you allow existing AssemblyScript and other code using overloaded imports to keep running, and simply throw a validation error when trying to use such a module in a module linking context? This would avoid backwards compatibility issues, AND provide best practices for tool authors to support using module linking, AND can provide an obvious consistent error message on failure. |
That's also an option we've discussed, but it's unfortunate to create a mechanism for linking wasm modules that only works for some kinds of modules -- it essentially means there are two sorts of modules: "top-level" modules (that can only be instantiated by the host) and "linkable" modules (that can be instantiated by either the host or wasm). In any case: if the CG answers the root question ("should wasm modules always be able to virtualize wasm imports") in the affirmitive, then this proposal isn't the extent of the problem--the JS API and ESM-integration have the same issue. |
This commit has a few high-level changes: * The `Binary.md` is sync'd a bit with `Explainer.md` in terms of content and links. * The textual syntax for `alias` directives was tweaked in `Explainer.md` to match `Binary.md` * Actual section numbers were chosen in the binary format. The exception handling proposal uses 13 for an event section so the four new sections here are getting 14-17. * Handling of subtyping between instances and modules is now codified in `Subtyping.md`. This goes into more detail about how imports are elaborated, how instantiation is expected to work, and how to test for subtypes. This is hoped to describe the intended status quo given the discussion on #7.
This commit has a few high-level changes: * The `Binary.md` is sync'd a bit with `Explainer.md` in terms of content and links. * The textual syntax for `alias` directives was tweaked in `Explainer.md` to match `Binary.md` * Actual section numbers were chosen in the binary format. The exception handling proposal uses 13 for an event section so the four new sections here are getting 14-17. * Handling of subtyping between instances and modules is now codified in `Subtyping.md`. This goes into more detail about how imports are elaborated, how instantiation is expected to work, and how to test for subtypes. This is hoped to describe the intended status quo given the discussion on #7.
As of #35 and Module Linking being lifted to another layer, this is no longer an issue, so closing. |
This is an issue for the design proposed in #3 (assumed to be merged in the near-ish future) where I'm curious how the subtyping relationship works between modules. Talking with @lukewagner it sounds like import strings are going to be taken into account, but this raises a question about what to do with imports that import from the same string? For example I'm curious if this module is intended to validate:
A similar question would be if the imports of
$b
were swapped, would it still validate?I think one possible way to test if
$a
is a subtype of$b
would be to:$b
n
of$a
, look up its import string in this map.n
exists in$b
with only one item imported, then the subtyping check recurses (switching order for covariance/contravariance?)n
exists in$b
, but with multiple items imported, then the first unused import (so far) of$b
is tested against the import.n
doesn't exist in$b
it could do the same as above (check the first unused import) or return an error.The thinking is that on collisions (multiple imports from the same name) implementations would fall back to order-based comparisons of types. Otherwise you'd do name-based comparisons. This would allow, in general, imports to get reordered or deleted and you wouldn't break code which depends on you (as it would still assert your older type). If, however, all your import names were the same string then it'd be stricter.
I'm curious though if others have thoughts about how this might work? Or how to handle the subtyping relationship between modules with respect to their imports and multiple imports from the same name?
The text was updated successfully, but these errors were encountered: