This repository was archived by the owner on Aug 17, 2022. It is now read-only.

Module types need their own index spaces #21

Closed
rossberg opened this issue Dec 8, 2020 · 11 comments

Comments

@rossberg
Member

rossberg commented Dec 8, 2020

The proposal currently assumes that all type indices used inside a module or instance type refer to the enclosing module's index space. However, that is incompatible with the addition of type imports, which introduce a way of binding types local to a module type: if every type index in the inner module's type is interpreted as indexing into the enclosing module, then there is no way of actually referring to such local types.

This has various consequences:

  • To make type imports possible in the future, module types need to have their own index space:
    (module
      (type $T (func))
      (type $M
        (module
          (import "T" (type $T))
          (export "g" (global (ref $T)))  ;; local type 0, not parent's
        )
      )
    )
    
  • To be able to still reference outer types, module types need a way to define parent aliases:
    (module
      (type $T (func))
      (type $M
        (module
          (import "T" (type $T1))
          (alias $T2 parent (type $T))
          (export "g1" (global (ref $T1)))
          (export "g2" (global (ref $T2)))
        )
      )
    )
    
  • To be able to reference types from local imports, module types will need a way to define local type aliases:
    (module
      (type $T (func))
      (type $M
        (module
          (import "I" (instance $I (export "T" (type $T))))
          (alias $T (instance $I) (type $T))
          (export "g" (global (ref $T)))
        )
      )
    )
    
  • To be able to define non-trivial types that depend on imports, module types will need the ability to define types locally:
    (module
      (type $T (func))
      (type $M
        (module
          (import "T" (type $T))
          (type $F (func (param (ref $T))))
          (export "f" (func (type $F)))
        )
      )
    )
    

Obviously, this also has non-trivial effects on the definition of subtyping.

The latter two items are extensions that we can introduce once type imports are actually added. But the former two are necessary to change/add now in order to be forward-compatible with such an extension.

@lukewagner
Member

Ah, good point. So then to handle bullets 1 and 2, is the idea to treat module types analogous to nested module definitions? viz.:

  • module types push their own module scope with their own initially-empty index spaces
  • module types must use parent aliases to refer to the enclosing modules' types
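The two bullets above can be pictured with a small model. This is only an illustrative sketch, not the proposal's validation algorithm: the `Scope` class and its method names are hypothetical, and a type is represented by a plain string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scope:
    """One module (or module type) scope with its own type index space."""
    parent: Optional["Scope"] = None
    types: List[str] = field(default_factory=list)

    def define(self, ty: str) -> int:
        """Append a definition and return its local index."""
        self.types.append(ty)
        return len(self.types) - 1

    def push(self) -> "Scope":
        """Enter a nested module type; its index space starts out empty."""
        return Scope(parent=self)

    def alias_parent(self, index: int) -> int:
        """Bind a type of the enclosing scope into the local index space."""
        if self.parent is None:
            raise ValueError("no enclosing scope to alias from")
        return self.define(self.parent.types[index])


outer = Scope()
t = outer.define("func")         # (type $T (func))             -> outer index 0

inner = outer.push()             # (type $M (module ...
t1 = inner.define("import T")    # (import "T" (type $T1))      -> local index 0
t2 = inner.alias_parent(t)       # (alias $T2 parent (type $T)) -> local index 1
```

In this model the inner scope can only see the outer `$T` after the explicit `alias_parent` call, which is the restriction the second bullet describes.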

@rossberg
Member Author

rossberg commented Dec 8, 2020

Yes, that's the idea. There would be other ways of doing it, but this seems the most consistent.

@lukewagner
Member

Fixed with #26

@lukewagner
Member

Thinking about this issue again from the fresh perspective of rebasing this proposal as a new layered spec, I wonder if the rules above are unnecessarily restrictive and verbose.

In particular, what if index space validation rules worked like this:

  • When validating a module type, all relevant index spaces are snapshotted at the point right before the module type.
  • While validating the module type definition, definitions added by imports or aliases append to the index spaces like normal (and are thus available for subsequent use within the module type).
  • At the end of validating the module type, all relevant index spaces are restored to their snapshot size (thereby dropping all definitions appended while validating the module type), and then the just-validated module type is appended.

This seems equivalent to what you could express by eagerly outer-aliasing everything, with the net effect being less outer-alias annotation burden in the text format (which I experienced reworking all the examples) and binary format. What I don't know is if this is harder to implement for some reason. I wondered if it might even be easier or more efficient, if it allows simply reusing the module environment instead of having to set up a whole new one for each module/instance type. Any thoughts here @alexcrichton?
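The snapshot/restore rules proposed in this comment could be sketched roughly as follows. This is a toy model under stated assumptions: index spaces are plain Python lists keyed by kind, and `validate_module_type` and its `body` callback are hypothetical names, not part of any real validator.

```python
from typing import Callable, Dict, List

IndexSpaces = Dict[str, List[str]]


def validate_module_type(
    spaces: IndexSpaces,
    body: Callable[[IndexSpaces], None],
    name: str,
) -> int:
    """Validate one module type against the enclosing index spaces."""
    # 1. Snapshot the size of every index space right before the module type.
    snapshot = {kind: len(defs) for kind, defs in spaces.items()}

    # 2. Imports and aliases inside the module type append as normal,
    #    so they are visible to later uses within the module type.
    body(spaces)

    # 3. Restore every index space to its snapshot size, dropping the
    #    definitions added in step 2, then append the module type itself.
    for kind, defs in spaces.items():
        del defs[snapshot.get(kind, 0):]
    spaces["type"].append(name)
    return len(spaces["type"]) - 1


spaces: IndexSpaces = {"type": ["$T"], "instance": []}
seen_inside: List[List[str]] = []


def body(s: IndexSpaces) -> None:
    s["type"].append("$T1")             # local import lands after outer $T
    seen_inside.append(list(s["type"]))


m = validate_module_type(spaces, body, "$M")
```

Note that the local import `$T1` receives index 1 here because one outer type precedes it, so its number depends on where the module type appears.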

@lukewagner lukewagner reopened this Nov 2, 2021
@lukewagner
Member

Oh, and the other benefit is that, until we do add type imports or aliases-in-module-types, we don't have to do anything special; so this would just be a future plan-of-record.

@alexcrichton
Collaborator

From a decoding point of view, I think that seems easy enough and won't be hard to implement. From a text format perspective I don't think it will be too hard either, although it may require a two-pass sort of indexing, where indices within a module type are first determined relative to wherever the module type is declared and then, after the outer modules' type indices are all determined, the inner module gets fully numbered. Either way, I don't expect it to be too hard; the issues with repeated sections in the text format are probably trickier.

@rossberg
Member Author

rossberg commented Nov 3, 2021

Hm, I would advise against this. When dealing with syntax that has nested binding structure, you really want the property that local bindings are not affected by outer bindings. That's why you typically have shadowing rules for named bindings, and it is what de Bruijn indexing (relative indexing from inside to outside, like Wasm labels) achieves for numeric bindings, and why that is universally preferred as a name-free representation. Without that composition property, constructing and transforming terms/types becomes more complicated and error-prone. Moreover, it prevents canonicalisation, because even a completely closed type will differ (from itself) depending on where it occurs.
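The canonicalisation point can be made concrete with a toy encoding. The sketch below is an assumption-laden illustration, not the proposal's binary format: it encodes one closed module type (an import of a type and an export whose type refers to that import) as nested tuples, once with local definitions numbered after the enclosing ones and once with a fresh index space. Both function names are hypothetical.

```python
def encode_shared_space(outer_type_count: int) -> tuple:
    """Local definitions are numbered after the enclosing module's types."""
    local_import = outer_type_count      # index assigned to the import
    return (
        "module",
        ("import", "T", "type"),
        ("export", "g", ("global", ("ref", local_import))),
    )


def encode_own_space(outer_type_count: int) -> tuple:
    """The module type has its own index space, so numbering starts at 0."""
    local_import = 0                     # independent of the surroundings
    return (
        "module",
        ("import", "T", "type"),
        ("export", "g", ("global", ("ref", local_import))),
    )


# The same closed type, placed after 1 and after 3 outer type definitions.
shared_a, shared_b = encode_shared_space(1), encode_shared_space(3)
own_a, own_b = encode_own_space(1), encode_own_space(3)
```

With the shared numbering, the two occurrences encode differently even though the type mentions nothing from outside; with its own index space, both occurrences are identical and can be compared or hash-consed directly.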

@lukewagner
Member

Ah, that's an interesting point; I hadn't thought of this in terms of de Bruijn indexing, but I suppose the "outer alias" is really just factoring out the de Bruijn index pair from what would otherwise be every use site of the alias.

Then my next question is: in the text format, could we allow outer aliases to be implicitly created (in the same manner as typeuse) by having the inner module/module type be able to simply reference the identifier of the outer definition? I know this added a lot of complexity before when we tried to do it for instance-export-aliases (and it was fundamentally incomplete too, in cases of instantiate), but I don't think any of this applies to outer aliases; it can just be plain old lexical scoping. Yes?
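The implicit outer aliases asked about here might work along these lines in a text-format name resolver. This is a minimal sketch under assumptions: `TextScope` is a hypothetical class, an alias is recorded as an `("alias", "outer", depth, index)` tuple, and depth 1 is taken to mean the immediately enclosing scope.

```python
from typing import Dict, List, Optional, Tuple


class TextScope:
    """Identifier table for one module or module type in the text format."""

    def __init__(self, parent: Optional["TextScope"] = None) -> None:
        self.parent = parent
        self.names: Dict[str, int] = {}
        self.defs: List[Tuple] = []

    def define(self, name: str, definition: Tuple) -> int:
        """Append a definition and bind its identifier to the new index."""
        index = len(self.defs)
        self.defs.append(definition)
        self.names[name] = index
        return index

    def resolve(self, name: str) -> int:
        """Look up an identifier, inserting an outer alias on first use."""
        if name in self.names:
            return self.names[name]
        depth, scope = 1, self.parent
        while scope is not None:
            if name in scope.names:
                alias = ("alias", "outer", depth, scope.names[name])
                return self.define(name, alias)
            depth, scope = depth + 1, scope.parent
        raise KeyError(name)


outer = TextScope()
outer.define("$T", ("func",))        # (type $T (func))

inner = TextScope(parent=outer)      # nested module type
first = inner.resolve("$T")          # inserts the alias, local index 0
second = inner.resolve("$T")         # reuses the alias already inserted
```

The use sites only ever see the local index; the depth/index pair is stored once in the alias, which matches the "factoring out" described in the previous paragraph.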

@alexcrichton
Collaborator

At least for implementation complexity, I think auto-inserting alias annotations should be fine, although it's been a while since I last did all this, so I'm not 100% certain on that.

@rossberg
Member Author

rossberg commented Nov 9, 2021

Some sugar for symbolic aliases should certainly be feasible. I admit that I don't remember what complexity existed before. :)

@lukewagner
Member

Ok, sounds like a plan; I'll add a note to the explainer about automatic insertion of outer-aliases when used in nested modules and types.
