This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Replace applicative rtts (rtt.canon) with generative rtts (rtt.fresh) #137

Closed
RossTate opened this issue Sep 22, 2020 · 12 comments


@RossTate
Contributor

Every WebAssembly application will have embedder-specific code for a variety of reasons. For every (non-monolithic) application, that embedder-specific code will include coordinating separately compiled modules. At the very least, this code will be responsible for coordinating magic numbers like exception events. Far more often than not (maybe even always), this code will also be responsible for linking types. This linking code will be application-specific for a variety of reasons. It might depend on surface-language linking semantics (like class names). It might generate types (e.g. putting together private fields declared separately in superclasses to define the heap structure of a particular class). It might utilize embedder-specific loading and linking mechanisms (e.g. loading/compiling code in the background or on demand). It might employ dynamic adaptive techniques (e.g. load one implementation of a module or another depending on aspects of the device at hand). It might JIT entire modules. The list goes on.

The point is, this infrastructure will exist. That is important because it makes rtt.canon unnecessary even for the very few systems that could potentially benefit from it. Just like a Java application would surround a module with context like "import X needs to be linked to the class named java.util.ArrayList", a structural language could surround a module with context like "import Y needs to be linked to a 16-arity tuple", and in both cases the linking infrastructure can resolve the links appropriately (possibly JITing the tiny module it takes to define a 16-arity tuple).
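To make the linking pattern above concrete, here is a minimal sketch of such application-level infrastructure, written in Python purely for illustration. Everything in it is an assumption rather than part of any proposal: `Rtt`, `Linker`, `define_class`, `resolve`, and `link` are hypothetical names, and creating a tuple type on demand only stands in for JITing the tiny module that would define it.

```python
class Rtt:
    """Hypothetical stand-in for a generative runtime type (each instance is distinct)."""

    def __init__(self, label):
        self.label = label


class Linker:
    """Hypothetical application-specific linker that resolves type imports."""

    def __init__(self):
        self.classes = {}  # surface-language class name -> Rtt
        self.tuples = {}   # tuple arity -> Rtt

    def define_class(self, name):
        # Nominal case: the type is registered once under its class name.
        self.classes[name] = Rtt(name)

    def resolve(self, request):
        kind, key = request
        if kind == "class":
            return self.classes[key]  # raises KeyError if the class was never defined
        if kind == "tuple":
            # Structural case: create the type the first time it is requested,
            # standing in for JITing a tiny module that defines it.
            if key not in self.tuples:
                self.tuples[key] = Rtt(f"tuple{key}")
            return self.tuples[key]
        raise ValueError(f"unknown import kind: {kind}")

    def link(self, imports):
        # `imports` maps an import name to the context the module declared for it.
        return {name: self.resolve(request) for name, request in imports.items()}


linker = Linker()
linker.define_class("java.util.ArrayList")
module_a = linker.link({"X": ("class", "java.util.ArrayList"), "Y": ("tuple", 16)})
module_b = linker.link({"Y": ("tuple", 16)})
```

In this sketch, two separately linked modules that both ask for a 16-arity tuple receive the same runtime type, without the engine performing any canonicalization itself.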

From this perspective, rtt.canon does not appear so language-agnostic; rather, it seems to bake into WebAssembly a specific kind of structural linking that only a few languages stand to benefit from. Regardless, rtt.canon is complicated and (especially in the Post-MVP extension) inefficient. So I think it (or, more generally, applicative rtt instructions) should be replaced with generative rtt instructions (like an rtt.fresh).

@titzer
Contributor

titzer commented Sep 24, 2020

I feel uncomfortable reasoning in detail about all future runtime systems at this point, so I don't accept the premise to this proposal. There haven't yet been any example systems built using either the existing or the proposed mechanism, so this feels insufficiently motivated and more like blind churn.

@dcodeIO

dcodeIO commented Sep 24, 2020

I'd like to note that if there is a way to avoid inflicting the burden of deploying separate modules for coordination on the user, and to instead coordinate through the host/the Wasm API using otherwise independent and small modules (think ECMAScript modules), I'd always pick that option, be it via rtt.canon or other means. In general, if anything, I'd like to see the direction here inverted, i.e. aiming to bake the necessary capabilities into the API by improving what we have instead of throwing it away already.

Also, my sentiment of "improving what we have" or more generally "working together instead of against each other" applies to other similar discussions as well, as anything else is not helping but just stalling progress imho. For instance, I'm also in the camp of people who haven't had a chance to experiment yet but would love to, and I am having the impression that part of the problem are those frequent 40 minute discussions proposing fundamental changes that nobody really can give useful input on just yet.

@conrad-watt
Contributor

conrad-watt commented Sep 24, 2020

If we accept the premise that every non-trivial Wasm program will have a central coordinating top-level that knows all of the required subtyping relations, then that is a strong case for nominal types, but I'm not feeling positive about the specifics of this proposal.

There are some details missing from how this proposal would work, but if rtt.fresh were just a non-canonicalising replacement for rtt.canon, that would be the worst kind of design-by-committee compromise, where we implicitly accept the premise that makes nominal types viable (even preferable), but end up with something that combines disadvantages of both nominal and structural systems.

If rtt.fresh requires the rtts of sub-objects, then the resulting system seems to be a re-statement of #119, unless there is some advantage gained from the potential to create rtts at runtime.

I think one of the main reasons that the nominal proposal does not have more traction is that people (@rossberg in particular) don't accept the premise of a necessary central co-ordinating top-level with so much type knowledge. Creating marginally different proposals that avoid the use of the word "nominal" won't be enough to avoid this central debate.

@titzer
Contributor

titzer commented Sep 24, 2020

> Also, my sentiment of "improving what we have" or more generally "working together instead of against each other" applies to other similar discussions as well, as anything else is not helping but just stalling progress imho. For instance, I'm also in the camp of people who haven't had a chance to experiment yet but would love to, and I am having the impression that part of the problem are those frequent 40 minute discussions proposing fundamental changes that nobody really can give useful input on just yet.

This sentiment is also underlying my comment. We need to narrow the cone of uncertainty enough to move forward, even if it means hitting an actual roadblock. I said it before and I still believe that one limping but complete language implementation is worth 1000 subtyping debates.

For that reason I would strongly prefer that we align efforts towards getting stuff working, and that actually means implementing a complete enough thing to implement complete enough languages. We could argue for another year or two about structural versus nominal types until someone finally wins or gives up, and then move forward with a winner that doesn't survive the heat of battle. We would just be a year or two behind, and no better off.

@conrad-watt
Contributor

conrad-watt commented Sep 24, 2020

I am somewhat concerned that canonicalisation will end up being a "performance death by 1000 cuts" that only becomes apparent when we eventually start looking at advanced uses of concurrent GC objects (considering the potential need for fences/synchronisation on type/RTT creation depending on the implementation strategy).

That being said, I agree that this proposal probably won't improve what we have now or be a viable alternative (that's not already under discussion).

@RossTate
Contributor Author

> If we accept the premise that every non-trivial Wasm program will have a central coordinating top-level that knows all of the required subtyping relations, then that is a strong case for nominal types, but I'm not feeling positive about the specifics of this proposal.

Each module, even in a structural language, will have to have already predetermined the required subtyping relations, e.g. in choosing what specific types to use and in choosing when to use rtt.canon versus rtt.sub.

> There haven't yet been any example systems built using either the existing or the proposed mechanism, so this feels insufficiently motivated and more like blind churn.

The existing mechanism has not been implemented. V8 does not canonicalize types even within modules, let alone across modules. Currently V8 generates a different rtt per type index per module. Binaryen has not implemented type minimization yet to ensure that equi-recursively equivalent types are given the same index within a module. So now is a good time to examine the existing mechanism.

The pattern I gave above shows how to implement any application wanting the functionality of rtt.canon with rtt.fresh by just moving the feature from engine code to application code. On the other hand, you cannot do the reverse. At present, it sounds like the needs of most languages line up better with rtt.fresh, and rtt.fresh is much closer to what implementations currently offer.
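As a rough illustration of that asymmetry, the sketch below builds a canonicalizing lookup on top of a generative primitive, in Python for illustration only. The names `Rtt`, `rtt_fresh`, and `CanonTable` are hypothetical, and type descriptors are modelled as plain hashable values (here, strings) rather than real WebAssembly types.

```python
class Rtt:
    """Hypothetical runtime type object; identity is what distinguishes two RTTs."""

    def __init__(self, type_desc):
        self.type_desc = type_desc


def rtt_fresh(type_desc):
    """Model of a generative instruction: every call yields a new, distinct RTT."""
    return Rtt(type_desc)


class CanonTable:
    """Application-level canonicalization layered on top of rtt_fresh."""

    def __init__(self):
        self._table = {}

    def canon(self, type_desc):
        # Reuse the RTT previously created for this descriptor, if there is one.
        rtt = self._table.get(type_desc)
        if rtt is None:
            rtt = rtt_fresh(type_desc)
            self._table[type_desc] = rtt
        return rtt


# Generative behaviour: the same descriptor yields two different RTTs.
a = rtt_fresh("Point")
b = rtt_fresh("Point")

# Applicative behaviour recovered in application code: same descriptor, same RTT.
table = CanonTable()
c1 = table.canon("Point")
c2 = table.canon("Point")
```

The reverse direction has no analogous construction in this model: an operation that always returns the same RTT for the same descriptor cannot be wrapped to produce `a` and `b` as distinct values.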

> I think one of the main reasons that the nominal proposal does not have more traction is that people (@rossberg in particular) don't accept the premise of a necessary central co-ordinating top-level with so much type knowledge.

I am receptive to this, but I do not believe rtt.canon is a practical solution to this problem. Consider this example structural program from the Post-MVP:

(type $Pair (typeparam $X)
  (struct (field (ref (typeparam $X))) (field (ref (typeparam $X))))
)

(func
  (typeparam $T)
  (param $rttT (rtt (typeparam $T)))
  (param $x (ref $T))
  (result (ref $Pair (typeparam $T)))

  ;; allocate a pair with full RTT information
  (struct.new_with_rtt
    (rtt.canon $Pair (typeparam $T) (local.get $rttT))
    (local.get $x) (local.get $x)
  )
)

The use of rtt.canon in this example to implement structural pairs requires a linear-time computation just to construct the pair. The issue is that the "key" for rtt.canon is an equi-recursive type, but the application really wants the key to be a structural type in its surface language. The type grammar of most surface-level languages is not equi-recursive, and so this particular example of pairs could be done in constant time.
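A minimal sketch of the constant-time alternative, in Python and under stated assumptions: surface-language types are not equi-recursive, every argument type already has a unique runtime type, and the application maintains its own interning table. The names `Rtt`, `intern_rtt`, and `make_pair` are hypothetical and not taken from any proposal.

```python
class Rtt:
    """Hypothetical generative runtime type; hashing and equality use object identity."""

    def __init__(self, name, args=()):
        self.name = name
        self.args = args


_interned = {}


def intern_rtt(name, *args):
    """Return the unique Rtt for constructor `name` applied to already-interned args.

    The key holds the argument Rtt objects themselves, which hash by identity,
    so the lookup cost depends on the number of arguments and not on how
    deeply nested each argument type is.
    """
    key = (name, args)
    rtt = _interned.get(key)
    if rtt is None:
        rtt = Rtt(name, args)
        _interned[key] = rtt
    return rtt


def make_pair(rtt_t, x):
    # Mirrors the Pair example: look up the Pair<T> runtime type, then allocate.
    pair_rtt = intern_rtt("Pair", rtt_t)
    return (pair_rtt, x, x)


# Build a deeply nested type: Pair<Pair<...Pair<i32>...>>.
i32 = intern_rtt("i32")
t = i32
for _ in range(50):
    t = intern_rtt("Pair", t)

p1 = make_pair(t, "a")
p2 = make_pair(t, "b")
p3 = make_pair(i32, 0)
```

Here `make_pair(t, ...)` never walks the 50 levels of nesting inside `t`; it hashes a two-element key whose second element is a single object reference. An equi-recursive key, by contrast, would need the structure of the whole type.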

On top of this, there is the irony that using structural types to support rtt.canon means that even the Post-MVP cannot express the low-level structural types of high-level structural languages, as discussed here, whereas a nominal type system can.

So whereas structural languages can use means similar to what other languages would use to coordinate low-level nominal types, they have no means to address the limitations of low-level structural types.


As a meta-note, this issue illustrates that there is a use for a codeable cross-platform dynamic-linking system for WebAssembly, i.e. a Java dynamic-linking system could take care of name resolution, whereas a structural dynamic-linking system could take care of canonicalizing rtts (according to the surface-language's types). Ideally it would have the property that you could give it "open" modules (for some language) and ask it to spit out an "open" module (for that language) that could later be linked again with more "open" modules (of the same language). And ideally you could give it a collection of open modules and ask it to produce a "closed" module, i.e. one that can be shipped as a self-contained application whose only interactions are meant to be with a foreign environment. I don't know if such a system should be written within WebAssembly itself, or as a layer on top of WebAssembly, but maybe it could address @rossberg's concerns.

@conrad-watt
Contributor

> The existing mechanism has not been implemented. V8 does not canonicalize types even within modules, let alone across modules. Currently V8 generates a different rtt per type index per module. Binaryen has not implemented type minimization yet to ensure that equi-recursively equivalent types are given the same index within a module. So now is a good time to examine the existing mechanism.

It's true that we can get a bad nominal type system by half-assing our implementation of the current MVP, but no one would deliberately design the language feature to be like this. It's just an inferior version of #119. It would genuinely be a failure of our specification process if this proposal were adopted.

@RossTate
Contributor Author

No one is proposing that. I was responding to the comment that removing rtt.canon would be churn by pointing out that rtt.canon has not actually been implemented. V8 needs to (and plans to) change its implementation to something; the question is what should that thing be? I was arguing that rtt.fresh would both be an easier change and a more broadly useful change.

@conrad-watt
Contributor

conrad-watt commented Sep 25, 2020

I'm not denying that taking an in-progress MVP implementation without canonicalisation and building rtt.fresh on top is easier than implementing canonicalisation. My point is that the resulting type system is nominal(ish), so to accept this proposal the committee must commit to a nominal type system.

If the committee is willing to commit to a nominal type system, we should adopt #119 instead, which is actually well-designed, instead of being a flawed last-minute repurposing of a structural proposal.

@RossTate
Contributor Author

Ah, sorry, I misunderstood your intent. I agree that the logical next step would be to adopt #119; I was just trying to break the nominal-vs-structural discussion down into more focused subtopics, one of which seemed to be rtt.canon.

@conrad-watt
Contributor

More broadly, I don't think consensus on this issue can be used as a stepping stone to get to #119. The fundamental issue in dispute is the necessity of a central top-level with full subtyping knowledge. If that's a given, then any nominal system looks more reasonable. Similarly, "you don't have to implement canonicalisation" is an argument in favour of any nominal system. There's nothing specific about rtt.fresh that makes the debate easier. The only way I could see this issue being viewed more favourably than #119 would be if people don't fully grok the nominal-ness of the proposal and support it just because canonicalisation is hard to implement. That would be the "failure of our specification process" I was alluding to.

Referencing your earlier comment...

> Each module, even in a structural language, will have to have already predetermined the required subtyping relations, e.g. in choosing what specific types to use and in choosing when to use rtt.canon versus rtt.sub.

Each module needs to know the subtyping relations used by its own code, but the controversial issue is the necessity of a coordinating top-level that knows about the subtyping uses of every (somewhat coupled) module making up the program. FWIW, I actually think such a top-level probably will always exist in practice, but I'm not the only one who needs convincing!

@tlively
Member

tlively commented Nov 1, 2022

The proposal no longer includes RTTs at all, so closing this.

@tlively tlively closed this as completed Nov 1, 2022

5 participants