Define iso-recursive type semantics #243
Conversation
I figured that the formulation was a bit hacky and obscure after all, so I replaced it with something more "standard" and simultaneously more confined to the definition of type equivalence. It should be semantically equivalent to the earlier formulation, but cleaner.
Thanks for posting this, @rossberg, and sorry for the delay in responding to it. I still don't completely understand all the details here, so I could use help in answering a few questions.
Recently we have had a number of discussions establishing consensus on what we want specifically for the MVP. In light of the outcomes of those discussions, could you articulate why this variant of isorecursive types should be an MVP feature rather than a Post-MVP feature?
@RossTate given that we agreed to prototype both a nominal and a structural system as MVP candidates, I'm not sure what the motivation would be for requiring such a justification right now, unless you're suggesting that an alternative structural proposal would be a more viable MVP candidate? |
According to the meeting notes, the following is what we agreed to:
The changes in this PR are substantial and will require significant work to implement. As with anything requiring substantial resources, a motivation should be provided. |
Then the poll was poorly phrased given the spirit of what was decided, and if you feel it's necessary we can secure a revised wording in a future meeting. Since we have a group understanding that equi-recursive canonicalisation is not viable, making a decision based on an evaluation of the current MVP against a nominal proposal is essentially the same as selecting the nominal proposal. Given the conversations that took place in the meeting, I don't think many people interpreted the poll in this way. |
It was my understanding that we would pursue both structural and nominal types, and the only viable option (due to the performance issues with equi-recursive types) was an iso-recursive formulation. So I have been working under the assumption that we'd be doing this change. So I generally support this PR, though I have not dug into the details. |
Speaking for Google folks, we've been implementing the system described in the milestone 4 prototype spec to fulfill the intent of that poll. We're also interested in the system in this PR as a potential next step because the milestone 4 spec still uses equirecursive types for its structural component and we will want a better structural component in the final MVP system if there will be a structural component at all. The first step to investigate this system further is discussing it to see if any issues come up that would make implementing it impractical or unnecessary, so I am happy to have this PR open to enable that discussion. |
Attempting to answer my own questions
I think that example works in this new system because subtyping follows precisely the declared supertypes, and the only constraint on the declared supertypes is that their indices must be less than the subtype's index. So in that example B and C would be in the same rec group and would each declare A as a supertype, and each would be valid, because in checking each individual subtype relation you look only at what other subtype relations are declared, not whether the others are also valid. This is exactly how subtyping works in our nominal implementation. One potential problem I can think of is when the ordering constraint on declared supertypes comes into conflict with the adjacency constraint on mutually recursive types within a rec group. If we do find (or believe there exist) concrete instances of this problem, the simplest fix would be to disallow cyclic supertype chains without requiring an ordering constraint. That's the approach we've implemented in Binaryen's nominal implementation and the approach described in the milestone 4 prototype.
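To make the ordering constraint concrete, here is a small hypothetical sketch (the type names and fields are mine, not the example discussed above), written in a text-format rendering of the syntax proposed in this PR. Declared supertypes must sit at lower indices than their subtypes, while plain structural references between members of the group may point forward:

```wat
(rec
  ;; $A comes first, so $B and $C may declare it as their supertype.
  (type $A (sub (struct (field i32))))
  ;; $B and $C each extend $A by one field and reference each other
  ;; structurally, so they at least must share a recursion group;
  ;; $A is included in the group only for illustration.
  (type $B (sub $A (struct (field i32) (field (ref null $C)))))
  (type $C (sub $A (struct (field i32) (field (ref null $B)))))
)
```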
Yes, I believe that two structurally equivalent definitions in the same rec group define distinct types. The reason is that recursive types are equivalent only when their enclosing recursion groups are equivalent and they occupy the same index within those groups.
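A minimal sketch of my own illustrating this point: identity follows from position within a recursion group, not from structure alone.

```wat
;; Distinct types: same structure, but different indices in one group.
(rec
  (type $P (struct (field i32)))
  (type $Q (struct (field i32)))
)

;; Equivalent types: two singleton groups with identical bodies
;; canonicalise to the same type.
(rec (type $R (struct (field i32))))
(rec (type $S (struct (field i32))))
```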
If they do, they will get the current nominal behavior but won't be able to structurally share types. This is fine because these compilers see the whole program, so they should feel free to do this.
A compiler that does want to share source-level structural types structurally would have to place the shared structural types in their own, separate recursion groups.
Each module would then have to duplicate the definitions of the shared types it uses, correct?

A problem

Consider module 1 defining structural tuple types that participate in subtyping, and module 2 wanting to reuse them. So it seems that this system cannot directly express shared structural types when those structural types have subtyping, which I think is the case for at least function types in most relevant languages. I'm struggling to think of workarounds that would not also work in a simpler, purely nominal system. Given this limitation, are the benefits of this system over a purely nominal system worth the extra complexity? What do you think @rossberg? Is there a simple workaround that I'm not seeing?
[Sorry for not replying earlier, was out sick.]
Right.
Right, there are alternative ways to impose the non-cyclicity constraint, but this is the simplest, and I can't think of any scenario where it would be a serious problem. Note that in your example, B could not be pulled out of the recursion group no matter what: the definition of C depends on B, which depends on A, so they are all mutually recursive.
Yes, the degenerate case of just using one whole-program recursive group essentially emulates fully nominal behaviour. That said, I'm confused why an optimisation would depend on distinguishing equivalent representations. Can you elaborate? This is the first time I'm hearing of this problem, I believe.
I don't necessarily expect them to do that, but of course they could, if it avoids some hassle for them.
Yes. This duplication is generally unavoidable anyway, because at the import boundaries each module has to repeat the definition of any imported type it uses, even under a hypothetical fully nominal semantics. Otherwise, validation, compilation, and link-time checking wouldn't work under Wasm's module system. However, a nominal or equi-recursive semantics might get away with keeping some types in a recursion group fully abstract in some import cases, while iso-recursive does not. Not clear how relevant this case would be in practice, though. But it certainly is an incentive to keep recursion groups small when doing separate compilation.
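To illustrate what this duplication looks like in practice (a hypothetical sketch of mine, not taken from the discussion): a module importing a function over a recursive type has to spell out the corresponding recursion group itself, and link-time checking then matches the canonicalised groups structurally.

```wat
;; Module 2 (fragment): repeats the group that module 1 also defines,
;; so that the import below can refer to the type by a local index.
(rec
  (type $node (struct (field i32) (field (ref null $node))))
)
(import "m1" "head" (func $head (param (ref $node)) (result i32)))
```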
Yes, producers have to canonicalise type definitions suitably. That is already the case in some places, e.g., with the order of fields in a struct, or the order of cases in a source-level variant type (when lowering it to int tags). FWIW, the main use case here is linking multiple modules produced by the same compiler, so the type canonicalisation ought to be implicitly consistent. Multiple different compilers for the same language interacting is not a use case we have seriously discussed thus far. If they would not want to go through interface types, they would of course have to agree on and codify their low-level data layout strategy. But that is a fairly hypothetical scenario, I think – I'm not aware of such degree of compatibility between high-level language compilers for native code elsewhere.
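As a tiny illustration of the kind of canonicalisation producers have to agree on (hypothetical types of my own): the emitted definitions have to be structurally identical, down to field order, to be considered the same type.

```wat
;; Emitted identically by both producers:
(rec (type $pair (struct (field i64) (field f64))))

;; A definition with the fields swapped, e.g.
;;   (rec (type $pair2 (struct (field f64) (field i64))))
;; would be an unrelated type, even though a source compiler might
;; regard the two layouts as interchangeable.
```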
Exactly. That is the main property of the iso-recursive semantics.
Yes, with pre-declared subtyping, you have the same limitations in this regard as any such system, including a nominal one.
The benefit of structural typing is independent of that. When compiling nominal source languages, this obviously is a non-issue. When compiling structural source languages that do not have source-level subtyping, you also have no issue. Such languages might still make internal use of subtyping hierarchies, but these tend to be very shallow and well-layered (such as lowering closure types). That is the main use case this is trying to address. The only scenario where the limitation would become relevant would be lowering a source language with structural subtyping to Wasm in a direct, non-coercive manner. Such non-coercive compilation obviously won't fly with pre-declared subtyping, no surprise there. That is the price we are willing to pay, since such languages are rare, and non-coercive compilation for them would likely run into other blockers anyway (e.g., with respect to generics). If this ever becomes a practical concern, the explicit, zero-cost coercion operator I mentioned in my presentation could be a way out. This possibility was pointed out to me by Sam Lindley.
Co-authored-by: Thomas Lively <[email protected]>
Given that we have decided that separate compilation (of any language) is a Post-MVP concern, and that we are willing to add more features in the Post-MVP to address issues that arise for separate compilation, could you either articulate an MVP scenario with which we could evaluate the claimed benefits granted by this extra complexity or articulate a forwards-compatibility issue in the MVP that would prevent us from adding this in a Post-MVP? |
I think it's safe to assume that if a producer wants Binaryen to perform type minimization, then it does not depend on distinguishing the identities of structurally identical types. Otherwise, as you say, there isn't really much that Binaryen can do. So the flag controlling this behavior in Binaryen will be a signal from the producer that it does not need this distinction, which then enables Binaryen to perform the minimization / canonicalization. If the producer needs to distinguish between certain types, it will be its responsibility to create recursion groups that uphold that distinction. |
To make sure I understand, is it correct that we are expecting two workflows here? Like this:
And optimizers like Binaryen will need a flag to distinguish between these two cases. An open question is what the default should be for such a flag. Does all that make sense? Or is there a third use case, or is something wrong in one of the use cases? |
There is a third use case, the first in #243 (comment): the producer expects to link its Wasm modules, but it will always optimize its modules with Binaryen (or a similar tool, but crucially the same one for all modules). Here, the optimizer is free to change the recursion groups as long as it does so in a deterministic manner (e.g. equi-recursive canonicalization) to make the groups consistent across the modules.
@askeksa-google I see, yes, I guess as long as the optimizations are "deterministic" (do not change depending on the code in each module) then that could work. But how much benefit do you expect there? My intuition is that the large improvements come from optimizations that do depend on the code, like removing unused types and fields, making fields immutable, etc. |
I measured the type validation and canonicalization time for a real-world j2wasm module across three type systems, both before and after applying Binaryen optimizations at -O2. For each type system, I measured the total validation and canonicalization time as well as times for two separate components of the work. The first component involves validating the new heap type definitions and deduplicating their definitions and the second component involves moving the new heap types and copying the new value types to the global type stores. The implementation of the second part is the same for all three systems, but there's probably some room to optimize it for the nominal and isorecursive systems. The modules have single large recursion groups in the isorecursive case. I am also interested in measuring the case of minimal recursion groups, but don't have the modules in that format yet. The unoptimized module has 88442 type definitions and is about 93 MB, including a name section. Validating and building these types takes:
The optimized module has 24843 type definitions and is about 10 MB without a name section. Validating and building these types takes:
My takeaway is that the extra overhead of canonicalizing isorecursive types over nominal types is probably not an issue, especially considering that the canonicalization algorithm is linear. |
I also measured an upper bound on the code size cost of the additional constraints on the order of types in the type section. In nominal and equirecursive modes, Binaryen orders types by decreasing use count so that frequently used types have smaller LEB-encoded indices. I measured the same optimized module from the previous comment with the optimal default ordering, an ordering that respects the isorecursive constraints on the ordering of subtypes but otherwise tries to be optimal, and the reverse of the optimal ordering. Again, the isorecursive case uses a single large recursion group, so this is the least constrained case for the isorecursive system, but even in the worst case the isorecursive system can be no worse than the pessimal ordering. Optimal nominal ordering (baseline):
Isorecursive ordering:
Pessimal nominal ordering:
So I don't think binary size will be a problem for the isorecursive type system either. |
@kripken I don't see optimization as the main goal in this scenario. Rather, it allows a producer to not perform any type canonicalization itself, leaving it all to e.g. Binaryen, while still being able to link its modules at runtime. |
@askeksa-google Ok, that's where we were seeing things differently then. I was only thinking about optimizations, where I think a single flag is enough. Yes, it makes sense that if there is other non-optimization work we should do in Binaryen, then that could be handled by additional flags as needed.
@tlively, thanks for these numbers. That's encouraging! FWIW, if necessary, I wouldn't mind imposing a different constraint to ensure acyclicity in the subtype relation. The ordering restriction merely seemed the simplest and cheapest to check. |
proposals/gc/MVP.md
Outdated
* `deftype` is a new category of types that generalises the existing type definitions in the type section
- `deftype ::= <functype> | <structtype> | <arraytype>`
* `deftype` is the syntax for an entry in the type section, generalising the existing syntax
- `deftype ::= <subtype> | rec <subtype>*` |
Do I understand correctly that this would be simplified if we made singleton type definitions equivalent to rec groups of size 1? @rossberg, what do you think about making that change in this PR to simplify the semantics and hopefully make them more readable?
Done: preexisting "regular" type definitions are now reinterpreted as syntactic shorthands for singleton recursion groups.
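In other words (a sketch of the shorthand as I understand it), the following two definitions are interchangeable and canonicalise to the same type:

```wat
;; Plain form, now read as sugar for a recursion group of size 1:
(type $f1 (func (param i32) (result i32)))

;; Explicit singleton recursion group:
(rec (type $f2 (func (param i32) (result i32))))
```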
proposals/gc/MVP.md
Outdated
* `subtype` is a new category of type defining a single type, as a subtype of possibly other types
- `subtype ::= sub <typeidx>* <strtype>`
- the preexisting syntax with no `sub` clause is redefined to be a shorthand for a `sub` clause with empty `typeidx` list: `<strtype> == sub () <strtype>`
- Note: This allows multiple supertypes. For the MVP, it could be restricted to at most one supertype. |
Let's change the language from "could be restricted" to "is restricted."
Done.
proposals/gc/MVP.md
Outdated
Note 2: This semantics implies that type equivalence checks can be implemented in constant time by representing all types as trees in tied form and canonicalising them bottom-up in linear time upfront.
Note 3: It's worth noting that the only observable difference to a nominal type system is the equivalence rule on (non-recursive) type indices: instead of looking at their definitions, a nominal system would require `$t = $t'` syntactically (at least as long as we ignore things like checking imports, where type indices become meaningless). |
Other differences are the presence of recursion groups and the restrictions on which type indices can be used in the definition of a type. Maybe it would be more useful to note here that having a single large recursion group essentially gives you a nominal system?
The comment was referring to the difference in the rules. I reworded it to make that more explicit, and also added a sentence describing the practical consequence you mention.
proposals/gc/MVP.md
Outdated
- if `$t = <structtype>` or `$t = <arraytype>`
- or `$t = type ht` and `rt <: data` (imports)
- `(type $t) <: func`
- or `$t = type ht` and `ht <: data` (imports) |
Are these lines referring to type imports? I know they were already in here, but let's remove them since we don't plan to have type imports in the MVP.
Right, removed dependence on type imports.
LGTM!
proposals/gc/MVP.md
Outdated
Note: RTT values correspond to type descriptors or "shape" objects as they exist in various engines. RTT equality can be implemented as a single pointer test by memoising RTT values. More interestingly, runtime casts along the hierachy encoded in these values can be implemented in an engine efficiently by using well-known techniques such as including a vector of its (direct and indirect) super-RTTs in each RTT value (with itself as the last entry). The value `<n>` then denotes the length of this vector. A subtype check between two RTT values can be implemented as follows using such a representation. Assume RTT value v1 has static type `(rtt n1? $t1)` and v2 has type `(rtt n2? $t2)`. To check whether v1 denotes a sub-RTT of v2, first verify that `n1 >= n2` -- if both `n1` and `n2` are known statically, this can be performed at compile time; if either is not statically known, it has to be read from the respective RTT value dynamically, and `n1 >= n2` becomes a dynamic check. Then compare v2 to the n2-th entry in v1's supertype vector. If they are equal, v1 is a sub-RTT.
Note: RTT values correspond to type descriptors or "shape" objects as they exist in various engines. RTT equality can be implemented as a single pointer test by memoising RTT values. More interestingly, runtime casts along the hierachy encoded in these values can be implemented in an engine efficiently by using well-known techniques such as including a vector of its (direct and indirect) super-RTTs in each RTT value (with itself as the last entry). A subtype check between two RTT values can be implemented as follows using such a representation. Assume RTT value v1 has static type `(rtt $t1)` and v2 has type `(rtt $t2)`. Let `n1` and `n2` be the lenghts of the respective supertype vectors. To check whether v1 denotes a subtype RTT of v2, first verify that `n1 >= n2` -- if both `n1` and `n2` are known statically, this can be performed at compile time; if either is not statically known, it has to be read from the respective RTT value dynamically, and `n1 >= n2` becomes a dynamic check. Then compare v2 to the n2-th entry in v1's supertype vector. If they are equal, v1 is a subtype RTT. |
A couple spelling fixes and remove the mention of n1 and n2 not being statically known.
Note: RTT values correspond to type descriptors or "shape" objects as they exist in various engines. RTT equality can be implemented as a single pointer test by memoising RTT values. More interestingly, runtime casts along the hierachy encoded in these values can be implemented in an engine efficiently by using well-known techniques such as including a vector of its (direct and indirect) super-RTTs in each RTT value (with itself as the last entry). A subtype check between two RTT values can be implemented as follows using such a representation. Assume RTT value v1 has static type `(rtt $t1)` and v2 has type `(rtt $t2)`. Let `n1` and `n2` be the lenghts of the respective supertype vectors. To check whether v1 denotes a subtype RTT of v2, first verify that `n1 >= n2` -- if both `n1` and `n2` are known statically, this can be performed at compile time; if either is not statically known, it has to be read from the respective RTT value dynamically, and `n1 >= n2` becomes a dynamic check. Then compare v2 to the n2-th entry in v1's supertype vector. If they are equal, v1 is a subtype RTT.
Note: RTT values correspond to type descriptors or "shape" objects as they exist in various engines. RTT equality can be implemented as a single pointer test by memoising RTT values. More interestingly, runtime casts along the hierarchy encoded in these values can be implemented in an engine efficiently by using well-known techniques such as including a vector of its (direct and indirect) super-RTTs in each RTT value (with itself as the last entry). A subtype check between two RTT values can be implemented as follows using such a representation. Assume RTT value v1 has static type `(rtt $t1)` and v2 has type `(rtt $t2)`. Let `n1` and `n2` be the lengths of the respective supertype vectors. To check whether v1 denotes a subtype RTT of v2, first verify that `n1 >= n2` -- both `n1` and `n2` are known statically, so this can be performed at compile time. Then compare v2 to the n2-th entry in v1's supertype vector. If they are equal, v1 is a subtype RTT. |
Ah, but for casts the type `$t1` itself usually is not statically known (only a supertype of it), so neither is `n1`. I pushed a tweak that points that out.
This is a detailed definition of the iso-recursive type semantics with explicit subtyping that I proposed in the recent meeting. I tried to keep the type representation as "natural" as possible, e.g. avoiding explicit type variables or substitutions except as auxiliary notions in the definition of type equivalence. Please let me know if something is unclear (or wrong).
I intend to implement this in the interpreter next.