-
Notifications
You must be signed in to change notification settings - Fork 74
Isorecursion seems too weak #217
Comments
Once the slides from this morning's meeting are posted, it would be great if you could elaborate on how the system as presented fails in these cases. I tried to reason through it myself, but my memory/understanding of the presented subtyping details is too spotty :) |
As far as I can see, these examples are expressible just fine, provided we support nested recursive types as I alluded to (I didn't present the full Wasm syntax/rules for that, but you can extrapolate from the standard rules for iso-recursive types). All you have to do is wrap classes into another layer of type recursion:
We can generalise this pattern to an extreme where we simply wrap the entire program's type section into one big type recursion. That would emulate the global recursive namespace that implicitly exists for nominal types -- the names correspond to the distinct paths inside the root type. And as with nominal types, this global approach will break down at module/linking boundaries. Because in a modular system, "global" can only be "global wrt to a single module". There is no global name space beyond that. You'd need cyclic linking to allow name-based recursion to extend beyond module boundaries (which in turn requires a vastly different and more complex semantics for type recursion). We can only have our cake and eat it too with equi-recursive types, because with those, each module in a recursive group can freely duplicate the types from other modules in its group, and linking will equate them. With iso or nominal, you'd have to break the recursion somehow, using anyref plus casts as elsewhere. |
I'll start with the easiest one. According to the subtyping slide of the presentation (link to slides for easy reference), For my second example, I made a mistake as I had forgotten that the subtyping rule allowed prefixing. This mistake can be rectified either by making Besides these examples, there are some larger issues I should point out. One is that the rules provided are not reflexive. Consider the following:
According to the rules given, this subtyping holds only if This can be fixed by adding a reflexivity rule, but it is also unknown if the given rules are transitive. Sure, that could be fixed by adding a transitivity rule, but at present it is unknown if this transitive closure is decidable. This paper claimed to have developed such an algorithmic formulation, but subsequent attempts to mechanize the proof of transitivity failed—this paper had to restrict isorecursive types to covariant fixpoints, thereby eliminating the example above entirely from the type system. Note that my example above would arise regularly in OO programs compiling to wasm due to the These problems are well-known in the subtyping community. However, they are not as well known in the ML module community where the issues do not arise due to the focus on type equality rather than subtyping (as ML has no type-level subtyping). These issues do not, however, arise with nominal types. The handwavy comparisons @rossberg makes seem to be due to related misunderstandings of subtyping systems. I might be wrong, but he has been asked on numerous occasions over multiple years, both in public and in private, to give concrete examples that would illustrate his concerns, and he has not. As a reminder to the group, at the in-person meeting he claimed that nominal types would require substantially more imports than structural types, but in #128 the problem was examined in more depth and the conclusion was that the two approaches would need roughly the same number of imports. Similarly, he has claimed that structural types would be necessary to support simple dynamic linking models, but in #156 a very simple approach to support such models with nominal types was offered and seemingly supported by many people. And in another presentation he claimed that C# would need host-provided type canonicalization and consequently structural types, but in #215 I (who, unlike him, have actually implemented a C# runtime, and on a typed backend no less) explained why C# would not need or even be helped by such a feature and in fact would find it a hindrance to representing its own internal data structures. It is demeaning, demoralizing, and counterproductive to have the champion act as if such discussions never took place and continue to spread unsubstantiated claims. Since I do not know who has observed these conversations or the relevant corrections to his presentations, it puts me in a position where every time he makes such claims in new places (such as in this thread) I feel I need to inform people about the actual status quo (as I am doing now) to provide appropriate context and prevent further spread of misinformation, which distracts from the topic at hand. If the champion still believes such claims, I would ask that he would back them up in the many locations that explicitly requested such evidence and insight from him. But this thread is about the limitations of isorecursive subtyping, not (perceived) limitations of nominal subtyping, and as such I would ask @rossberg to refrain from making negative comments about nominal types so that we can stay on topic. Moving on. |
So working through the first example:
We can simplify @rossberg's lowering to ignore vtables and
Now we can check that
This almost seems to work given the extension to nested recursion (which we should certainly formalize somewhere to make sure we are all talking about the same thing) and the addition of a reflexivity rule. The only issue is that it looks like we have to prove @RossTate, it does look like allowing nested recursive types resolves the problem you spotted with the indices needing to match, but may not resolve the problem above. Does that agree with your understanding? I was curious about the decidability and looked at the second paper you linked to above, @RossTate. Here's the relevant section from page 26:
I don't have the full context here, so I can't really judge for myself how relevant this is to our current situation. Are we affected by the mentioned "spurious restrictions on the usage of type variables" given that we don't have type variables in the language? Or is that talking about type metavariables in the rules? Would the proof that "only works for 'executable' environments" work for us? |
Thank you very much, @tlively, for the insightful post and for pushing the discussion forward! So the issue with handwaving extensions is that 1) the ambiguity can lead to different people inferring different extensions, and 2) we have no understanding of the expressiveness and decidability properties of the extensions. In this case, you and I have inferred different extensions with different decidability properties. I'll elaborate below, using your simplified example (repeated here so that people don't have to scroll up and down):
The extension I inferred is for a subtyping like
With my inferred extension, the subtyping Now your inferred extension is perfectly reasonable too (as there's an easy way to fix the digression problem—the presentation was a little imprecise with its formulation). However, it seems to have very different computational properties. To see why, observe that your definitions for
Consider then, why have the layer of indirection and what is it doing? It's definitely doing something since eliminating it results in the following, for which the subtyping breaks:
Well, let's consider another minimized example:
Now, by the same reasoning you gave, you can prove that I don't know which of the two inferred interpretations @rossberg intended, or if there was another neither of us came up with, but it seems either the extension is insufficiently expressive or it is too expressive to preserve the cheap canonicalization property of isorecursive types. In fact, it seems like any extension that would make it possible for one index to be a subtype of another would also make it possible for the two indices to be equivalent and consequently would seem to break the cheap canocialization property that has led us to consider isorecursive types in the first place. |
Okay, let's flesh this out. The following rules for (higher-kinded) subtyping on type tuples and iso-recursive types are fairly standard:
To deal with tuple-kinded iso-recursion, you will want one more rule, something like the following:
So recursive types are not entirely rigid at tuple kind, we allow beta reduction of projections across μ. This is fairly weak but still inductive. (A is relating not just variables but paths now.) Turning back to the example, we have to be a bit more precise about what are type definitions and what are internal variables. So let me enrich the notation and make the rec variables explicit (this wouldn't need to be explicit in the text or binary format, but the semantics needs to keep the distinction straight):
Now we can derive
There is no loop here, since the latter goal is on mere paths, not the original projections from rec types (also, A has been extended).
No. Neither of these transformations is valid with iso-recursive types. For starters, they wouldn't even preserve kinds. However, it is true that the inner recursions aren't used, so one could simplify inner rec's to plain type tuples.
They're equivalent only if you add an antisymmetry rule as an axiom. You are right, though, that transitivity is not immediately obvious, so will need some checking. However, I don't think that the papers you pointed to are a strong signal to the contrary, since they define waaaaay richer type systems, with full-blown dependent types, unions, intersections, and refinements, which make transitivity difficult to establish for plenty of other reasons. That all being said, I want to stress the other observation I made above: in general, we won't be able to compile mutually recursive classes to Wasm without breaking up the inter-class recursion (except with equi-recursive types ;) ). So these examples are kind of useless as a general compilation scheme anyway. Given that, I am not convinced it's even worth supporting nested recursion/tuples like this. |
Yeah, this is the extension that I showed @tlively's interpretation effectively enabled. But, as I pointed out, it completely changes the metatheoretic properties of the system. For example, even ignoring reflexivity/transitivity issues, this extension changes the complexity from linear time to quadratic time. But the following is probably the more important impact of this extension.
You seem to be suggesting that, in addition to subtype equivalence (i.e. the two types are subtypes of each other), there would be another equivalence relation. Subtype equivalence is what matters for type-checking, so what would be the purpose of this other equivalence relation? From context, it sounds like this other equivalence relation is meant to indicate when
There are other efforts that simply failed (and hence are not published). This paper illustrates that the other complexities were fairly inconsequential—just the isorecursion was problematic (hence the problem was solved by solely restricting isorecursion).
You are repeating the same unsubstantiated claim you made above even after I explained why this was problematic and specifically requested you not to do so. Your winky face suggests you are aware that this is inappropriate but somehow think this is cute—you did the same thing in this inappropriate comment as well. Please stop this belittling and counterproductive behavior.
This seems dismissive. As I've pointed out before, |
Thanks for fleshing those rules out, @rossberg! For the
@RossTate, can you explain how this rule changes the complexity to quadratic? Also, to clarify, is this the complexity to check the subtype relation between two types?
I'm not sure this is what @rossberg is suggesting. @RossTate, can we roll back this line of conversation a bit and can you re-explain the problem of loss of simple canonicalization in light of the newly fleshed-out subtyping rules? I also agree that this thread is not the place to discuss |
Thanks, @tlively, for helping so much with the discussion!
On my end, I see a mistake I made. In my head, I automatically "fixed" the To see why, consider that it's requirement is to prove The fix to this is to permit Recall that equirecursive subtyping is also quadratic. That's no coincidence—you can encode equirecursive subtyping in this extended version of isorecursive subtyping. Each type component essentially corresponds to a state of the automata representation of an equirecursive type, and the collection of projection-pairs that end up getting assumed forms the bisimulation relation between these states. So we're back to where we came from. The question, then, is what is the impact of this on canonicalization?
So However, once we add rules that can make projections of different indices subtypes, then there is no such syntax because distinct projections can be equivalent. For example, |
@RossTate, I don't quite understand your argument for why |
Sure thing. The following is a variant on my first example (whose primary challenge is relating types within the same recursion grape):
If you consider why this type-checks nominally, the fact that That's the intuition. I'll reformulate the example with the notation we've been using above:
Now for the subtyping:
Does that clarify, @tlively? |
Fun fact: I tested the scenario in Java. It compiles and works as expected!
Francis
…On Mon, May 24, 2021 at 11:20 AM Ross Tate ***@***.***> wrote:
Sure thing. The following is a variant on my first example (whose primary
challenge is relating types within the same recursion grape):
class A { B b; A foo(); }
class B extends A { override C foo(); } // inherits field B b
class C extends A { override B foo(); } // inherits field B b
If you consider why this type-checks nominally, the fact that B can
extend A depends on the fact that C (not B this time) is a subtype of A.
So the projection-pair subtyping for B <: A depends on the
projection-pair subtyping C <: A. But rec-proj only allows
projection-pair subtypings to depend on themselves. (For fun, in this
example I also made the projection-pair subtyping for C <: A depend on B
<: A so that we still have a cycle of projection-pair subtyping
dependencies.)
That's the intuition. I'll reformulate the example with the notation we've
been using above:
type $ABC = rec $ABCself
$A = struct (mut (ref $ABCself.$B)) (ref (func [] -> [ref $ABCself.$A]))
$B = struct (mut (ref $ABCself.$B)) (ref (func [] -> [ref $ABCself.$C]))
$C = struct (mut (ref $ABCself.$B)) (ref (func [] -> [ref $ABCself.$B]))
Now for the subtyping:
⊢ $ABC.$B <: $ABC.$A
-------------------------- [rec-proj] [struct] ([refl] | [func])
$ABCself.$B <: $ABCself.$A ⊢ (ref $ABCself.$C) <: (ref $ABCself.$A)
-------------------------- [ref]
$ABCself.$B <: $ABCself.$A ⊢ $ABCself.$C <: $ABCself.$A
><
failure - mismatch between assumed subtyping and goal subtyping
Does that clarify, @tlively <https://github.com/tlively>?
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#217 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAQAXUAPSPXVWCOIHE6MNCLTPKKFNANCNFSM45DALP4A>
.
--
Francis McCabe
SWE
|
@tlively, yes, the rules are too weak for that example. So, let me try to summarise what I believe we now all agree on:
So the concern boils down to how relevant we consider limitations (3) and (4). Personally, I don't think they would be deal breakers. Again, insert my standard reminder that there will always be things that our type systems won't be able to express and where we will have to fall back to casts. And regarding this particular case, cycles of this sort are not super common, and even less so the forward edges in them, so casts to deal with forward references would not seem very likely to have relevant performance impact on real code. (To Take Ross' example of Object.toString, the overhead of executing that method will surely dominate any down cast to String in the end. If not, we'd have much more serious problems elsewhere.) Though of course this should be confirmed through some experimentation. What I suspect matters more in practical terms is whether extra handling of forward references complicate producers significantly. The answer to that is less obvious. One experiment I plan to do is to prototype iso-recursive types in the reference interpreter and in the Wob compiler, to get a better insight into the impact they'd have. |
This sounds like it could be a deal breaker, not in terms of performance necessarily, but in terms of how easy it is to target the proposal. For compiler authors, it doesn't matter how rare these patterns are in practice; as long as the patterns exist, they will have to put in the extra bookkeeping work to support them. We are competing against JS as a compilation target and if Wasm GC is significantly harder to target than JS, that significantly raises the bar for how much of an improvement it needs to be in other areas.
Yes, I think we all agree on (4). (5) seems like a separable discussion, so I filed #220 to make sure we're all on the same page about that. |
Actually, compiling to JS is not the only competitor. Another is 'roll-your-own' GC on linear memory. |
This discussion has been extremely useful and the problems raised here have been addressed in #243. |
Here are two class hierarchies that isorecursion (as just presented) seems to not support (but which nominal types support):
The text was updated successfully, but these errors were encountered: