Skip to content

Should declaration variance support an explicit "covariant" marker? #2078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
leafpetersen opened this issue Jan 27, 2022 · 11 comments
Open
Labels
variance Issues concerned with explicit variance

Comments

@leafpetersen
Copy link
Member

We have discussed adding declaration site variance as a non-breaking feature in which type parameters to classes may be marked as out (statically checked covariant), in (statically checked contravariant), or inout (statically checked invariant). In order to make the change non-breaking, unmarked type variables would continue to be treated as runtime checked covariant. This issue is to discuss whether or not we should also add an explicit marker for runtime checked covariant classes.

Syntax

An initial proposed syntax for the marker is covariant to match the existing covariant marker on term level parameters. Example:

class C<covariant T, inout S> {
  S f1(S x) => x; // Valid because S is invariant
  T f2(T x) => x; // Valid because T is covariant
}

Semantics

A type parameter marked with this modifier would be treated as runtime checked covariant exactly as if no modifier were on the type parameter. That is, for the language, class A<T> ... and class A<covariant T> ... would be treated equivalently. Example:

class A<covariant T, S> {
  S f1(S x) => x; // Valid because S is runtime checked covariant, runtime check inserted on argument
  T f2(T x) => x; // Valid because T is runtime checked covariant, runtime check inserted on argument
}

Motivation

There are several motivations behind this proposal.

The first is that some set of our users will very likely prefer to always use statically checked variance, and will prefer that any runtime checked classes be marked explicitly covariant. This enables such marking, with a lint to prevent unmarked type parameters from being introduced.

The second is that it gives us a potential path forward towards changing the default behavior. If we eventual reach a point where the majority of code uses explicit covariant, we could then choose to make inout or out the default in a language versioned breaking change.

Discussion

Since this has no semantic significance in the proposal, the same effect in the short term could be achieved with an annotation associated with a lint. This doesn't provide any organic migration path however.

cc @eernstg @lrhn @munificent @natebosch @jakemac53 @stereotype441

@leafpetersen leafpetersen added the variance Issues concerned with explicit variance label Jan 27, 2022
@lrhn
Copy link
Member

lrhn commented Jan 27, 2022

If we allow, but do not force, the use of covariant to mean the same thing as no modifier, then I expect a lot of code to not write it. I wouldn't, it's incredibly verbose. (Having to write in/out/inout on every type parameter is already more verbose than what we have now, and I do worry about how it affects readability, especially since I can personally never remember which is which.)

If we make a modifier required in the same language version that introduces the feature, then it does require a migration of all files, but it's completely automatable (just add covariant in front of every type parameter). It would then allow us to sooner allow a non-modifier type parameter to mean something else (say inout).

I do appreciate using covariant for the unsound covariance. It's already known for "unsound subtyping".
(Is there any real difference between a covariantly unsound type parameter and an out type parameter where every contravariant occurrence is explicitly marked covariant? Except that the latter only works for contravariant occurrences in parameters, not return types.)

@eernstg
Copy link
Member

eernstg commented Jan 27, 2022

I think this covariant modifier on a type parameter is nicely consistent with the existing use on parameters, and it would be good if we could give developers a convenient software evolution strategy that leads to the elimination of dynamic type checks. If we get to a point where every class type parameter has a modifier then we can freely change the meaning of a modifier-less type parameter.

However, it might be useful to consider an alternative strategy: We could aim for the unannotated type parameter to mean 'soundly covariant' at some point in the future, and then we could allow developers to prepare for the switch by adding the modifier covariant to each value parameter of a method of the class that has a declared type where subsumption doesn't imply that the parameter type is a supertype of the statically known type (that is, on every parameter whose type requires a dynamic check). So we'd have the following:

// Current style, using a dynamically checked covariant `T`.
class C<T> { T f(T x) => x; }

// Proposed migration, using a `covariant` type parameter.
class C<covariant T> { T f(T x) => x; }

// Alternative migration, using `covariant` on value parameters.
class C<T> { T f(covariant T x) => x; }

So we would gradually migrate the code, ending up with a form where we could switch the meaning of the plain type parameter from being 'dynamically checked covariant' to 'statically checked covariant'. We would then no longer have the notion of a dynamically checked covariant type parameter at all, every type parameter would be statically checked.

However, we would allow for the pre-sound-variance semantics by means of the covariant modifier on each value parameter that needs it. Of course, this is the reason why T as a parameter type in the example will not be an error after the switch.

The migration is slightly lossy, because a parameter which is covariant-by-declaration can be overridden by a parameter whose type is a subtype (so T f(covariant T) in C<T> may be overridden by num f(covariant int) in D extends C<num> {}, and the current treatment of dynamically checked type parameters admits only num f(num)). This means that it is no longer safe to omit the dynamic type check on the parameter in certain situations (in particular, when the receiver is this, and the actual argument has static type T). I don't know whether that optimization is actually performed today. However, any optimizations that allow for omitting the dynamic type check on the actual argument because the receiver type is statically known will still be valid (and, of course, the method implementation could still be inlined, etc.).

So what's the trade-off, how does the alternative strategy compare to the proposal using covariant T?

  • We would not aim for a situation where every class type parameter has a modifier, we would aim for a situation where every modifier-less type parameter X can be reinterpreted to mean out X.
  • In this state, where we have marked just the value parameters that need it as covariant, the dynamic type checks will be revealed in the method declarations more explicitly than they would if we maintain the notion of a dynamically checked covariant type parameter.
  • On the other hand, the use of covariant on parameters is a bit more permissive than the current treatment of dynamically checked covariant type parameters, and we'd need to evaluate carefully whether or not that matters.

@mateusfccp
Copy link
Contributor

What are the reasons to not use statically checked variance? I mean, I understand that we want the feature to be non-breaking, so by not marking the type parameter we have the exact same behavior as we already have today.

However, I think most people prefer to have statically checked than dynamically checked variance, which will lead to the following:

  • Everyone simply mark their type parameters with out in almost all type parameters of their code base;
  • Most people won't use the covariant marking anyway, as it is already the default behavior (maybe we should encourage them with a lint?);
  • In a later version of the language, we change the unmarked type parameters to mean out, as we think that the ones who want dynamically checked variance will be using covariant;
  • When this happens, most people will have to either remove their out markings (or not remove and have them redundantly) or have a breaking change as they actually didn't mark anything with covariant.

@natebosch
Copy link
Member

I would prefer to not force authors to make a choice - either make covariant required (and add it automatically with dart migrate) or not allow it at all.

I do think it's better if code that will cause runtime checks has a positive indication in the source, rather than the unsafety being indicated by the absence of a marker. I don't have a strong opinion on whether the marker should appear on the declaration of the generic type or on parameters.

@munificent
Copy link
Member

In order to make the change non-breaking, unmarked type variables would continue to be treated as runtime checked covariant.

Do we have any data on how breaking this would be if we were to instead treat unmarked type variables as meaning either invariant or sound covariant? It feels unfortunate to me to give the best syntax (no modifier) to the worst semantics (both slow and prone to runtime exceptions).

I understand that we may have to do that because of the sins of our past, but I'd like to know that we need to before we do that. The one data point I have is that for several years in C#, all generic type parameters were invariant and I can rarely recall it causing any compile errors even when I wrote many generic classes without actually understanding variance at all.

Especially now that we have language versioning, I could potentially see us:

  • In a future language version, treat no modifier to mean either invariance or sound covariance. (I'd like us to gather some data on which is the most useful default.)
  • Use covariant to mean the current unsound covariance.
  • Ship a dart fix that adds covariant to unmarked type parameters in legacy code.

I think that aligns with our general trend towards users having to syntactically opt in to behavior that is slower or may fail at runtime—null safety, dynamic calls, etc.

@Levi-Lesches
Copy link

It seems a big driver of this is deciding which option would be the best default. My input as a Dart user would be to prefer making that the default on the first release, instead of introducing a backward-compatible version first and changing the default later. Even if the first release is completely non-breaking, it still comes with a mental workload on developers, who have to learn the new mechanics, verify that their codebase still works, and then find new places to take advantage of it. Changing the default later would mean going over their codebase and reworking their mental models a second time.

I get that avoiding breaking changes is a positive goal, but IMO the point of these compiler warnings/errors is to notify the user "hey, something changed and now your code has these issues here, here, and there. Let me show you how to fix them". Splitting what is fundamentally a breaking overhaul of the mechanics into two mostly-non-breaking releases forces users to consider the implications twice, make changes the first time that will be overridden by the second, and get minimal guidance from the tooling in the process.

As a side note, I also agree that any unsafe behavior involving runtime checks should always be denoted by an explicit modifier. I'm less qualified to say what the default, non-modified behavior should be, but I would support the safest one that's most broadly applicable, like how Dart sometimes assumes Object where a more specific type could fit (would that be inout?). If the user wants a more specific and restrictive type, they'd have to specify that themselves. That would also serve as a good visual indicator to the reader that this use-case isn't like all the others while letting most cases be nameless.

@leafpetersen
Copy link
Member Author

Do we have any data on how breaking this would be if we were to instead treat unmarked type variables as meaning either invariant or sound covariant? It feels unfortunate to me to give the best syntax (no modifier) to the worst semantics (both slow and prone to runtime exceptions).

I tested this back when we were designing strong mode, and it blew up the world. Code is more tightly typed now, so it's remotely possible that we're in a better spot, but I'm highly skeptical. List and Iterable would both have to be made invariant, and the latter in particular is rough. We could fix the Iterable issue by adding upper bounds on type parameters (and maybe we should do this anyway), but even List is going to be a pretty heavy lift, I think. But yeah, maybe I'll try to carve out some time to test some variants of this. I'm really not optimistic though.

Note that this is also not something you can purely find via static analysis: you can have code which is statically clean, but blows up at runtime because of a cast which used to work via covariance and now fails.

Especially now that we have language versioning, I could potentially see us:

  • In a future language version, treat no modifier to mean either invariance or sound covariance. (I'd like us to gather some data on which is the most useful default.)
  • Use covariant to mean the current unsound covariance.
  • Ship a dart fix that adds covariant to unmarked type parameters in legacy code.

This is basically the path I'm trying to map out with what I'm proposing in this issue: the difference is that I'm proposing an interim state where we allow either explicitly or implicitly marked runtime checked covariance, and try to get people to incrementally migrate their code by shipping lints. The alternative, as you note, is to ship it via a language version and just do the migration. Which... maybe.

@munificent
Copy link
Member

Do we have any data on how breaking this would be if we were to instead treat unmarked type variables as meaning either invariant or sound covariant? It feels unfortunate to me to give the best syntax (no modifier) to the worst semantics (both slow and prone to runtime exceptions).

I tested this back when we were designing strong mode, and it blew up the world. Code is more tightly typed now, so it's remotely possible that we're in a better spot, but I'm highly skeptical. List and Iterable would both have to be made invariant, and the latter in particular is rough. We could fix the Iterable issue by adding upper bounds on type parameters (and maybe we should do this anyway), but even List is going to be a pretty heavy lift, I think. But yeah, maybe I'll try to carve out some time to test some variants of this. I'm really not optimistic though.

Ah, good point. I think if we made the default invariant, then at the same time, we would have to mark Iterable and List as covariant for (unfortunate) backwards compatibility. We can do that atomically at the same time as we flip the default since we own those types. It would, of course, be great to make Iterable soundly covariant, but my understanding is that we can't because of... uh... fold() and reduce()?

I was more wondering about how breaking it would be if we made unannotated generics default to invariant in user types we don't control.

Note that this is also not something you can purely find via static analysis: you can have code which is statically clean, but blows up at runtime because of a cast which used to work via covariance and now fails.

Yeah. :( All the more reason to try to move the Dart ecosystem to a place where that is no longer true!

This is basically the path I'm trying to map out with what I'm proposing in this issue: the difference is that I'm proposing an interim state where we allow either explicitly or implicitly marked runtime checked covariance, and try to get people to incrementally migrate their code by shipping lints. The alternative, as you note, is to ship it via a language version and just do the migration. Which... maybe.

Got it, thanks. Yeah, I could see us doing a two-step migration but, like @Levi-Lesches says, that can end up being even more annoying for users. We have language versioning, so I wonder if we can simply treat step one as "stay on the old language version".

@lrhn
Copy link
Member

lrhn commented Feb 1, 2022

Iterable should be the poster-child of sound covariance. It's not, and that's sad, and it's because of reduce (fold is actually fine).
If we removed reduce and fold from Iterable, and made them extension methods instead, then that might make a lot of things easier. (And if implementations still had the methods, some dynamic invocations might also succeed.)
That, and sort, shuffle and the indexOfs from List. One can dream!

On the other hand, it's not clear how variance and extension methods work together. It's probably just as impossible to make an unsound covariant extension method on a soundly covariant interface, as it is to do it directly on the interface.

(While I'm very much attracted to use-site variance because it allows you to say that your use of, say, a List is actually covariant, it probably won't allow us to optimize much. If optimization is the goal, declaration site variance is stronger, just so much stronger that it's not actually viable for a lot of existing classes.)

@leafpetersen
Copy link
Member Author

Iterable should be the poster-child of sound covariance. It's not, and that's sad, and it's because of reduce (fold is actually fine). If we removed reduce and fold from Iterable, and made them extension methods instead, then that might make a lot of things easier. (And if implementations still had the methods, some dynamic invocations might also succeed.)

Another option is to add lower bounds to generic type parameters, though this would be breaking.

@lrhn
Copy link
Member

lrhn commented Feb 2, 2022

With lower bounds, the signature of reduce would be R reduce<R super E>(R Function(R, R) combine);.
That does appear to be sound (at least I can't easily break it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
variance Issues concerned with explicit variance
Projects
None yet
Development

No branches or pull requests

7 participants