I wish type inference would not resolve to Null #34058

Open

matanlurey opened this issue Aug 1, 2018 · 18 comments
Labels
area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). strict-analysis-needed Mark issues where strict-raw-types, strict-inference and similar analysis options were helpful

Comments

@matanlurey
Contributor

... best explained through a simple example (real issue: angulardart/angular#1533):

void main() {
  var comp1 = Comp1();
  var comp2 = Comp2();
  comp2.registerEvent(eventHandler1(comp1.onEventInt));
}

class Comp1 {
  void onEventInt(int event) {}
}

class Comp2 {
  void registerEvent(void Function(String) listener) {
    listener('Hello');
  }
}

void Function(E) eventHandler1<E, F extends E>(void Function(F) handler) {
  print('Function($E) for eventHandler1<$E, $F>(void Function($F))');
  return (E event) {
    handler(event as F);
  };
}

See the issue? If you run this program, you get the following:

Function(String) for eventHandler1<String, Null>(void Function(Null))
Uncaught exception:
CastError: "Hello": type 'JSString' is not a subtype of type 'Null'

It looks like type inference realizes (correctly) that int is not a subtype of String, and recovers by falling back to bottom (i.e. Null). While this is correct, the user experience is significantly degraded: something that could have been a compile-time error is now a (very confusing) runtime one.
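For readers wondering why Null specifically is the fallback: function parameter types are contravariant, so the solver needs an F with both F <: int (from the handler's parameter) and F <: String (from the bound F extends E with E = String), and before null safety Null is the only such type. A minimal standalone sketch of parameter contravariance (the names here are mine, not from the issue):

```dart
// A function that accepts a broader parameter type can stand in where a
// narrower one is expected: void Function(num) <: void Function(int).
void takesNum(num n) {}

void main() {
  void Function(int) f = takesNum; // OK: parameters are contravariant
  f(3);
  print('contravariance demonstrated');
}
```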

We slightly improved the runtime error by adding an assert:

assert(
    E == Null || F != Null,
    "Event handler '$handler' isn't assignable to expected type "
    "'($E) => void'");

... of course this is a hack, and I'm not sure it is technically always correct.

I wish we could either:

  • Ban inferring Null, i.e. require the user to type <Null>.
  • Find a replacement pattern for the above use case that has correct static type checking
@matanlurey
Contributor Author

/cc @MichaelRFairhurst who has been looking at lints preventing explicit use of Null.

@MichaelRFairhurst
Contributor

Ah!

Interestingly, I think there is a better solution:

void Function(E) eventHandler1<E, F extends void Function(E)>(F handler) {
  print('Function($E) for eventHandler1<$E, $F>()');
  return (E event) {
    handler(event);
  };
}

Though I could be wrong.

The problem is that the need to cast E to F should show you that you aren't getting the subtyping relationship you want.

This is a case where super would also work, but we don't have that in Dart:

void Function(E) eventHandler1<E, F super E>(void Function(F) handler) {

because parameters are contravariant, not covariant, and extends E is for covariance.

However, we don't need to use super, in this case you can use an extends with a contravariant placement instead, and so

void Function(E) eventHandler1<E, F extends void Function(E)>(F handler) {

and then you can tell that the cast is no longer needed, which is a clue that this is a more soundly typed bit of code.

Unless I'm mistaken about what the code is intending to do, or something else.

....

Regarding the original request, I think inference in general does a much better job at choosing Null than humans do, so I'm not convinced that it creates problems. I do remember seeing @leonsenft complaining about inferring Null at one point for a case that I thought was fairly sound, but clearly is unintuitive to many, many people.

I'd be supportive of something like "no implicit Null" for cases like this, partly because Null is just such a dang weird type that it feels like it should be opt-in only. I think it would take some experiments to get such a lint/analysis option that works well, but examples like these help us figure out how that might best work.

@leonsenft

The reason for the cast is that we want to support users binding specific event handlers such as

void handleClick(MouseEvent event);

to listener in

void addEventListener(String type, void Function(Event) listener);

for events where MouseEvent is a subtype of Event. While this isn't technically sound, we know on the web that an event listener bound to the click type event will indeed be a MouseEvent (*).

While I like your solution of making the generic bound the function type itself, I believe this won't work, because void Function(MouseEvent) is not a subtype of void Function(Event) (the converse is true).

(*) It's actually possible to dispatch a CustomEvent for any event type, which invalidates this assumption.
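The direction of function subtyping discussed above can be checked with a standalone sketch (Event and MouseEvent here are stand-in classes of my own, not dart:html's):

```dart
class Event {}
class MouseEvent extends Event {}

void main() {
  // Error if uncommented: parameters are contravariant, so
  // void Function(MouseEvent) is NOT a subtype of void Function(Event).
  // void Function(Event) broad = (MouseEvent e) {};

  // The converse direction is fine: a handler accepting any Event can be
  // used where a MouseEvent-specific handler is expected.
  void Function(MouseEvent) narrow = (Event e) {};
  narrow(MouseEvent());
  print('converse assignment OK');
}
```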

Considering this isn't sound, in addition to the issue described above, one would think we should drop this cast and require users to perform type promotion themselves if needed:

void handleClick(Event e) {
  if (e is MouseEvent) {
    ...
  }
}

Unfortunately this is a massive breaking change, since these APIs predate Dart 2.

@MichaelRFairhurst
Contributor

Hmm, well I'm not terribly surprised that I was misunderstanding the goals of the code.

It's hard to fairly evaluate how type inference reacts to something that technically isn't sound. But it's also a reality that code like this often needs to exist, and that users of that code need to get reasonable behavior from it.

The thumbs up here also go to show that we sometimes infer Null in confusing places & ways.

Note that we also infer Null as a return type for functions like so:

var f = () {}; // () -> Null

and we were not able to fix this for Dart 2; any change here would affect whatever solution we choose.

And we weren't able to fix this in Dart 2 either:

int x = cond ? [] : {}; // infers dynamic, which is assignable to int

So we definitely have some room to grow here.

I personally hope we do Dart 3 soonish. Dart 2 is a big change by virtue of being runtime-strong, and it's good that we didn't cram too many breakages in there. However, now that we have that, there's stuff like this that would be awesome to keep refining, even where breaking changes are required.

@eernstg
Member

eernstg commented Aug 2, 2018

In support of catching an inferred Null, noting that we have another case which is similar:

I believe the underlying problem here can be described as "fan in/out too large". So maybe we could say that it's because something hits the large fan. ;-)

We have a simpler variant of the same problem with top types and conditional expressions:

int x = b ? '42' : true; // Upcast to `Object`, then downcast to `int`: OK!

The problem is that too many (irrelevant) types are accepted in this sequence of casts, so even ridiculously wrong types will pass silently. Because of the huge fan out (that is, the huge number of subtypes of Object), the loss of information is radically larger than it is when Object is not part of the cast sequence.

Similarly, Null has huge fan in, in the sense that there are a lot more supertypes of Null than there are of any other type. We may thus incur a radical loss of information when any computed type (e.g., an inferred type) produces a contravariant Null, in order to make some expression type correct in some context:

void Function(E) castFunctionByWrapping<E, F extends E>(void Function(F) f) {
  return (E x) => (f as dynamic)(x);
  // Could use `f(x)`, if we don't mind errors due to an over-eager static check.
}

void f(int i) => i;
void Function(String) g = castFunctionByWrapping(f);

main() {
  g(null); // OK
  g('Not an int'); // Dynamic error.
}

I renamed eventHandler1 to castFunctionByWrapping in order to highlight what it's actually doing. The problem is, again, that inference can only resolve the given situation by inferring Null, and that causes a radical loss of information.

So maybe we can extract a rule which goes something like this:

A computed extreme type (Null or a top type) should be flagged, because it incurs a radical loss of information and hence allows wildly wrong types to pass silently.

That rule would only fire when the extreme type is actually computed (i.e., we don't get it for b ? '42' : (true as dynamic) because the top type dynamic is already one of the operands). The flag might be a lint, or it might be some other diagnostic, or it might be that inference (and similar type computations) will refuse to produce a result. Furthermore, the rule may apply in general or just in specific situations (possible starting point: only for conditional expressions and function literal parameter types), but the main point is that this particular kind of transition (whether it's a cast, a least upper bound, or whatever) is error-prone due to the huge number of edges going to/from a particular node in the subtype graph.

@leonsenft

Thanks for the response @MichaelRFairhurst and @eernstg. I'd definitely be in favor of a lint for inferred Null.

@lrhn
Member

lrhn commented Aug 4, 2018

I think the issue here is not Null as much as inferring a type and then adding a down-cast to the inferred type. Down-casts are fine if that is what you want. If I assign my Widget to FancyWidget, it's because I want to. If something inferred FancyWidget even though it introduces a down-cast, then the inference probably doesn't know what I'm doing, and it should just have given up.

The types Object and Null are cases where that happens more often than usual (because they are related to all otherwise-unrelated types), but I don't think it's their fault; they're just the most common instances of a general problem.

@MichaelRFairhurst
Contributor

MichaelRFairhurst commented Aug 4, 2018 via email

@matanlurey
Contributor Author

@lrhn:

The types Object and Null are cases where that happens more often than usual (because they are related to all otherwise-unrelated types), but I don't think it's their fault; they're just the most common instances of a general problem.

Agreed. If there was a lint, I think that would be fine with me.

@MichaelRFairhurst
Contributor

MichaelRFairhurst commented Aug 5, 2018 via email

@MichaelRFairhurst
Contributor

MichaelRFairhurst commented Aug 5, 2018 via email

@eernstg
Member

eernstg commented Aug 7, 2018

@MichaelRFairhurst wrote:

But it would be nice for:

expect<T>(T a, T b) => a == b || throw ...;

That could actually be expressed using type exactness (that we've discussed a number of times as a device for obtaining invariance, e.g., List<exactly num>):

bool expect<T>(exactly T a, exactly T b) => a == b || throw ...;

where inference would search for a suitable actual type argument for all invocations where no type argument list is provided, and then we would simply have an inference failure in the cases where the actual value arguments for a and b have static types which are proper subtypes of T (and we could have an upcast which could be accepted statically and might fail dynamically, if the static type of the actual arguments isn't exact).

However, I know that this idea has strong opponents, based on the assumption that any kind of support for exact types (not just exact type arguments of generic classes, but types which are exact at top level) will pollute the source code with information about implementation details.

But, keeping the actual topic of this issue in mind, I still think that the huge fan in/out of the extreme types is a primary reason for the confusion, well aided by the combination of a downcast plus an upcast (in some order), due to the massive loss of information. An exact type is an interesting dual to that because it incurs a minimal loss of information. ;-)

@matanlurey
Contributor Author

@eernstg Doesn't bool expect<exactly T> make more sense?

@eernstg
Member

eernstg commented Aug 8, 2018

Doesn't bool expect<exactly T> make more sense?

The intention behind exactness is that we express a type whose set of values is a subset of what it would have been without the exactness, and in return for that inconvenience we get additional guarantees. For instance, we may know that a particular method invocation is safe:

// `nums` cannot be a `List<int>` or a `List<Null>`, only a `List<num>`.
void foo(List<exactly num> nums) => nums..add(42)..add(3.14159);

// `X` could be `num`, so `list` would be a `List<num>`, and `it` 
// could be an `Iterable<int>`.
void bar<X>(List<exactly X> list, Iterable<X> it) => list.addAll(it);

This differs from a type variable relation which may be deceptive because it's static only:

// With `X` = `num`, `Y` = `double`, `list` a `List<int>`, `it` an `Iterable<double>`: 
// This may be statically OK (`list` static type: `List<num>`), but it will throw
// at run time!
void bar2<X, Y extends X>(List<X> list, Iterable<Y> it) => list.addAll(it);

The problem is that the type variables seem to express a useful typing relationship, but they only reflect what the compiler happens to know about the actual arguments at the call site. If we want to have a real guarantee then we must insist at the call site that some types are known more precisely, and that's where exactly will help us out.

I hope this illustrates why we may wish to use a type variable with and without exactness at different locations in its scope. If we were to use declarations like bool expect<exactly T>(...) then, presumably, we would have to insist that T means exactly T everywhere.

For an alternative interpretation of what bool expect<exactly T>(...) could mean, it could be something like "the actual type argument must be exact". But the actual type argument (inferred or written manually) is a stand-alone value, and it doesn't really make sense to say that it must be exact: There is no instance whose dynamic type can be used to determine whether it is actually exact or not.

Type typeOf<exactly X>() => X;

main() {
  print(typeOf<List<num>>()); // Reject this as "not exact"? Why? Why not? ;-)
}

Yet another alternative would be to say that a formal type parameter X may be declared as exactly X, and in that case it is required that there is no proper subtype relationship between the resulting formal parameter types and the statically known types of the actual arguments:

bool equals<exactly X>(X a, X b) => a == b;

// Error: Tried to infer `X` as `Point`, which causes 2nd argument to have
// an upcast from `ColorPoint` to `Point`.
var b = equals(new Point(), new ColorPoint());

It might be possible to fill in the details of such a mechanism, but I suspect that it can be fully emulated by using exactly on all occurrences of the type variable (as in bool equals<X>(exactly X a, exactly X y)), and I'm convinced that exactness in the types can do many other useful things as well.

@MichaelRFairhurst
Contributor

MichaelRFairhurst commented Aug 8, 2018

My thought on

bool equals<exactly X>(X a, X b) => a == b

Is that this isn't really what we're checking, that a and b are the same type. For instance:

print(1 == 1.0); // true

When we use specific type arguments, it works fine:

equals<int>(1, "1"); // error and should be
equals<int>(1, 1.0); // error and should be
equals<num>(1, 1.0); // no error and should not be
equals<num>(1, "1"); // error and should be
equals<Object>(1, "1"); // no error and should not be

But with inference, it does not:

equals(1, "1"); // no error but, for this particular API, there probably should be
equals(1, 1.0); // debatable whether there should be

People add generic types to their program to make their code more safe & precise. But in this case, if you rely on inference then T a, T b may as well be dynamic a, dynamic b unless you happen to do something like print(T) in your code. Which is rare. So this is counter-intuitive.
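A sketch of what inference actually picks in these cases (typeOf is a helper of my own invention that surfaces the inferred type argument):

```dart
// Returns the type argument that inference chose for T.
Type typeOf<T>(T a, T b) => T;

void main() {
  // With no explicit type argument, inference computes the least upper
  // bound of the two argument types.
  print(typeOf(1, '1')); // int and String only share Object
  print(typeOf(1, 1.0)); // int and double share num
}
```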

If we don't infer Object, then equals(1, "1") will be an error, and that may be enough.

If we want 1, 1.0 to be an error until someone passes an explicit num type (I think this is reasonable but would like numbers on how breaking this is), then we want either all inference, or some inference, to assert that constraints match, rather than attempting to unify them. What the best word is for asserting that all constraints match is a tough question. I think "exact" or "exactly" makes sense, but not if it conflicts with List<exactly num> list, because that's a completely different thing.

One thing I like about exactly as you've described it though, Erik, is the ability to specify which parameters should be the lowest constraint, and which should be the highest constraint:

expect<T>(exactly T actual, /* NOT exactly */ T expected);

However in this case I don't see a case for either expected or actual being exactly T.

num x;
int y;
expect(x, y); // Is this an invalid test?
expect(y, x); // Or is this?

In any case

class Direction {}
class Left extends Direction {}
class Right extends Direction {}

expect(Left(), Right()); // should not be inferred to Direction
expect(Right(), Left()); // should not be inferred to Direction
expect(getDirection(), Left()); // should not fail with a cast error
expect(getDirection(), Right()); // should not fail with a cast error

So maybe we should just never infer a lower/upper bound than the lowest/highest constraint?
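The Direction example above can be made concrete; today, inference happily unifies the two sibling types up to their common supertype (inferredT is a made-up helper that surfaces the inferred type argument):

```dart
class Direction {}
class Left extends Direction {}
class Right extends Direction {}

// Surfaces the type argument that inference chose for T.
Type inferredT<T>(T a, T b) => T;

void main() {
  // Left and Right share the superclass Direction, so inference unifies
  // the two constraints to Direction rather than reporting a mismatch.
  print(inferredT(Left(), Right()));
}
```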

@MichaelRFairhurst
Contributor

One thought is that for any lint that detects "truisms" like so:

void expect<T>(T a, T b);

we can differentiate this from

T oneOf<T>(T a, T b);

because the crazy inferred type will be passed along in the program. Assuming we can get rid of implicit downcasts, this would be a fairly safe definition.

we could even go so far as:

var x = oneOf(1, "1"); // no lint
oneOf(1, "1"); // lint

@a-siva a-siva added the area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). label Aug 14, 2018
@eernstg
Member

eernstg commented Aug 20, 2018

@MichaelRFairhurst:

A while back @johnniwinther wrote Related Types, [pdf], where he studies this kind of issue: Constructs that do not have a type error in the usual sense, but where the given (possibly inferred) types are very likely to cover up some underlying mistake, for instance, because a non-trivial expression of type boolean (I'm spelling it like that because the paper uses Java as its main example) is guaranteed to be false and hence the true branch of an if statement is effectively dead code. So Johnni would definitely have some deeper ideas about this whole area.

It's a delicate area, anyway. For instance, you mention that we might want to allow using 1 and 1.0 together. Even more obviously, it would make sense to allow myPoint1 == myPoint2 even in the case where the two points have different dynamic type (say, CartesianPoint vs. PolarPoint) if those types are considered to be implementation details, and the type "that matters" for both points is a plain Point. So I'm not convinced that we want anything like "must have the same type" or (even worse) "must have the same static type".

But I think the last thing you said here puts the focus on something which is a lot more promising:

So maybe we should just never infer a lower/upper bound than the lowest/highest constraint?

Namely, we could take that to mean: Don't zigzag!

The point is that we shouldn't use a least upper bound or similar computation to find a supertype (say, for a conditional expression) and then downcast to a subtype. I suggested that we should catch the extreme types (because this zigzag operation then causes an extreme loss of information), but we could also consider flagging the general case where there is a zigzag at all (that is, upcast-then-downcast or downcast-then-upcast).
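The zigzag can be seen in a standalone sketch (my example, in the spirit of the conditional expression shown earlier in the thread):

```dart
void main() {
  bool b = true;
  // Upcast: the conditional's static type is the least upper bound of
  // String and bool, which is Object.
  Object o = b ? '42' : true;
  // The zigzag happens when that Object is then downcast to some unrelated
  // type (implicitly in pre-2.12 Dart, or explicitly via `as`):
  // int x = o as int; // passes the static checker, throws at run time
  print(o.runtimeType);
}
```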

@jmesserly

(disclaimer: I just skimmed this comment thread to get caught up.)

I think we can address this with the strict analysis flags: #33749

In particular either --strict-inference or --strict-inferred-casts should catch this.

7 participants