Does explicit partial generic instantiation generalize to function-typed expressions? #1604

Closed
leafpetersen opened this issue Apr 28, 2021 · 7 comments

@leafpetersen

leafpetersen commented Apr 28, 2021

Implicit instantiations of generic functions, e.g.:

void main() {
  void localFn<T>(T x) {
    print(T);
  }

  void Function(int) k0 = localFn;
}

are restricted to cases where the thing being partially instantiated is an identifier which names a function or method declaration. It may not be an arbitrary expression (e.g. a local variable).

Question 1: For the explicit variant, do we intend to relax this restriction (and for the implicit variant as well)?
Question 2: If so, will we allow instantiation of this (implicit and/or explicit)?

extension on void Function<T>(T x) {
  void foo() {
    void Function(int) f = this<int>; // Ok?
    void Function(int) g = this; // Ok?
  }
}

cc @lrhn @eernstg @munificent @natebosch @jakemac53 @stereotype441

@lrhn

lrhn commented Apr 28, 2021

Ad Q1: I was not planning to. It's orthogonal to explicit instantiation, and it's probably not cost-free. We've been doing without it until now.

Being able to specialize a function declaration at tear-off means that we can reuse the same specialization function every time. We only need one <T>=>(args)=>f<T>(args) function available, and we can refer to the declaration of the function to find it. (Instance methods are slightly trickier, but they can have a virtual _instantiate$foo<T>() => (args) => this.foo<T>(args) helper method too.)
Because we can see statically which declarations are torn off, we can tree-shake any unnecessary instantiation helper functions.
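
The per-declaration helper idea can be sketched by hand in Dart. This is only an illustration: the names foo and instantiateFoo are assumptions made for this example, and the helpers a compiler actually generates are internal and may look quite different. The point is that a helper tied to the declaration knows the full parameter list, so it can preserve the optional parameter.

```dart
// A generic declaration that will be "torn off" with T = int.
void foo<T>(T x, [T? y]) {}

// Hypothetical per-declaration helper (hand-written for illustration).
// Because it is tied to foo, it can reproduce foo's full parameter list.
void Function(T, [T?]) instantiateFoo<T>() =>
    (T x, [T? y]) => foo<T>(x, y);

void main() {
  // Only the one-argument shape is required by the static type here.
  void Function(int) f = instantiateFoo<int>();
  f(1);

  // The runtime value still accepts the optional second argument.
  print(f is void Function(int, [int?])); // true
}
```

One such helper per declaration is enough for every tear-off site of that declaration, which is what makes tree-shaking unused helpers straightforward.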

It matters that we have per-function helpers, not per-instantiation-point wrappers.
Instantiation specializes the actual function based on the context type requirements, it doesn't just wrap to match the local type requirement. Example:

void foo<T>(T x, [T? y]) {}  // optional parameter must be nullable under null safety
void Function(int) f = foo;
print(f.runtimeType); // void Function(int, [int?])  -- *NOT* void Function(int)
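
A runnable variant of the same observation, as a small sketch. It uses is checks rather than printing runtimeType, because the printed form of a function type can differ between backends. The declaration foo mirrors the one above, written with a nullable optional parameter so that it compiles under null safety.

```dart
void foo<T>(T x, [T? y]) {}

void main() {
  // Implicit instantiation of the declaration tear-off with T = int.
  void Function(int) f = foo;

  // The instantiated function keeps foo's optional parameter.
  print(f is void Function(int, [int?])); // true

  // So it can be downcast and called with both arguments.
  final g = f as void Function(int, [int?]);
  g(1, 2);
}
```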

Only specializing declaration tear-offs means we can't specialize a function expression at all, it's only for named declarations.

If we choose to specialize a function value, then that changes.

The following looks like something which should then work:

S Function(S) specialize<S>(T Function<T>(T) f) => f;  // general instantiation of generic function typed expression.
X id<X>(X value) => value;
int Function(int) intId = specialize(id);
X defaulter<X>(X? value, [X? defaultValue]) => value ?? defaultValue ?? (throw ArgumentError("no default"));
intId = specialize(defaulter);
print(intId.runtimeType); // int Function(int?, [int?])

The specialize invocation infers <int> as its type argument. The => f; instantiation (aka => f<S>;, but that should be possible to infer) will then specialize the f function to that one type argument.
Again, that is not just (S x) => f(x). If f implements a subtype of the T Function<T>(T) type, specialization should preserve the original parameter list and only specialize the type parameter, and the result should again be a subtype of the required return type.

The question is how to do that effectively when all you have is a function value and a static type.

We could let every function value carry around its own "specializer" information. It would be harder to tree-shake if we instantiate any compatibly typed function value anywhere in the program.

Probably doable, but with an overhead.

Ad Q2: If yes to Q1, then yes here too. It should work for any arbitrary function-typed expression, which this is. No reason to restrict it to variables holding functions, it's going to be looking at the value anyway.

Generally, an extension method this should behave just like any other variable in every way. There is absolutely no reason for it not to. (And it should be able to be promoted!). It feels bloody stupid if void Function(int) f = this; is invalid but var self = this; void Function(int) f = self; is fine (and similar arguments for every other use of this).

@stereotype441

Someone (@lrhn perhaps?) mentioned in the language meeting today that in terms of expressibility, the user doesn't actually lose anything if we restrict <typeArguments> to only be applicable to tear-offs, because a user who wants to apply type arguments to an arbitrary function-typed expression can always tear off .call. Which got me to thinking, couldn't I use that trick to make the specialize example above work today?

I tried it:

main() {
  S Function(S) specialize<S>(T Function<T>(T) f) => f.call;
  X id<X>(X value) => value;
  int Function(int) intId = specialize(id);
  X defaulter<X>(X? value, [X? defaultValue]) => value ?? defaultValue ?? (throw ArgumentError("no default"));
  intId = specialize(defaulter);
  print(intId.runtimeType);
}

and I'm sorry to say that it exposed a behavioral difference between CFE and analyzer. The analyzer says this code has no errors, but the CFE says:

../../tmp/proj/test.dart:3:57: Error: A value of type 'T Function<T>(T)' can't be returned from a function with return type 'int Function(int)'.
  int Function(int) specialize(T Function<T>(T) f) => f.call;

(I'll file an issue in the SDK repo about this behavioral difference, and once we decide whether the analyzer or the CFE is "correct" we can assign it to the appropriate team 😃)

After experimenting a bit, I think that the CFE only allows generic instantiation of .call if .call resolves to a method on an actual class; if it resolves to the implicit .call of a function type, that's not supported.

Which is not terribly surprising after our discussion today about this stuff being actually implemented through a hidden _instantiate$foo<T>() => (args) => this.foo<T>(args) method. It makes sense that this wouldn't work for arbitrary function types, because they aren't backed by a class, so there's nowhere to put the _instantiate$call method.

Which makes me think that:
(1) Lasse is probably right that there would be a nontrivial cost to allowing generic instantiation of arbitrary function-typed expressions.
(2) We can't really argue that we don't lose functionality by making this restriction because users can always instantiate a tear off of .call; that's broken today, and fixing it would incur the same implementation costs because this language feature is essentially all sugar.

I'm not sure where we go from here, though.

@stereotype441

(Further musings)

I keep thinking back to @leafpetersen's mental model of generic instantiation as syntactic sugar for an eta expansion (i.e. if f has static type T Function<T, U>(U) then f<A, B> is equivalent to (B b) => f<A, B>(b)). It was my mental model too until this morning's meeting, and I'm wondering, what if we changed the language behavior so that this mental model is correct?

Obviously it would be a breaking change in theory, but how breaking would it be in practice? It sounds from our discussion today (and from the bug I discovered above) like the only user-visible effect of the change would be that when tearing off a method via a base class and applying generic arguments, the resulting method has a runtime type that matches its static type at the site of the tear off, so the user couldn't later downcast it to a more specific type and make use of e.g. extra optional arguments. My suspicion is that the number of people who do that in practice is approximately zero.

And I think the implementation cost could be kept pretty low, because two instantiations that act on the same generic function type could share a common implementation, e.g. we would compile this:

foo(T Function<T>(T) a, T Function<T>(T) b) {
  int Function(int) aInt = a;
  String Function(String) bString = b;
}

into this:

U Function(U) instantiate$T_Function_T<U>(T Function<T>(T) f) => (u) => f<U>(u);
foo(T Function<T>(T) a, T Function<T>(T) b) {
  int Function(int) aInt = instantiate$T_Function_T<int>(a);
  String Function(String) bString = instantiate$T_Function_T<String>(b);
}
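
The shared helper can be written by hand today, which makes the user-visible effect of the eta-expansion model easy to observe. The following is a sketch under assumptions: the helper name is taken from the pseudo-compilation above, defaulter is the function from the earlier specialize example, and nothing here reflects what any compiler actually emits.

```dart
// Hand-written stand-in for the shared helper: one per generic function
// *type*, usable with any value of that type.
U Function(U) instantiate$T_Function_T<U>(T Function<T>(T) f) =>
    (U u) => f<U>(u);

// Has an extra optional parameter, so its type is a proper subtype of
// T Function<T>(T).
X defaulter<X>(X? value, [X? defaultValue]) =>
    value ?? defaultValue ?? (throw ArgumentError('no default'));

void main() {
  final wrapped = instantiate$T_Function_T<int>(defaulter);
  print(wrapped(1)); // 1

  // The wrapper's runtime type matches the static type at the wrapping
  // site, so the optional parameter of defaulter is no longer reachable.
  print(wrapped is int Function(int?, [int?])); // false
}
```

This is the behavioral difference discussed above: a later downcast to the more specific type fails for the eta-expanded value.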

@eernstg

eernstg commented Apr 29, 2021

When considering whether to generalize generic function and method instantiation to 'generic function object instantiation' (such that it's applicable to every function object, not just to a term that statically resolves to a specific function or method), we seem to have three obvious choices available:

  1. Keep the semantics as currently specified and implemented (so we preserve the parameter list shape even in the case where it is not statically known, and we do not support generic instantiation of arbitrary function objects). Specify explicitly that .call on a function object cannot be subject to generic method instantiation (so the CFE keeps its error and the analyzer starts reporting the same error). This is the cheapest option (implementation, and run-time performance).

  2. Drop the preservation of the parameter list shape and make it a simple static-type-based eta expansion. This is breaking, but may not break anything in practice. The point that's not obvious to me is how difficult it would be to ensure that the resulting tearoff has the desired equality semantics (same receiver, same type arguments => equal). Any function object could be a tearoff of m from receiver o, so we need to tailor the equality of the function object for every generic instantiation of a function object. Wrt identical, we should be able to ignore the fact that some function objects are canonicalized, because f<int> where f evaluates to a function object won't ever be canonicalized anyway.

  3. Preserve the parameter list shape, and support generic instantiation of arbitrary function objects as well. This is similar to option 1, but we'd need to consider expressions of type Function and dynamic as well, and their equality semantics.

@lrhn

lrhn commented Apr 30, 2021

I have been checking how we canonicalize/make equal tear-offs of the same function. In short: Not very consistently.

See the code on Dartpad.
Result summary:

Tear-off    | Runtime type                 | dart2js compare | VM compare
------------|------------------------------|-----------------|-----------
top         | <T1>(T1) => T1               | identical       | identical
C.stat      | <T1>(T1) => T1               | identical       | identical
c.inst      | <T1>(T1) => T1               | identical       | equal
d.inst      | <T1>(T1?, [T1?]) => T1       | identical       | equal
local       | <T1>(T1) => T1               | identical       | identical
top<int>    | (int) => int                 | identical       | identical
C.stat<int> | (int) => int                 | identical       | identical
c.inst<int> | (int) => int                 | not equal       | equal
d.inst<int> | (int?, [int?]) => int        | not equal       | equal
local<int>  | (int) => int                 | not equal       | not equal
top<num>    | (num) => num                 | not equal       | not equal
C.stat<num> | (num) => num                 | not equal       | not equal
c.inst<num> | (num) => num                 | not equal       | equal
d.inst<num> | (num?, [num?]) => num        | not equal       | equal
local<num>  | (num) => num                 | not equal       | not equal

(Dart2js is run on DartPad, VM locally on release 2.12.4).

Dart2js either canonicalizes or it has no equality at all. The VM recognizes equality of instance method instantiated tear-offs, but not any other kind.
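
A small sketch of the kind of comparison behind the table, and of why equality is a concern for an eta-expansion approach. The names top, C, inst and eta are assumptions for this example and are not the code linked above. The first two results follow from the language's tear-off rules; the third is not annotated with an expected value because it depends on how closures are allocated, although a fresh closure per call would normally compare unequal.

```dart
T top<T>(T x) => x;

class C {
  T inst<T>(T x) => x;
}

// Hand-written eta expansion over a generic function value (illustrative).
U Function(U) eta<U>(T Function<T>(T) f) => (U u) => f<U>(u);

void main() {
  final c = C();

  // Uninstantiated tear-off of a top-level function: canonicalized.
  print(identical(top, top)); // true

  // Tear-offs of the same instance method from the same receiver: equal.
  print(c.inst == c.inst); // true

  // Two eta expansions of the same method tear-off. Each call to eta
  // builds a new closure, so receiver-based equality is not inherited.
  print(eta<int>(c.inst) == eta<int>(c.inst));
}
```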

@leafpetersen

We decided not to generalize this, and also to add spec language making it explicit that you cannot implicitly instantiate the .call method of an arbitrary function.

@eernstg

eernstg commented May 11, 2021

Spec update in 4c0656a.

I'll close this issue because the question in the title has been addressed.

@lrhn, you contributed a substantial amount of material about the identity and equality properties of tearoffs. Perhaps that topic should be addressed in a new issue?
