Skip to content

NNBD tool suggests casts for map keys #49106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rakudrama opened this issue May 25, 2022 · 15 comments
Closed

NNBD tool suggests casts for map keys #49106

rakudrama opened this issue May 25, 2022 · 15 comments
Labels
area-migration (deprecated) Deprecated: this label is no longer actively used (was: issues with the `dart migrate` tool).

Comments

@rakudrama
Copy link
Member

The NNBD migration tool suggests casts for keys to Map.operator[].
This operation has argument type Object?, so these casts are unnecessary, and in this case actually harmful since the code relies on lookups outside the map key type returning null.

Example from sdk/pkg/compiler/js/rewrite_async.dart

image

An additional problem is that the migration tool will insert as Block without ensuring Block is imported, and does not use the library prefix. Both problems with the inserted cast create broken migrated code.

/cc @stereotype441 @leafpetersen

@rakudrama rakudrama added the area-migration (deprecated) Deprecated: this label is no longer actively used (was: issues with the `dart migrate` tool). label May 25, 2022
@stereotype441
Copy link
Member

Oh interesting! I've tracked this down to a flaw in the analyzer's resolution logic. When resolving an expression of the form a[b], the context for b is obtained from the type of the first argument to a.operator[]= (if that lookup succeeds); the type of the argument to a.operator[] is used as a fallback, only if the lookup of operator[]= fails. Since Map<T, U>.operator[]= has type void Function(T, U), this means that the context type used for the map index is the key type for the map.

That's totally wrong: the type of a.operator[]= should only be be relevant if the a[b] appears as the target of an assignment or a ++ or -- operator. The front end does the right thing here.

I'll add a test case to the migration tool to repro this, and then I'll re-classify as an analyzer bug.

copybara-service bot pushed a commit that referenced this issue May 26, 2022
Bug: #49106
Change-Id: Id207cef6b23c75eddbff24bb3e70935998a265f6
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/246054
Reviewed-by: Konstantin Shcheglov <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
@stereotype441 stereotype441 added legacy-area-analyzer Use area-devexp instead. and removed area-migration (deprecated) Deprecated: this label is no longer actively used (was: issues with the `dart migrate` tool). labels May 26, 2022
@scheglov
Copy link
Contributor

Which type should be used as the context type when there are both read and write?
Always the index type from operator[]=? Or GLB with the index type from operator[]?

CFE can run this:

class A {
  int operator [](int index) => 0;
  operator []=(num index, int value) {}
}

void f(A a) {
  a[ g() ]++;
}

T g<T>() => 0 as dynamic;

main() {
  f( A() );
}

...and infers int as the type argument for g.

But not this:

class A {
  int operator [](num index) => 0;
  operator []=(int index, int value) {}
}

...reports:

bin/test3.dart:7:6: Error: A value of type 'num' can't be assigned to a variable of type 'int'.
  a[ g() ]++;
     ^

@scheglov
Copy link
Contributor

@eernstg

@bwilkerson bwilkerson added area-migration (deprecated) Deprecated: this label is no longer actively used (was: issues with the `dart migrate` tool). and removed legacy-area-analyzer Use area-devexp instead. labels May 29, 2022
@leafpetersen
Copy link
Member

@stereotype441 is going to look at the example from @scheglov , cc @lrhn @munificent @eernstg

@eernstg
Copy link
Member

eernstg commented Jun 2, 2022

In the language specification, a[g()]++ is specified to be treated as (a[g()] += 1) which is again treated as (a1[g1] = a1[g1] + 1) where a1 is a fresh variable bound to the result of evaluating a, and similarly for g1 and g(); so it's essentially treated as let a1 = a, g1 = g() in a1[g1] = a1[g1] + 1.

So this means that there is basically nothing at all to build upon in the language specification for the context type of g(), and we'd need to have a separate rule to specify the context type in this case. Similarly, I don't see any such rule in any other specification document (in particular, in the null safety feature spec and in the inference feature spec).

However, GLB of the relevant formal parameter types of the two members that are invoked ([] and []=) seems to be the obvious choice.

We might want to have a rule to ensure that the two index parameters don't have completely incompatible types, but we don't currently have that, so we would just get a compile-time error for any usage of e1[e2]++ or e1[e2]-- (or pre-increments) in those cases, unless e2 has type dynamic or Never, but we don't have to change anything about that right now.

Cf. dart-lang/language#2271.

@stereotype441
Copy link
Member

Digging through the front end source code, for ordinary indexed reads (e.g. var x = e[i];), compound assignments (e.g. e[i] += v) and if-null assignments (e.g. e[i] ??= v), the context type it uses for the index expression is always the type of the index parameter to operator []. Whereas for simple assignments (e.g. e[i] = v) it uses the type of the index parameter to operator []=.

Whereas the analyzer, IIRC, always tries to get its context type from the type of the index parameter of operator []=, regardless of whether it's an indexed read, compound assignment, if-null assignment, or simple assignment. It only looks at the index parameter of operator [] if its lookup of operator []= fails.

As I see it, we have three options worth considering:

  1. Switch the analyzer to match the front end behavior. On the plus side, this is low effort (since only analyzer needs to be changed) and low risk (it's non-breaking, because any code that would be newly rejected by the change is code that the CFE currently rejects). On the minus side, there's not really a good principled argument for this approach; it's just what the CFE happens to do right now.
  2. Switch both the analyzer and front end so that they use the type from operator [] for ordinary indexed reads, the type from operator []= for simple assignments, and the GLB for compound and if-null assignments (as @scheglov and @eernstg suggest). This has the advantage that it's very principled and easy to justify in the spec. It has the disadvantage that it's potentially breaking (though I suspect breakages will be rare in practice), and since it requires changes both to the CFE and analyzer, it will either require coordination between the teams to roll out the change together, or it will require someone who is familar with both codebases (me perhaps) to change both the CFE and analyzer at the same time.
  3. Do 1 now, and do 2 as a future improvement. This has the advantage that it quickly solves the migration tool issue, and brings the CFE and analyzer behaviors into line with each other. It has the disadvantage that it's a little more churn, since it means changing the analyzer behavior twice.

Note that regardless of the approach we take, it's necessary for soundness that we check for assignability between the static type of the index and the type of the index parameter of whatever operator (or operators) might get called. In both the analyzer and the CFE, the logic to check for assignability is unrelated to the logic for choosing a context type, and as far as I'm aware, there are no bugs with the assignability checking logic. So we should be able to choose among 1, 2, and 3 without worrying about soundness issues.

Personally I favor option 3 because it solves the worst problems (bad migrations and CFE/analyzer mismatch) quickly and with minimal effort, and it lets us postpone the effort to bring the implementation in line with a more principled spec (which, though important, is higher effort and lower reward than just bringing the analyzer into line with the CFE).

@stereotype441
Copy link
Member

There was also some discussion about whether it would be good to have a lint to warn users when their definitions of operator [] and operator []= have different / incompatible index types. Personally I don't see a lot of value in doing so, because (a) it's not really a problem unless someone tries to do a compound or if-null assignment, (b) we already have a concrete example of a case where it makes sense to some people for the index types to differ (Map), and (c) I'm not aware of any evidence that people are accidentally giving operator [] and operator []= mismatched index types, and then regretting it later.

@eernstg
Copy link
Member

eernstg commented Jun 3, 2022

There might be some completely different examples that someone can come up with—but conceptually, I would expect any actual design of a class that has an operator [] and an operator []= to have a notion of a 'key' type and a 'value' type, and it would then modify the state to use the given key for operator []=, and it would just search for the key with operator []. This makes the approach taken in Map a canonical one: The key type must be specified precisely for operator []=, but any supertype will do for operator [].

This means that we can expect the GLB of the two types to be, in practice, the type from operator []=.

This means that approach 1 is a bit inconvenient, because it uses the type from operator [] in some cases where []= is also invoked. For example:

class A {
  int operator [](Object? o) => 0; // Expected parameter type: Some supertype of `String`.
  operator []=(String s, int i) {}
}

void f(X Function<X>() whatever, A a) {
  a[whatever()]++; // CFE error, analyzer OK.
}

This actually serves as an argument against approach 1 and 3, and in favor of keeping the analyzer unchanged (such that the current analyzer behavior will not become a breaking change).

Of course, Map operator [] can return null, so we aren't going to have a[b]++ where a is a Map, but in the case where + will work (and in the case where we introduce a magic ! to the increment/decrement expressions ;-), the issue comes up there, too. Other compound assignments bring up the same issue as well.

So maybe we should aim for 2 after all?

@lrhn
Copy link
Member

lrhn commented Jun 3, 2022

We have a subtype restriction between getter return type and setter argument type (I never remember which, but one must be a subtype of the other).

That rule was driven by always wanting to allow this.x = this.x (so it must be getter subtype of setter), and preferably also this.x op= e, where x op e is assumed to be able to have the same type as x.

Using the same reasoning here, a[b] += 1 should be valid, which means that b the return type of operator[] should be a subtype of the type of the second argument of operator[]=.

(On the other hand, that's actually the opposite of what I'd advocate for, if we had started from scratch. It makes no sense to store a value that you cannot read out again. If you can set a num but only get an int, then the setter is not "setting", it's converting and setting, and it should be method instead. In practice, the two should probably just be the same.)

For the first argument, if we assume the two operators are related, it makes no sense to be able to store at a key that you cannot read at again afterwards, so maybe we should require the key of the []= to be a subtype of the key of the []. That matches what Map does, and likely most other well-behaved programs.
It'd be a breaking change, so just using GLB as context type for now is probably fine. Or simply require that at least one of the two is a subtype of the other.

About the desugaring: If the key argument is typed as dynamic, we'll need to do two downcasts, unless one implies the other. Say:

Map<num, num> m = ...l
m[0] += (... as dynamic);
/// equivalent to
let x = ...  as dynamic in m[x as Object?] = m[x as num] + 1

and if so, those downcasts are not governed by the context type. They cannot be.
With a class hierarchy where A and B has multiple possible maximal lower bounds, we may end up choosing a GLB (or none) which is not the actual type of the value, and where the actual type is another a subtype of both key types. So we should not use the GLB for downcasts.

(And never do type inference after expanding syntactic sugar, always do it directly on the original source, precisely because, as @eernstg showed, syntactic desugaring does not preserve context. Just like CPS transformation is precisely about removing syntactic context. That seems at odds with not using being able to use the context type here, but that's because e1[e2] in e1[e2] += ... actually places e2 in two separate type contexts using only one syntactic contexts.)

@stereotype441
Copy link
Member

With a class hierarchy where A and B has multiple possible maximal lower bounds, we may end up choosing a GLB (or none) which is not the actual type of the value, and where the actual type is another a subtype of both key types. So we should not use the GLB for downcasts.

Good point. If we wind up going with option 2, we should definitely include a language test to check the behavior of this corner case.

@munificent
Copy link
Member

That rule was driven by always wanting to allow this.x = this.x (so it must be getter subtype of setter), and preferably also this.x op= e, where x op e is assumed to be able to have the same type as x.

Using the same reasoning here, a[b] += 1 should be valid, which means that b the return type of operator[] should be a subtype of the type of the second argument of operator[]=.

+1. I think the correspondence between getter/setters and index/index setters should be strong. I really wish the language had a formal notion of "property" or maybe "l-value" that would subsume these. It's always confusing for users when a getter and setter don't stay in sync, and likewise with index operators.

It is true that Map's [] operator takes a looser type than []=... but that's arguably a bug and not a feature. It was done for the same reason that Iterable.contains() accepts Object instead of E and that causes so many user bugs that we have a recommended lint to avoid it. We should probably have a similar lint for Map.[].

@stereotype441
Copy link
Member

This issue has come up again, so I think we should take some action to fix it. After discussing it with @munificent and @leafpetersen last week, I think that in the short term, we should go with my suggestion 1 (switch the analyzer to match the front end behavior). In other words, for ordinary indexed reads (e.g. var x = e[i];), compound assignments (e.g. e[i] += v) and if-null assignments (e.g. e[i] ??= v), obtain the context type for the index expression from the type of the index parameter to operator []. For simple assignments (e.g. e[i] = v), obtain the context type for the index expression from the type of the index parameter to operator []=.

In the long term, we can consider changing the context type for indexed reads to use GLB for compound and if-null assignments.

My rationale is: the short term fix is easy and non-breaking, and it gets the migration tool to do the right thing. The longer term fix is principled and easy to justify in the spec, but it's more effort to implement and it's technically a breaking change, so we should approach it with more caution.

If there are no objections, I'll coordinate with @scheglov later in the week about fixing this in the analyzer, and I'll file a language issue so that we don't forget about the long term plan.

@scheglov
Copy link
Contributor

FWIW, I had a fix for this since May, but never uploaded it, and for now it implements GLB.
https://dart-review.googlesource.com/c/sdk/+/256647

I will update it to the plan that Paul recommends.

@stereotype441
Copy link
Member

We talked about this in the language team meeting and we're going to go with the plan I outlined above. @scheglov, feel free to move ahead with your CL. I'll file a separate issue to discuss the possibility of using GLB for compound and if-null assignments.

copybara-service bot pushed a commit that referenced this issue Sep 1, 2022
Bug: #49106
Change-Id: I522269360c6664ac4a9129cb71bf378c9dfca812
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/256647
Reviewed-by: Paul Berry <[email protected]>
Commit-Queue: Konstantin Shcheglov <[email protected]>
@stereotype441
Copy link
Member

Fixed in d07005b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-migration (deprecated) Deprecated: this label is no longer actively used (was: issues with the `dart migrate` tool).
Projects
None yet
Development

No branches or pull requests

8 participants