Skip to content

How should name collision in tuples be handled. #1291

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lrhn opened this issue Nov 6, 2020 · 13 comments
Closed

How should name collision in tuples be handled. #1291

lrhn opened this issue Nov 6, 2020 · 13 comments
Labels
patterns Issues related to pattern matching. question Further information is requested

Comments

@lrhn
Copy link
Member

lrhn commented Nov 6, 2020

The current proposal for tuples introduces a field1 getter for a positional entry and a name getter for a named entry named name.
It's made a compile-time error to have a named entry with the same name as a member of Object (toString, runtimeType, noSuchMethod, hashCode).

It's not defined what happens if a named entry has the name field1 and there is also a positional entry which would get that name, It should probably also be an error.
However, all these restrictions may interfere with uses of tuples. A function can have a named parameter named toString, and if you can't create a tuple with a named entry with that name, then you can't spread a tuple as all the arguments of a function call of that function. You can't properly abstract over function arguments if the tuples are restricted in names in ways that function parameters are not.

The fieldX names themselves may be open to discussion. The language specification doesn't use the name "field" (it's "instance varibles"), so hard-coding the term into the language seems a little odd. Any name can cause a problem

Should we do something else than naming the positional entries? Should we allow named entries with any name, and provide access to them?

@lrhn lrhn added question Further information is requested patterns Issues related to pattern matching. labels Nov 6, 2020
@lrhn
Copy link
Member Author

lrhn commented Nov 6, 2020

My proposal:
Allow the [] operator to be used on expressions of a tuple type. The argument must be a literal (possibly just a constant, but I'm not sure even that's worth it). It's not a method invocation, it's special language syntax for projecting from a tuple.
The operand must be either an integer or a symbol. You can also access named entries by .name as long as they're not conflicting with members of Object.
So:

(int, int, {int x, int hashCode}) p = (0, 1, x: 2, hashCode: 3);
print(p[0]); // 0
print(p[1]); // 1
print(p.x); // 2
print(p[#x]); // 2
print(p[#hashCode]); // 3

Again, there is no [] member on the "interface" of tuples. It's a language level operation you can perform on expressions of tuple type, and because the operand is a literal, and the tuple type is known (since the expression has a tuple type), the expression can be given the correct static type.

You can use .name for named entries, and you can fall back on [#name] in case of conflicts.

(This means that tuple types do occupy the [] operator, so extensions cannot add operator[] to a tuple type, just as you can't add call to Function or anything to dynamic or Never).

This gives a readable projection syntax. It requires named members to be added for positional entries, which avoids one source of name conflict, and it provides a canonical way to access any entry, as well as a convenient way for unconflicted named entries.

The DestructureX interfaces would have no reason to exist.

@eernstg
Copy link
Member

eernstg commented Nov 6, 2020

Looks good!

However, it won't take long before we get this request:

(int, int, {int x, int hashCode}) p = (0, 1, x: 2, hashCode: 3);
for (var index in [0, 1, #x, #hashCode]) print(p[index]);

We could handle this by a compile-time loop unrolling, but I guess the answer will be "Just say no!". ;-)

@munificent
Copy link
Member

It's not defined what happens if a named entry has the name field1 and there is also a positional entry which would get that name, It should probably also be an error.

Yeah, I think I made that an explicit error in some draft of the proposal but accidentally forgot it. I think this should be an error.

A function can have a named parameter named toString, and if you can't create a tuple with a named entry with that name, then you can't spread a tuple as all the arguments of a function call of that function.

This is a fair point, but I wonder if the complexity needed to support fully generalizing over all possible parameter lists carries its weight.

It's not a method invocation, it's special language syntax for projecting from a tuple. The operand must be either an integer or a symbol.

I think using the [] syntax for this would cause an unending series of bug reports from very confused users who don't understand why the operator works fine with computed indexes/keys on lists and maps, but not records.

I get the principle of not wanting to have certain member names treated as specially by the language, but I think adding a whole separate kind of "record field access" syntax and semantics to the language feels like a lot of mechanism for not a lot of value in return. We don't give you any way to define a method named call() in a class without that implicitly letting the class be used a function. Likewise, if I remember right, before Dart 2.0 you could run a for-in loop on anything that had a .iterator member, regardless of whether it declared it implemented Iterable. Neither of these seemed particularly problematic to me.

@lrhn
Copy link
Member Author

lrhn commented Nov 6, 2020

(For the record, I have always wanted you to use operator () or operator call instead of just call to make an object callable.)

@munificent
Copy link
Member

munificent commented Nov 9, 2020

@munificent: could you please explain your preference for the name fieldN over the more traditional $N?

Traditional according to what tradition? Using $ in an identifier is always dubious in Dart because it makes string interpolation really confusing. In practice, the only places I see $ used is in generated code, specifically to send a signal that "this is weird, don't use it".

Kotlin uses componentN, which is kind of long but I like that it's human-friendly.

I would have considered _n, but unfortunately the leading underscore is already taken in Dart to mean privacy.

(For the record, I have always wanted you to use operator () or operator call instead of just call to make an object callable.)

Me too. But not doing that hasn't seemed to have caused us much pain.

@munificent
Copy link
Member

at least it's used in sh. Kind of :-)

In my book, that's a signal to not do something. :)

I like .n notation. It's short and not confusing

I kind of like that and have idly considered it. It has at least one edge case ambiguity:

extension on (String, String) {
  method() {
    print(.0); // Oops! double literal not a field access on implicit this.
  }
}

But, overall, I think it's kind of strange looking and probably not worth adding new syntax for.

If this is not acceptable, then maybe C#'s variant .itemN is not that bad?

That's OK too. Another option is elementN.

I don't see anything wrong with [n] notation either. If n is a variable, there will be a runtime check. How is this case different from a list subscript? Why more "compile-time safety" is required here?

var record = ("str", 3);
var i = isTuesday ? 0 : 1;
var field = record[i];

What is the static type of field?

@leafpetersen
Copy link
Member

```dart
var record = ("str", 3);
var i = isTuesday ? 0 : 1;
var field = record[i];

What is the static type of field?

I'm actually somewhat interested in this syntax, and I would say that a[i] is a static error if a has a tuple type, and i is not an integer literal in the appropriate range. You to then need to decide whether (record as dynamic)[i] is a runtime error, or whether it evaluates (slowly) to the ith field. I think either approach is valid.

@eernstg
Copy link
Member

eernstg commented Nov 12, 2020

The question about the least upper bound of two tuple/record types is basically determined by the subtype structure (cf. #1278): If we have DestructureN types then there's subtyping among different shapes (width subtyping), but it's allowed to drop all named fields and some positional fields, and nobody has proposed that we should consider width subtyping for subsets of the named fields. So LUB((int, {int foo}), (int, {int foo, int bar}) == Destructure1<int>, and only the positional field can be looked up. For the (int, int) and (double, double) case we'd use covariance and get Destructure2<num, num>. I think the main question here is the underlying subtype structure, and then the least upper bounds and the questions about accessible members will follow.

For the treatment of (myRecord as dynamic)[i] or (myRecord as dynamic)[mySymbol], it could be claimed that it is consistent with much of Dart to perform the lookup at run-time (possibly slowly), but we could also note that this amounts to a dynamic member lookup operation, and that is not something which is available without reflection.

Hence, it seems consistent to require for e1[e2] where the type of e1 is a subtype of Record that e2 is a constant expression of type int or Symbol, and the type of e1[e2] is then the type of that component — and e1[e2] will throw at run time if e1 is a record with static type dynamic.

@lrhn
Copy link
Member Author

lrhn commented Nov 12, 2020

For LUB, I'd go with LUB(t1, t2) where t1 and t2 are tuple types as:

  • If t1 and t2 have the same structure (same number of positionals, same names) then do pointwise LUB on each position
  • otherwise the result is Record.

I don't want the DestructureX interfaces, and if they interfere with LUB, it becomes even more confusing.
If you do:

var p1 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, foo: 42);
var p2 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, bar: 42);
var p = test ? p1 : p2;  // infer Destructure10<int, int, ..., int>.
print (p.field0); // prints "1";

then it works, but if you then add an element to p1:

var p1 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, foo: 42);

then there is no Destructure11 interface and it has to either just go to Record or stay at Destructure10.
The latter is only possible if we make every tuple with >10 positional elements implement Destructure10 (which probably means making Destructure10 <: Destructure9 <: etc. Which I'm not sure I want either, even though it's obviously possible).
If we change the access to p[0] instead, then the DestructureX interfaces won't help for member access, you need a tuple type.

@lrhn
Copy link
Member Author

lrhn commented Nov 12, 2020

@tatumizer It's one way of thinking, but it requires (int, int, {int foo, int bar}) to be a subtype of (int, {int foo}). Subtyping means assignability. (Or at least it requires assignability since we're assigning (0, 1, foo: 5, bar: 5) to a which will have that type, we might avoid the subtyping by doing a coercion at the assignment).

Because of assignability, we should be able to do the un-conditional assignment (int, {int x}) p = (1, 1, bar: 1, foo: 1);. Throwing away values like that is very likely to hide errors. I want to not allow that, for the same reasons I don't want to allow over-applying functions. Example:

void foo(int x, {int y = 0}) => ...; 
foo(1, 1, foo: 0, bar: 0);

That looks like, and probably is, an error. We do not allow it. Tuple assignment is very closely related to parameter passing, so having significantly different behavior is a design smell.

For that reason, the current specification does not have any subtype relations between tuple types based on common sub-structure.

@munificent
Copy link
Member

I'm actually somewhat interested in this syntax, and I would say that a[i] is a static error if a has a tuple type, and i is not an integer literal in the appropriate range.

My earlier comment on this was: I think using the [] syntax for this would cause an unending series of bug reports from very confused users who don't understand why the operator works fine with computed indexes/keys on lists and maps, but not records.

For LUB, I'd go with LUB(t1, t2) where t1 and t2 are tuple types as:

  • If t1 and t2 have the same structure (same number of positionals, same names) then do pointwise LUB on each position
  • otherwise the result is Record.

That's what the proposal states too.

Here the type of a is computed as the type of "biggest common substructure". I think this is a logically consistent way of thinking.

You're correct. It's logically consistent and sound. In other languages, this is called "width subtyping" or "row polymorphism". However, I think it's looser than we want and is more likely to mask bugs than do anything useful. The goal of a type system isn't necessarily to statically allow all operations that would not fail at runtime. It's to help users write the code they intend to write and help them not write the code they don't intend to write.

@munificent
Copy link
Member

I went ahead and removed the synthesized getter names for positional fields from the proposal. This means that currently the only way to access positional fields is using pattern matching and destructuring. I'll re-open this issue to talk about potentially adding a new syntax for accessing those fields.

@munificent munificent reopened this Jun 27, 2022
@munificent
Copy link
Member

...and there is a newer issue to discuss this that I think has more up-to-date conversation, so I'll close this one in favor of that: #2388.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patterns Issues related to pattern matching. question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants