Skip to content

Unified collections #200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Mar 2, 2019
Merged

Unified collections #200

merged 15 commits into from
Mar 2, 2019

Conversation

munificent
Copy link
Member

OK, here's my attempt to pull together a unified proposal for set literals, spread, and control flow. The basic process was:

  • Take type inference and disambiguation from the doc Leaf and Lasse put together. I renamed a few things, mostly around "part" -> "element", since "part" already means something in Dart.

  • Take some grammar changes from that doc but mostly pull them from the spread and control flow proposals since those were more comprehensive. (The doc didn't have C-style for, etc.) Simplified it where I could.

  • Take the static error checking from the spread and control flow docs, and maybe the set literal one? I don't recall now.

  • Take const stuff from the doc and the proposals. Right now, it does not separately specify how to produce a const value. I think the runtime semantics are general enough to work for that, with some modifications spelled out in the proposal.

  • Take runtime semantics from my two proposals since I think those were the most complete.

Let me know what you think!

- Has all of the type inference and disambiguation.

- Some of the rules around const.

- Does not have the static error rules.

- The runtime semantics are there, but won't work for const values.

- A bunch of other TODOs.
Copy link
Member

@lrhn lrhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Lots of nitpicky details, but overall I think it's very good.

@lrhn
Copy link
Member

lrhn commented Feb 1, 2019

Thinking about the runtime semantics more, I think the way to go is to say that you execute an element (rather than evaluate it), and that execution ends up either throwing an error and a stack trace, or completing normally.

That termination behavior is a subtype of exectuting a statement (statements can also break, continue or return), so then we can use one specification of the behavior of an await for, for, or if construct that works exactly the same on an element and a statement body (it "executes" the body and handles normal completion and non-normal completion generally).

Executing an element has the side effect of adding elements to a result list, but does not evaluate to an actual value, so it's not too weird to think of them as similar to statements.

In other words, we just write:

  • If e is an async forElement, await for (varOrDecl in exp) elem, then execution of e is specified by (some reference to a section giving the general behavior, the one we already have).

and then we don't need to specify the loops or condition again. DRY, etc.

That general section then just needs to say "statement or element" about its body, instead of "statement", but otherwise it doesn't need to change. That avoids the need to duplicate a very complicated piece of semantics (my comments were from memory, there are probably more details).

We can do the same for if statements/elements. Execution of an if (b) e1 else e2 proceeds by evaluating b to a value, performing boolean conversion, then executing one of e1 or e2. We have the text already, we again just need to say statement or element about the branches.

(We might also be able to define ...e as equivalent to for (var x in e) x, but we'd need a temporary variable for ...?, so it's not a clean rewrite).

Copy link
Member

@eernstg eernstg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! I've added some comments.

@lrhn
Copy link
Member

lrhn commented Feb 8, 2019

My thoughs on spreads are evolving.
We might want to allow spreads to be implemented using addAll on the collection we are creating (it might even be the definition, then we can always optimize the three addAll methods).

If we do that, then there is a constraint that the spreadee in a List<T> must be a subtype of Iterable<T>. We can't just allow any supertype and cast each value because addAll doesn't accept the supertype.
So, the rule

A spread element in a list or set has a static type that implements
Iterable<T> for some T and T is not assignable to the element type of
the list.

should change "assignable to" to "subtype of", and the same for other similar cases.
Without that change, we cannot implement [...x] as []..addAll(x), which is a relevant implementation choice.

If we remove implicit downcasts in the future, this will happen anyway, we cannot accept a List<Object> being spread in a Set<int>, even if we know it contains only integers, because it would be an implicit downcast for each element. We might as well disallow it now.

I'm not sure whether we should disallow spreading dynamic as well, unless the collection element type is a top type. (I have the nagging feeling that we might want to keep allowing implicit downcast from dynamic in the future, even if we remove implicit downcasts in general).

WDYT?

@munificent
Copy link
Member Author

That termination behavior is a subtype of exectuting a statement (statements can also break, continue or return), so then we can use one specification of the behavior of an await for, for, or if construct that works exactly the same on an element and a statement body (it "executes" the body and handles normal completion and non-normal completion generally).

I like this idea, but I feel like it's out of scope for this document. When it goes into the real language spec, I would love to dedupe the specification for control flow statements and elements. But for this, I think implementers just need to know what the new behavior is.

@munificent
Copy link
Member Author

We might want to allow spreads to be implemented using addAll on the collection we are creating (it might even be the definition, then we can always optimize the three addAll methods).

I think that's a reasonable implementation choice, provided the implementation of addAll() uses the protocol that the spread spec says the semantics desugar to. I don't think it makes sense for the spec to mention addAll() because that's not really defined — it doesn't state what methods it calls on the spreadee, nor what it does with the values it gets out of it.

We can't just allow any supertype and cast each value because addAll doesn't accept the supertype.

Ah, that's an interesting point.

If we remove implicit downcasts in the future, this will happen anyway, we cannot accept a List<Object> being spread in a Set<int>, even if we know it contains only integers, because it would be an implicit downcast for each element. We might as well disallow it now.

So the idea then is that if you do want to downcast each element, you do it like:

var nums = <num>[1, 2, 3];
var ints = <int>[...nums]; // No.
var ints = <int>[...nums.whereType<int>()]; // Yes.
var ints = <int>[for (var n in nums) n as int]; // Yes.

I think this is probably where we'll go if we do kill implicit downcasts. But we aren't going to do that yet, so I'm currently leaning towards leaving spread as it is with the understanding that we'll revise it when we get rid of implicit downcasts. Does that sound reasonable?

I'm not sure whether we should disallow spreading dynamic as well, unless the collection element type is a top type. (I have the nagging feeling that we might want to keep allowing implicit downcast from dynamic in the future, even if we remove implicit downcasts in general).

I don't have strong feelings on this but in Dart today, it feels weird to disallow spreading dynamic, and this proposal should ship pretty soon, so I think the current behavior makes sense with the surrounding language as it is.

This responds to all of the small-scale feedback on the pull request.
There are still a few things here and there I need to take into account.

The bigger question about what we want spread to desugar too is still an
open question that I want to explore more, but I wanted to get this out
first.
These are based on what we discussed at this morning's language
meeting.
# Conflicts:
#	accepted/future-releases/set-literals/feature-specification.md
* Reword disambiguation to avoid the confusing mouthful "syntactically
  known to be", especially because sometimes it relies on context types,
  which aren't syntax.

* Moved some compile-time error checking to a more logical place.

* Reworded to make it clear that inference happens even in unambiguous
  collections.
@munificent
Copy link
Member Author

OK, I think I've responded to everything. I left the Boolean conversion stuff in there because I think it's still useful to handle the "null throws" case. I made the performance changes we discussed this morning.

Is there anything else? I'd like to land this soon so the implementers can use it.

@lrhn
Copy link
Member

lrhn commented Feb 15, 2019

Had a thought, on the "everything happens during type inference".
It's probably not necessary, and would give the same result in the end, but I'll just record it here for posterity:

We could phrase the entire "figure out whether it's a set or a map" as being type inference.
The run-time behavior would then just depend on the inferred static type of the expression, as assigned during type inference. The static type will always be either Set<X> or Map<Y, Z> for some X, Y and Z.

Type inference of a setOrMapLiteral with context type C is performed as follows:

  • If the literal has one type argument, T, then perform type inference on the elements
    with context type Set<T>, and the static type of the literal is Set<T>.
  • If the literal has two type arguments, K and V, then perform type inference on the
    elements with context type Map<K, V>, and the static type of the literal is Map<K, V>.
  • Otherwise, let S is be the mininmal closure of the base type (etc.) of C.
    • If S is a subtype of Iterable<Object> and not a subtype of Map<Object, Object>,
      then perform type inference on the elements with context type Set<T>,
      where T is derived from S (... the usual way). The static type of the literal is Set<T>.
  • If S is a subtype > of Map<Object, Object> and not a subtype of Iterable<Object>,
    then perform type inference on the elements with context type Map<K, V>,
    where K and V are derived from S (... as usual).
    The static type of the literal is Map<K, V>.
  • Otherwise,
    • If the leaf elements of the literal contains at least one element expression and no map entries,
      then perform type inference of the elements with context type Set<?>.
      If all elements have a set element type, then let T be the least upper bound of those
      set elements, and the static type of the literal is Set<T>.
      Otherwise it is a compile-time error.
    • If the leaf elements of the literal contains a map entry and no element expression,
      then perform type inference of the elements with context type Map<?, ?>.
      If all elements have a map key type and a map value type, then
      let K be the least upper bound of those map key types,
      and let V be the least upper bound of the map value types,
      and the static type of the literal is Map<K, V>.
      Otherwise it is a compile-time error.
    • If the literal contains no elements, then it's static type is Map<dynamic, dynamic>.
    • Otherwise, the literal contains no element expressions or map entries, but may contain
      spreads, possibly nested inside ifElements and forElements.
      In that case, perform type inference of the elements with context type ?.
      • If all elements have a set element type, and not all elements have a
        map key and value type, let T be the least upper bound of the set element types of
        the elements. The static type of the literal is Set<T>.
      • If all elements have map key and value types, and not all elements have a set element
        type, then let K be the least upper bound of those map key types,
        and let V be the least upper bound of the map value types.
        The static type of the literal is Map<K, V>.
      • Otherwise it is a compile-time error. This can happen if all the spreads of the
        literal can be both iterables and maps (including dynamic) or null-aware spreads
        of values of type Null, or if one spread can only be an iterable and another can
        only be a map.

Type inference of an element with a context type C may assign a set element type
to the element, and it may assign a map key type and a map value type to the element.
Some elements may be assigned both, representing the case where it's not statically
decidable from that element alone whether it must be spread in a map or a set.
Some elements may be assigned neither, representing an unsatisfiable constraint
which will necessarily cause a compile-time error later.

... for each case ...
... if context type is Set<T> ...
... if context type is Map<K, V> ...
... if context type is ?

Dynamic semantics then create a collection matching the static type of the literal, and introduces elements or entries into it (or fails dynamically).

Do you think this might be easier to understand, because it's just one group instead of splitting it into two?

@eernstg Would it be useful as the spec approach? (I think of type inference as assigning a context type and a static type to each expression, and some type arguments to expressions that have left those out).

@munificent
Copy link
Member Author

OK, I removed the references to "Boolean conversion" and specified the behavior explicitly. (I probably could have done so in a more terse way, but I think what I have works.)

Let me know if there's any other changes you'd like to see.

dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Feb 20, 2019
This is the first of several CLs updating the parser and its listeners
to conform to the unified collection spec:
dart-lang/language#200

Change-Id: I7750bc4b029f3963b2df77dab3630775ce921bba
Reviewed-on: https://dart-review.googlesource.com/c/93740
Reviewed-by: Brian Wilkerson <[email protected]>
Commit-Queue: Dan Rubel <[email protected]>
Copy link
Member

@eernstg eernstg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Bob,
sorry about the delay, I had an interview and other stuff in the morning.
I suggested a handful of changes, most of them near-typo fixes.
Apart from that, it's about making sure that leaf elements includes only the elements that we want, making sure that we apply the inference step when needed (in particular: on literals with 'unknown static type'), and then there are a couple of locations where we need to ascertain that there are no improper circularities in the definition. With those things in mind, I think it would be useful to get buy-in from Lasse on Monday so I'll just keep this as a 'comment' review.

Copy link
Member

@eernstg eernstg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Feb 22, 2019
This is the second of several CLs updating the parser and its listeners
to conform to the unified collection spec:
dart-lang/language#200

Change-Id: I5c277d05a3726a3f5a40823cf878f025167340e6
Reviewed-on: https://dart-review.googlesource.com/c/93741
Reviewed-by: Brian Wilkerson <[email protected]>
Commit-Queue: Dan Rubel <[email protected]>
dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Feb 22, 2019
... and no longer generate handleLiteralSet or handleLiteralMap.
In addition, a new hasSetEntry parameter has been added to the
handleLiteralSetOrMap event generated by the parser to support
existing behavior. Once all listeners have implemented
unified collections and that feature is enabled by default,
the hasSetEntry parameter can be removed.

This is the third of several CLs updating the parser and its listeners
to conform to the unified collection spec:
dart-lang/language#200

Change-Id: Ia305eab1f720658f357ac4102b0b0c8128d16997
Reviewed-on: https://dart-review.googlesource.com/c/93963
Reviewed-by: Brian Wilkerson <[email protected]>
@eernstg
Copy link
Member

eernstg commented Feb 26, 2019

One more thing that we might want to include: As far as I can see, there is a missing dynamic error for a number of cases.

We have this:

  1. Else, if element is an asynchronous await for-in element:
    ..
    It is a dynamic error if stream is not an instance of a class that implements Stream.

but there is no such sentence for the sequence of a regular for-in, or the spread of a spreadElement, and I believe it would be a natural approach for Dart 2 to have these errors as soon as we have obtained the object (rather than just trying dynamically whether it happens to support methods like iterator or entries).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants