@@ -4,7 +4,7 @@ Author: Bob Nystrom
4
4
5
5
Status: In-progress
6
6
7
- Version 0.3 (see [CHANGELOG](#CHANGELOG) at end)
7
+ Version 0.4 (see [CHANGELOG](#CHANGELOG) at end)
8
8
9
9
Experiment flag: unquoted-imports
10
10
@@ -118,7 +118,7 @@ The way I think about the proposed syntax is that relative imports are
118
118
* physical* in that they specify the actual relative path on the file system from
119
119
the current library to another library * file* . Because those are physical file
120
120
paths, they use string literals and file extensions as they do today. SDK and
121
- package imports are * logical* in that you don't know where the library your
121
+ package imports are * logical* in that you don't know where the library you're
122
122
importing lives on your disk. What you know is its *logical name* and the
123
123
relative location of the library you want inside that package. Since these are
124
124
abstract references to a * library* , they are unquoted and omit the file
@@ -160,7 +160,7 @@ the reasons for the choices this proposal makes:
160
160
161
161
### Path separator
162
162
163
- An import shorthand syntax that only supported a single identifier would work
163
+ A package shorthand syntax that only supported a single identifier would work
164
164
for packages like ` test ` and ` args ` that only expose a single library, but
165
165
would fail for even very common libraries like ` package:flutter/material.dart ` .
166
166
So we need some notion of a package name and a path within that package.
@@ -220,27 +220,30 @@ import flutter/material;
220
220
Is the ` flutter/material ` part a single token or three (` flutter ` , ` / ` , and
221
221
` material ` )? The main advantage of tokenizing it as a single monolithic token is
222
222
that we could potentially allow characters or identifiers in there that aren't
223
- otherwise valid Dart. For example, we could let you use reserved words as path
224
- segments :
223
+ otherwise valid Dart. For example, we could let you use hyphens as word
224
+ separators as in :
225
225
226
226
``` dart
227
- import weird_package/for/if/ok;
227
+ import weird-package/but-ok;
228
228
```
229
229
230
230
The disadvantage is that the tokenizer doesn't generally have enough context to
231
- know when it should tokenize ` foo/bar ` as a single import path token versus
231
+ know when it should tokenize ` foo/bar ` as a single package path token versus
232
232
three tokens that are presumably dividing two variables named ` foo ` and ` bar ` .
233
233
234
- Unlike Lasse's [ earlier proposal] [ lasse ] , this proposal does * not* tokenize an
235
- import path as a single token. Instead, it's tokenized using Dart's current
234
+ Unlike Lasse's [earlier proposal][lasse], this proposal does *not* tokenize a
235
+ package path as a single token. Instead, it's tokenized using Dart's current
236
236
lexical grammar.
237
237
238
- This means you can't have a path segment that's a reserved word or is otherwise
239
- not a valid Dart identifier. Fortunately, our published guidance has * always*
240
- told users that [ package names] [ name guideline ] and [ directories] [ directory
241
- guideline] should be valid Dart identifiers. Pub will complain if you try to
242
- publish a package whose name isn't a valid identifier. Likewise, the linter will
243
- flag directory or library names that aren't identifiers.
238
+ This means you can't have a path segment that uses some combination of
239
+ characters that isn't currently a single token in Dart, like ` hyphen-separated `
240
+ or ` 123LeadingDigits ` . A path component must be an identifier (which may be a
241
+ reserved word or built-in identifier, discussed below). Fortunately, our
242
+ published guidance has *always* told users that [package names][name guideline]
243
+ and [directories][directory guideline] should be valid Dart identifiers. Pub
244
+ will complain if you try to publish a package whose name isn't a valid
245
+ identifier. Likewise, the linter will flag directory or file names that aren't
246
+ identifiers.
244
247
245
248
[ name guideline ] : https://dart.dev/tools/pub/pubspec#name
246
249
[ directory guideline ] : https://dart.dev/effective-dart/style#do-name-packages-and-file-system-entities-using-lowercase-with-underscores
@@ -258,18 +261,49 @@ in a large corpus of pub packages and open source widgets:
258
261
69 ( 0.010%): dotted with non-identifiers =
259
262
```
260
263
261
- This splits every "package:" import's path into segments separated by ` / ` . Then
262
- for each segment, it reports whether the segment is a valid identifier, a
263
- built-in identifier like ` dynamic ` or ` covariant ` , etc. Almost all segments are
264
- either valid identifiers, or dotted identifiers where each subcomponent is a
265
- valid identifier.
264
+ This splits every "package:" path into segments separated by ` / ` . Then it splits
265
+ segments into components separated by `.`. For each component, the analysis
266
+ reports whether the component is a valid identifier, a built-in identifier like
267
+ ` dynamic ` or ` covariant ` , or a reserved word like ` for ` or ` if ` .
266
268
267
- (For the very small number that aren't, they can continue to use the old quoted
268
- "package:" import syntax to import the library.)
269
+ Components that are not some kind of identifier (regular, reserved, or built-in)
270
+ are vanishingly rare. In those few cases, if a user can't simply rename the
271
+ file, they can continue to use the old quoted "package:" syntax to refer to the
272
+ file.
269
273
270
- I think this approach is much simpler than trying to add special lexing rules.
271
- It's consistent with how Java, C# and other languages parse their imports. It
272
- does mean users can do silly things like:
274
+ ### Reserved words and semi-reserved words
275
+
276
+ One confusing area of Dart that the previous table hints at is that Dart has
277
+ several categories of identifiers that vary in how user-accessible they are:
278
+
279
+ * Reserved words like ` for ` and ` class ` can never be used by a user as a
280
+ regular identifier in any context.
281
+
282
+ * Built-in identifiers like ` abstract ` and ` interface ` can't be used as * type*
283
+ names but can be used as other kinds of identifiers.
284
+
285
+ * Contextual keywords like ` await ` and ` show ` behave like keywords in some
286
+ specific contexts but are usable as regular identifiers everywhere else.
287
+
288
+ This leads to confusion about which of these flavors of identifiers can be used
289
+ as package paths. Which of these, if any, are valid:
290
+
291
+ ``` dart
292
+ import if/else;
293
+ import abstract/interface;
294
+ import show/hide;
295
+ ```
296
+
297
+ Many Dart users (including experts, some of whom may be members of the Dart
298
+ language team) don't know the full list of reserved or semi-reserved words. We
299
+ don't want them to run into problems determining which identifiers work in
300
+ package paths. To that end, we allow * all* identifiers, including reserved
301
+ words, built-in identifiers, and contextual keywords, as path components.
302
+
303
+ ### Whitespace and comments
304
+
305
+ If we don't use any special tokenizing rules for the path, that suggests that
306
+ whitespace and comments are allowed between the tokens as in:
273
307
274
308
``` dart
275
309
import strange /* comment */ . but
@@ -281,7 +315,37 @@ import strange /* comment */ . but
281
315
fine;
282
316
```
283
317
284
- But they can also choose to * not* do that.
318
+ This wouldn't cause any problems for a Dart implementation. It would simply
319
+ discard the whitespace and comments as it does elsewhere and the resulting path
320
+ is ` strange.but/another/fine ` .
321
+
322
+ However, it likely causes problems for Dart * users* and other simpler tools and
323
+ scripts that work with Dart code. In particular, we often see homegrown tools
324
+ that want to "parse" a Dart file to find its package references and traverse the
325
+ dependency graph. While these tools ideally should use a full Dart parser (like
326
+ the one in the [analyzer package][], which is freely available), the reality is
327
+ that users often cobble together simple scripts using regex to do this kind of
328
+ parsing, or they need to write these tools in a language other than Dart. In
329
+ those cases, if the package path happens to contain whitespace or comments, the
330
+ tool will likely silently fail to recognize the package path.
331
+
332
+ [ analyzer package ] : https://pub.dev/packages/analyzer
333
+
334
+ Also, we find no compelling * use* for whitespace and comments inside package
335
+ paths. To that end, this proposal makes it an error. All of the tokens in the
336
+ path must be directly adjacent with no whitespace, newlines, or comments between
337
+ them. The previous import is an error. However, we still allow comments in or
338
+ after the directives outside of the path. These are all valid:
339
+
340
+ ``` dart
341
+ import /* Weird but OK. */ some/path;
342
+ export some/path; // Hi there.
343
+ part some/path // Before the semicolon? Really?
344
+ ;
345
+ ```
346
+
347
+ The syntax that results from the above few sections is simple to tokenize and
348
+ parse while looking like a single opaque "unquoted string" to users and tools.
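
As a rough illustration of why adjacency helps simple tools, here is a minimal sketch of the kind of regex-based scan the section describes. The names `component`, `unquotedDirective`, and `findUnquotedPaths` are hypothetical and not part of the proposal. The sketch only handles `import` and `export` directives that start a line and have no comment before the path, so it is not a substitute for a real parser.

```dart
// Illustrative sketch only: a regex-based scan, not a real Dart parser.
// All names below are hypothetical and do not come from the proposal.

/// One path component: lexically an identifier (reserved words included).
const component = r'[A-Za-z_$][A-Za-z0-9_$]*';

/// Matches `import` or `export` at the start of a line, followed by an
/// unquoted path whose tokens are directly adjacent to each other.
final unquotedDirective = RegExp(
  '^\\s*(?:import|export)\\s+(${component}(?:[./]${component})*)',
  multiLine: true,
);

/// Returns the unquoted package paths found in [source], in order.
/// Quoted directives are skipped because they start with a quote character.
List<String> findUnquotedPaths(String source) => [
      for (final match in unquotedDirective.allMatches(source)) match.group(1)!,
    ];
```

Because the path cannot contain whitespace or comments, a single capture group is enough to recover it as one string.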
285
349
286
350
## Syntax
287
351
@@ -291,27 +355,33 @@ We add a new rule and hang it off the existing `uri` rule already used by import
291
355
and export directives:
292
356
293
357
```
294
- uri ::= stringLiteral | packagePath
295
- packagePath ::= packagePathSegment ( '/' packagePathSegment )*
296
- packagePathSegment ::= dottedIdentifierList
297
- dottedIdentifierList ::= identifier ('.' identifier)*
358
+ uri ::= stringLiteral | packagePath
359
+ packagePath ::= pathSegment ( '/' pathSegment )*
360
+ pathSegment ::= segmentComponent ( '.' segmentComponent )*
361
+ segmentComponent ::= identifier
362
+ | ⟨RESERVED_WORD⟩
363
+ | ⟨BUILT_IN_IDENTIFIER⟩
364
+ | ⟨OTHER_IDENTIFIER⟩
298
365
```
299
366
300
- An import or export can continue to use a ` stringLiteral ` for the quoted form
301
- (which is what they will do for relative imports). But they can also use a
302
- ` packagePath ` , which is a slash-separated series of segments, each of which is a
303
- series of dot-separated identifiers. * (The ` dottedIdentifierList ` rule is
304
- already in the grammar and is shown here for clarity.)*
367
+ It is a compile-time error if any whitespace, newlines, or comments occur
368
+ between any of the ` segmentComponent ` , ` / ` , or ` . ` tokens in a ` packagePath ` .
369
+ * In other words, there can be nothing except the terminals themselves from the
370
+ first ` segmentComponent ` in the ` packagePath ` to the last.*
371
+
372
+ * An import, export, or part directive can continue to use a ` stringLiteral ` for
373
+ the quoted form (which is what they will do for relative references). But they
374
+ can also use a ` packagePath ` , which is a slash-separated series of segments,
375
+ each of which is a series of dot-separated components.*
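
The grammar and the adjacency rule together can be approximated by a small string check. This is a minimal sketch under the assumption that a component is anything lexically shaped like an identifier (which covers reserved words and built-in identifiers too). `isPackagePath` is a hypothetical helper name, not something the proposal defines.

```dart
// Illustrative sketch only; `isPackagePath` is a hypothetical helper.

/// A single component: identifier-shaped, with nothing before or after it.
final _componentPattern = RegExp(r'^[A-Za-z_$][A-Za-z0-9_$]*$');

/// Returns true if [text] is a slash-separated series of segments, each of
/// which is a dot-separated series of components, with no whitespace or
/// comments anywhere inside.
bool isPackagePath(String text) => text
    .split('/')
    .every((segment) => segment.split('.').every(_componentPattern.hasMatch));
```

For instance, `isPackagePath('if/else')` is true, while a path containing a hyphen, a leading digit, or any whitespace is rejected.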
305
376
306
377
### Part directive lookahead
307
378
308
- * There are two directives for working with part files, ` part ` and ` part of ` . The
309
- ` of ` identifier is not a reserved word in Dart. This means that when the parser
310
- sees ` part of ` , it doesn't immediately know if it is looking at a ` part `
311
- directive followed by an unquoted identifier like ` part of; ` or `part
312
- of.some/other.thing;` versus a ` part of` directive like ` part of thing;` or
313
- ` part of 'uri.dart'; ` It must lookahead past the ` of ` identifier to see if the
314
- next token is ` ; ` , ` . ` , ` / ` , or another identifier.*
379
+ * There are two directives for working with part files, ` part ` and ` part of ` .
380
+ This means that when the parser sees ` part of ` , it doesn't immediately know if
381
+ it is looking at a ` part ` directive followed by an unquoted identifier like
382
+ ` part of; ` or ` part of.some/other.thing; ` versus a ` part of ` directive like
383
+ `part of thing;` or `part of 'uri.dart';`. It must look ahead past the `of`
384
+ identifier to see if the next token is ` ; ` , ` . ` , ` / ` , or another identifier.*
315
385
316
386
* This may add some complexity to parsing, but should be minor. Dart's grammar
317
387
has other places that require much more (sometimes unbounded) lookahead.*
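
The lookahead decision described above can be sketched as a one-token check. This is only an illustration under simplifying assumptions: tokens are represented as plain lexeme strings, and `PartDirective` and `classifyAfterOf` are hypothetical names, not part of any real Dart parser.

```dart
// Illustrative sketch only; names are hypothetical.

/// Which directive the parser is looking at after seeing `part of`.
enum PartDirective { part, partOf }

/// Classifies the directive from the lexeme of the token that follows `of`.
///
/// If `of` is followed by `;`, `.`, or `/`, then `of` is the first component
/// of an unquoted path in a `part` directive. Anything else (an identifier or
/// a string literal) means this is a `part of` directive.
PartDirective classifyAfterOf(String nextLexeme) {
  const continuesPath = {';', '.', '/'};
  return continuesPath.contains(nextLexeme)
      ? PartDirective.part
      : PartDirective.partOf;
}
```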
@@ -322,23 +392,20 @@ The semantics of the new syntax are defined by taking the `packagePath` and
322
392
converting it to a string. The directive then behaves as if the user had written
323
393
a string literal containing that string. The process is:
324
394
325
- 1 . Let the * segment* for a ` packagePathSegment ` be a string defined by the
326
- ordered concatenation of the ` identifier ` and ` . ` terminals in the
327
- ` packagePathSegment ` , with all whitespace and comments removed. * So if
328
- ` packagePathSegment ` is ` a . b /* comment */ . c ` , then its * segment* is
395
+ 1 . Let the * segment* for a ` pathSegment ` be a string defined by the ordered
396
+ concatenation of the ` segmentComponent ` and ` . ` terminals in the
397
+ ` pathSegment ` . * So if ` pathSegment ` is ` a.b.c ` , then its * segment* is
329
398
"a.b.c".*
330
399
331
- 2 . Let * segments* be an ordered list of the segments of each
332
- ` packagePathSegment ` in ` packagePath ` . * In other words, this and the
333
- preceding step take the ` packagePath ` and convert it to a list of segment
334
- strings while discarding whitespace and comments. So if ` packagePathSegment `
335
- is ` a . b /* comment */ / c / d . e ` , then * segments* is [ "a.b", "c",
336
- "d.e"] .*
400
+ 2 . Let * segments* be an ordered list of the segments of each ` pathSegment ` in
401
+ ` packagePath ` . * In other words, this and the preceding step take the
402
+ ` packagePath ` and convert it to a list of segment strings. So if
403
+ `packagePath` is `a.b/c/d.e`, then *segments* is ["a.b", "c", "d.e"].*
337
404
338
405
3 . If the first segment in * segments* is "dart":
339
406
340
- 1 . It is a compile error if there are no subsequent segments. * There's no
341
- "dart: dart " or "package: dart /dart.dart" library. We reserve the right
407
+ 1 . It is a compile-time error if there are no subsequent segments. * There's
408
+ no "dart:dart" or "package:dart/dart.dart" library. We reserve the right
342
409
to use ` import dart; ` in the future to mean something useful.*
343
410
344
411
2 . Let * path* be the concatenation of the remaining segments, separated
@@ -354,14 +421,14 @@ a string literal containing that string. The process is:
354
421
355
422
1 . Let * name* be the segment.
356
423
357
- 2 . Let * path* be the last identifier in the segment. * If the segment is
358
- only a single identifier , this is the entire segment. Otherwise, it's
359
- the last identifier after the last ` . ` . So in ` foo ` , * path * is ` foo ` .
360
- In ` foo.bar.baz ` , it's ` baz ` .*
424
+ 2 . Let * path* be the last ` segmentComponent ` in the segment. * If the
425
+ segment is only a single ` segmentComponent ` , this is the entire segment.
426
+ Otherwise, it's the component after the last `.`. So in `foo`,
427
+ *path* is `foo`. In `foo.bar.baz`, it's `baz`.*
361
428
362
429
3. The URI is "package:*name*/*path*.dart". *So `import test;` desugars to
363
- ` import "package:test/test.dart"; ` , and ` import server.api; ` desugars
364
- to ` import "package:server.api/api.dart"; ` .*
430
+ ` import "package:test/test.dart"; ` , and ` import server.api; ` desugars to
431
+ ` import "package:server.api/api.dart"; ` .*
365
432
366
433
5 . Else:
367
434
@@ -463,7 +530,7 @@ this proposal's semantics. In other words, `part of foo.bar;` is part of the
463
530
library at `package:foo.bar/bar.dart`, not part of the library with name `foo.bar`.
464
531
465
532
Users affected by the breakage can and should update their ` part of ` directive
466
- to point to the URI of the library that the file is a part, using either the
533
+ to point to the URI of the library that the file is a part of, using either the
467
534
quoted or unquoted syntax.
468
535
469
536
### Language versioning
@@ -501,6 +568,12 @@ new unquoted style whenever an existing directive could use it.
501
568
502
569
## Changelog
503
570
571
+ ### 0.4
572
+
573
+ - Allow reserved words and built-in identifiers as path components (#3984).
574
+
575
+ - Disallow whitespace and comments inside package paths (#3983).
576
+
504
577
### 0.3
505
578
506
579
- Address breaking change in ` part of ` directives with library names.