diff --git a/standard/expressions.md b/standard/expressions.md
index d87c5338b..2c7bc5a03 100644
--- a/standard/expressions.md
+++ b/standard/expressions.md
@@ -4161,7 +4161,7 @@ equality_expression
     ;
 ```

-> *Note*: Lookup for the right operand of the `is` operator must first test as a *type*, then as an *expression* which may span multiple tokens. In the case where the operand is an *expreesion*, the pattern expression must have precedence at least as high as *shift_expression*. *end note*
+> *Note*: Lookup for the right operand of the `is` operator must first test as a *type*, then as an *expression* which may span multiple tokens. In the case where the operand is an *expression*, the pattern expression must have precedence at least as high as *shift_expression*. *end note*

 The `is` operator is described in [§12.12.12](expressions.md#121212-the-is-operator) and the `as` operator is described in [§12.12.13](expressions.md#121213-the-as-operator).

diff --git a/standard/lexical-structure.md b/standard/lexical-structure.md
index 49c0a1008..cdc01ce91 100644
--- a/standard/lexical-structure.md
+++ b/standard/lexical-structure.md
@@ -52,7 +52,18 @@ Every compilation unit in a C# program shall conform to the *compilation_unit*

 ### 6.2.5 Grammar ambiguities

-The productions for *simple_name* ([§12.8.4](expressions.md#1284-simple-names)) and *member_access* ([§12.8.7](expressions.md#1287-member-access)) can give rise to ambiguities in the grammar for expressions.
+The productions for:
+
+- *simple_name* ([§12.8.4](expressions.md#1284-simple-names)),
+- *member_access* ([§12.8.7](expressions.md#1287-member-access)),
+- *null_conditional_member_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)),
+- *dependent_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)),
+- *base_access* ([§12.8.15](expressions.md#12815-base-access)) and
+- *pointer_member_access* ([§23.6.3](unsafe-code.md#2363-pointer-member-access));
+
+(the “disambiguated productions”) can give rise to ambiguities in the grammar for expressions.
+
+These productions occur in contexts where a value can occur in an expression, and have one or more alternatives that end with the grammar “`identifier type_argument_list?`”. It is the optional *type_argument_list* which results in the possible ambiguity.

 > *Example*: The statement:
 >
@@ -65,16 +76,15 @@ The productions for *simple_name* ([§12.8.4](expressions.md#1284-simple-names))
 >
 > *end example*

-If a sequence of tokens can be parsed (in context) as a *simple_name* ([§12.8.4](expressions.md#1284-simple-names)), *member_access* ([§12.8.7](expressions.md#1287-member-access)), or *pointer_member_access* ([§23.6.3](unsafe-code.md#2363-pointer-member-access)) ending with a *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), the token immediately following the closing `>` token is examined, to see if it is
+If a sequence of tokens can be parsed, in context, as one of the disambiguated productions including an optional *type_argument_list* ([§8.4.2](types.md#842-type-arguments)), then the token immediately following the closing `>` token shall be examined and if it is:

-- One of `( ) ] } : ; , . ? == != | ^ && || & [`; or
-- One of the relational operators `< <= >= is as`; or
-- A contextual query keyword appearing inside a query expression; or
-- In certain contexts, *identifier* is treated as a disambiguating token. Those contexts are where the sequence of tokens being disambiguated is immediately preceded by one of the keywords `is`, `case` or `out`, or arises while parsing the first element of a tuple literal (in which case the tokens are preceded by `(` or `:` and the identifier is followed by a `,`) or a subsequent element of a tuple literal.
+- one of `( ) ] } : ; , . ? == != | ^ && || & [`; or
+- one of the relational operators `< <= >= is as`; or
+- a contextual query keyword appearing inside a query expression.

-If the following token is among this list, or an identifier in such a context, then the *type_argument_list* is retained as part of the *simple_name*, *member_access* or *pointer_member-access* and any other possible parse of the sequence of tokens is discarded. Otherwise, the *type_argument_list* is not considered to be part of the *simple_name*, *member_access* or *pointer_member_access*, even if there is no other possible parse of the sequence of tokens.
+then the *type_argument_list* shall be retained as part of the disambiguated production and any other possible parse of the sequence of tokens discarded. Otherwise, the tokens parsed as a *type_argument_list* shall not be considered to be part of the disambiguated production, even if there is no other possible parse of those tokens.

-> *Note*: These rules are not applied when parsing a *type_argument_list* in a *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)). *end note*
+> *Note*: These disambiguation rules shall not be applied when parsing other productions even if they similarly end in “`identifier type_argument_list?`”; such productions shall be parsed as normal. Examples include: *namespace_or_type_name* ([§7.8](basic-concepts.md#78-namespace-and-type-names)); *named_entity* ([§12.8.23](expressions.md#12823-the-nameof-operator)); *null_conditional_projection_initializer* ([§12.8.8](expressions.md#1288-null-conditional-member-access)); and *qualified_alias_member* ([§14.8.1](namespaces.md#1481-general)). *end note*
@@ -124,7 +134,7 @@ If the following token is among this list, or an identifier in such a context, t
 >
 > *end example*

-When recognising a *relational_expression* ([§12.12.1](expressions.md#12121-general)) if both the “*relational_expression* `is` *type*” and “*relational_expression* `is` *constant_pattern*” alternatives are applicable, and *type* resolves to an accessible type, then the “*relational_expression* `is` *type*” alternative shall be chosen.
+When recognising a *relational_expression* ([§12.12.1](expressions.md#12121-general)) if both the “*relational_expression* `is` *type*” and “*relational_expression* `is` *pattern*” alternatives are applicable, and *type* resolves to an accessible type, then the “*relational_expression* `is` *type*” alternative shall be chosen.

 ## 6.3 Lexical analysis

@@ -189,7 +199,7 @@ Line terminators divide the characters of a C# compilation unit into lines.
 ```ANTLR
 New_Line
     : New_Line_Character
-    | '\u000D\u000A'    // carriage return, line feed
+    | '\u000D\u000A'    // carriage return, line feed
     ;
 ```
@@ -262,7 +272,7 @@ fragment Input_Character
     // anything but New_Line_Character
     : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029')
     ;
-
+
 fragment New_Line_Character
     : '\u000D'  // carriage return
     | '\u000A'  // line feed
@@ -270,11 +280,11 @@ fragment New_Line_Character
     | '\u2028'  // line separator
     | '\u2029'  // paragraph separator
     ;
-
+
 fragment Delimited_Comment
     : '/*' Delimited_Comment_Section* ASTERISK+ '/'
     ;
-
+
 fragment Delimited_Comment_Section
     : SLASH
     | ASTERISK* Not_Slash_Or_Asterisk
@@ -428,7 +438,7 @@ fragment Available_Identifier
 fragment Escaped_Identifier
     // Includes keywords and contextual keywords prefixed by '@'.
     // See note below.
-    : '@' Basic_Identifier
+    : '@' Basic_Identifier
     ;

 fragment Basic_Identifier
@@ -664,16 +674,16 @@ fragment Decimal_Integer_Literal
 fragment Decorated_Decimal_Digit
     : '_'* Decimal_Digit
     ;
-
+
 fragment Decimal_Digit
     : '0'..'9'
     ;
-
+
 fragment Integer_Type_Suffix
     : 'U' | 'u' | 'L' | 'l' |
       'UL' | 'Ul' | 'uL' | 'ul' | 'LU' | 'Lu' | 'lU' | 'lu'
     ;
-
+
 fragment Hexadecimal_Integer_Literal
     : ('0x' | '0X') Decorated_Hex_Digit+ Integer_Type_Suffix?
     ;
@@ -681,11 +691,11 @@ fragment Hexadecimal_Integer_Literal
 fragment Decorated_Hex_Digit
     : '_'* Hex_Digit
     ;
-
+
 fragment Hex_Digit
     : '0'..'9' | 'A'..'F' | 'a'..'f'
     ;
-
+
 fragment Binary_Integer_Literal
     : ('0b' | '0B') Decorated_Binary_Digit+ Integer_Type_Suffix?
     ;
@@ -693,7 +703,7 @@ fragment Binary_Integer_Literal
 fragment Decorated_Binary_Digit
     : '_'* Binary_Digit
     ;
-
+
 fragment Binary_Digit
     : '0' | '1'
     ;
@@ -723,14 +733,14 @@ To permit the smallest possible `int` and `long` values to be written as integer
 > 1_2__3___4____5      // decimal, int
 > _123                 // not a numeric literal; identifier due to leading _
 > 123_                 // invalid; no trailing _allowed
->
+>
 > 0xFf                 // hex, int
 > 0X1b_a0_44_fEL       // hex, long
 > 0x1ade_3FE1_29AaUL   // hex, ulong
 > 0x_abc               // hex, int
 > _0x123               // not a numeric literal; identifier due to leading _
 > 0xabc_               // invalid; no trailing _ allowed
->
+>
 > 0b101                // binary, int
 > 0B1001_1010u         // binary, uint
 > 0b1111_1111_0000UL   // binary, ulong
@@ -774,7 +784,7 @@ If no *Real_Type_Suffix* is specified, the type of the *Real_Literal* is `double
 - A real literal suffixed by `D` or `d` is of type `double`.
   > *Example*: The literals `1d`, `1.5d`, `1e10d`, and `123.456D` are all of type `double`. *end example*
 - A real literal suffixed by `M` or `m` is of type `decimal`.
-  > *Example*: The literals `1m`, `1.5m`, `1e10m`, and `123.456M` are all of type `decimal`. *end example*
+  > *Example*: The literals `1m`, `1.5m`, `1e10m`, and `123.456M` are all of type `decimal`. *end example*
   This literal is converted to a `decimal` value by taking the exact value, and, if necessary, rounding to the nearest representable value using banker’s rounding ([§8.3.8](types.md#838-the-decimal-type)). Any scale apparent in the literal is preserved unless the value is rounded.
   > *Note*: Hence, the literal `2.900m` will be parsed to form the `decimal` with sign `0`, coefficient `2900`, and scale `3`. *end note*

@@ -812,24 +822,24 @@ A character literal represents a single character, and consists of a character i
 Character_Literal
     : '\'' Character '\''
     ;
-
+
 fragment Character
     : Single_Character
     | Simple_Escape_Sequence
     | Hexadecimal_Escape_Sequence
     | Unicode_Escape_Sequence
     ;
-
+
 fragment Single_Character
     // anything but ', \, and New_Line_Character
     : ~['\\\u000D\u000A\u0085\u2028\u2029]
     ;
-
+
 fragment Simple_Escape_Sequence
     : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' |
       '\\f' | '\\n' | '\\r' | '\\t' | '\\v'
     ;
-
+
 fragment Hexadecimal_Escape_Sequence
     : '\\x' Hex_Digit Hex_Digit? Hex_Digit? Hex_Digit?
     ;
@@ -890,11 +900,11 @@ String_Literal
     : Regular_String_Literal
     | Verbatim_String_Literal
     ;
-
+
 fragment Regular_String_Literal
     : '"' Regular_String_Literal_Character* '"'
     ;
-
+
 fragment Regular_String_Literal_Character
     : Single_Regular_String_Literal_Character
     | Simple_Escape_Sequence
@@ -910,16 +920,16 @@ fragment Single_Regular_String_Literal_Character
 fragment Verbatim_String_Literal
     : '@"' Verbatim_String_Literal_Character* '"'
     ;
-
+
 fragment Verbatim_String_Literal_Character
     : Single_Verbatim_String_Literal_Character
     | Quote_Escape_Sequence
     ;
-
+
 fragment Single_Verbatim_String_Literal_Character
     : ~["]     // anything but quotation mark (U+0022)
     ;
-
+
 fragment Quote_Escape_Sequence
     : '""'
     ;
@@ -1102,7 +1112,7 @@ Pre-processing directives are not part of the syntactic grammar of C#. However,
 > #endif
 > #if B
 >     void H() {}
-> #else
+> #else
 >     void I() {}
 > #endif
 > }
@@ -1155,11 +1165,11 @@ Pre-processing expressions can occur in `#if` and `#elif` directives. The operat
 fragment PP_Expression
     : PP_Whitespace? PP_Or_Expression PP_Whitespace?
     ;
-
+
 fragment PP_Or_Expression
     : PP_And_Expression (PP_Whitespace? '||' PP_Whitespace? PP_And_Expression)*
     ;
-
+
 fragment PP_And_Expression
     : PP_Equality_Expression (PP_Whitespace? '&&' PP_Whitespace?
       PP_Equality_Expression)*
@@ -1169,12 +1179,12 @@ fragment PP_Equality_Expression
     : PP_Unary_Expression (PP_Whitespace? ('==' | '!=') PP_Whitespace?
       PP_Unary_Expression)*
     ;
-
+
 fragment PP_Unary_Expression
     : PP_Primary_Expression
     | '!' PP_Whitespace? PP_Unary_Expression
     ;
-
+
 fragment PP_Primary_Expression
     : TRUE
     | FALSE
@@ -1282,15 +1292,15 @@ fragment PP_Conditional
 fragment PP_If_Section
     : 'if' PP_Whitespace PP_Expression
     ;
-
+
 fragment PP_Elif_Section
     : 'elif' PP_Whitespace PP_Expression
     ;
-
+
 fragment PP_Else_Section
     : 'else'
     ;
-
+
 fragment PP_Endif
     : 'endif'
     ;
@@ -1488,11 +1498,11 @@ fragment PP_Line_Indicator
     | DEFAULT
     | 'hidden'
     ;
-
+
 fragment PP_Compilation_Unit_Name
     : '"' PP_Compilation_Unit_Name_Character* '"'
     ;
-
+
 fragment PP_Compilation_Unit_Name_Character
     // Any Input_Character except "
     : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029' | '"')