|
| 1 | +# Replacements & Additions |
| 2 | + |
| 3 | +A “replacement & additions” file can be passed to `BuildGrammar`’s `--modification-file`/`-m`
| 4 | +option. It contains replacement and additional rules, which can be used to verify/test a grammar,
| 5 | +explore/test new language features, etc. It is a markdown (`.md`) file which follows the same
| 6 | +conventions as those used for the Standard.
| 7 | + |
| 8 | +The file is processed by `BuildGrammar` similarly to other markdown files: clauses are
| 9 | +noted, and text in ```` ```ANTLR ... ``` ```` code blocks is “parsed” (being generous here)
| 10 | +as ANTLR rules, mode commands, and comments (and nothing else; don’t go trying to set options
| 11 | +or other stuff), and added to the corresponding section in the produced grammar (`.g4`) file.
| 12 | + |
| 13 | +The rules add to, or replace, existing rules in the corresponding clause: new rules are added
| 14 | +to the end of the clause’s rule list, while replacements occur in-place. In the replacement
| 15 | +case the previous rule is also kept as a comment, and this happens regardless of the previous rule’s
| 16 | +clause, i.e. a replacement can move a rule to a different clause.
| 17 | + |
| 18 | +A rule is only a replacement if it changes the rule’s definition or clause. |
| 19 | +Rules which are neither additions nor replacements are ignored.
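
The add/replace/ignore behaviour described above can be modelled with a short Python sketch. This is an illustration only, not `BuildGrammar`’s implementation: the function name `apply_modifications`, the dictionary layout, the sample rule bodies, and the choice to append a moved rule to the end of its new clause are all assumptions made for the example.

```python
def apply_modifications(grammar, modifications):
    """Return a new clause -> entries mapping with the modifications applied.

    Both arguments map a clause number (str) to a list of (name, body) pairs.
    Entries in the result are ("rule", name, body) or ("comment", name, body).
    Hypothetical model; the real tool's data structures are not known here.
    """
    merged = {clause: [("rule", n, b) for n, b in rules]
              for clause, rules in grammar.items()}

    def locate(name):
        # Find the live (non-comment) rule with this name, in any clause.
        for clause, entries in merged.items():
            for index, (kind, n, _) in enumerate(entries):
                if kind == "rule" and n == name:
                    return clause, index
        return None

    for clause, rules in modifications.items():
        target = merged.setdefault(clause, [])
        for name, body in rules:
            found = locate(name)
            if found is None:
                target.append(("rule", name, body))   # addition: end of clause
                continue
            old_clause, index = found
            old_body = merged[old_clause][index][2]
            if old_clause == clause and old_body == body:
                continue                              # unchanged: ignored
            # Replacement: the previous rule is kept as a comment.
            merged[old_clause][index] = ("comment", name, old_body)
            if old_clause == clause:
                merged[clause].insert(index + 1, ("rule", name, body))
            else:
                target.append(("rule", name, body))   # assumed placement
    return merged


# Made-up rule bodies, loosely based on the clause numbers used in this file.
standard = {"7.3.2": [("New_Line", "New_Line_Character"), ("Other", "X")]}
changes = {
    "0.0.0": [("prog", "compilation_unit EOF")],
    "7.3.2": [("New_Line", "New_Line_Character -> skip"), ("Other", "X")],
}
merged = apply_modifications(standard, changes)
```

Here `prog` is an addition, `New_Line` is a replacement (its old body survives as a comment entry), and `Other` is ignored because neither its definition nor its clause changed.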
| 20 | + |
| 21 | +> *Note:* This means that a |
| 22 | +replacement & additions file can actually be a complete Standard section markdown file,
| 23 | +and only the changed rules will be extracted to update the grammar. So while working on, say,
| 24 | +a PR, if the original Standard’s section files are passed to `BuildGrammar` as input files and
| 25 | +the section files changed by the PR are passed as modification files, then the resultant grammar will
| 26 | +reflect the PR’s changes, with the old rules kept as comments.
| 27 | + |
| 28 | +> **Important:** Section numbers in replacement & additions files are not maintained by
| 29 | +the automatic section numbering tooling; they **must** be maintained manually.
| 30 | + |
| 31 | +--- |
| 32 | + |
| 33 | +# Verification-Only Replacements & Additions |
| 34 | + |
| 35 | +This set of replacements and additions is the bare minimum required to allow the grammar to verify and run, though |
| 36 | +it may not produce the desired parse (that requires at least the use of modes and/or |
| 37 | +lexical predicates). |
| 38 | + |
| 39 | +This set can be used as a basic check that the grammar is a correct ANTLR grammar. |
| 40 | + |
| 41 | +--- |
| 42 | + |
| 43 | +## Top Level Rule |
| 44 | + |
| 45 | +The Standard’s *compilation_unit*, as is, will allow garbage at the end of a file; this
| 46 | +rule adds an EOF requirement to ensure that the whole of the input must be a correct program.
| 47 | + |
| 48 | +> *Note: The section number makes this the first rule in the grammar, not required but it |
| 49 | +has to go somewhere…* |
| 50 | + |
| 51 | +### 0.0.0 Top Level Rule |
| 52 | + |
| 53 | +```ANTLR |
| 54 | +// [ADDED] Rule added as the start point |
| 55 | +prog: compilation_unit EOF; |
| 56 | +``` |
| 57 | +--- |
| 58 | + |
| 59 | +## Discarding Whitespace |
| 60 | + |
| 61 | +The following changes in §7.3.2, §7.3.3 and §7.3.4 add `-> skip` to the “whitespace”
| 62 | +token rules so that they are not passed to the parser. This behaviour is implicit in the
| 63 | +Standard. |
| 64 | + |
| 65 | +### 7.3.2 Line terminators |
| 66 | + |
| 67 | +```ANTLR |
| 68 | +// [SKIP] |
| 69 | +New_Line |
| 70 | + : ( New_Line_Character |
| 71 | + | '\u000D\u000A' // carriage return, line feed |
| 72 | + ) -> skip |
| 73 | + ; |
| 74 | +``` |
| 75 | + |
| 76 | +### 7.3.3 Comments |
| 77 | + |
| 78 | +```ANTLR |
| 79 | +// [SKIP] |
| 80 | +Comment |
| 81 | + : ( Single_Line_Comment |
| 82 | + | Delimited_Comment |
| 83 | + ) -> skip |
| 84 | + ; |
| 85 | +``` |
| 86 | + |
| 87 | +### 7.3.4 White space |
| 88 | + |
| 89 | +```ANTLR |
| 90 | +// [SKIP] |
| 91 | +Whitespace |
| 92 | + : ( [\p{Zs}] // any character with Unicode class Zs |
| 93 | + | '\u0009' // horizontal tab |
| 94 | + | '\u000B' // vertical tab |
| 95 | + | '\u000C' // form feed |
| 96 | + ) -> skip |
| 97 | + ; |
| 98 | +
|
| 99 | +``` |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Pre-processing directives |
| 104 | + |
| 105 | +This change causes all pre-processor directives to be discarded; they don’t need to be
| 106 | +processed to validate the grammar (processing them would exercise the *implementation* |
| 107 | +of the pre-processor, which is not part of the Standard). |
| 108 | + |
| 109 | +### 7.5.1 General |
| 110 | + |
| 111 | +```ANTLR |
| 112 | +// [CHANGE] Discard pre-processor directives |
| 113 | +PP_Directive |
| 114 | + : (PP_Start PP_Kind PP_New_Line) -> skip |
| 115 | + ; |
| 116 | +``` |
| 117 | + |
| 118 | +--- |
| 119 | + |
| 120 | +## Mutual Left Recursion Removal |
| 121 | + |
| 122 | +All but one of the mutual left recursive (MLR) groups have been removed from the grammar (and we should
| 123 | +strive not to introduce any new ones). |
| 124 | + |
| 125 | +This change resolves the one remaining MLR group by inlining some of the non-terminal |
| 126 | +alternatives in *primary_no_array_creation_expression*. |
| 127 | + |
| 128 | +Non-terminals that are inlined are commented out and the inlined body is indented. |
| 129 | + |
| 130 | +This change has not been made to the Standard itself as it makes *primary_no_array_creation_expression* |
| 131 | +“uglier” and would obfuscate somewhat the description in the Standard. |
| 132 | + |
| 133 | +As MLR is not supported by ANTLR, without this change the grammar would be rejected.
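
To see why inlining removes the problem, the following Python sketch models a toy grammar and checks for mutual left recursion before and after inlining. It is a deliberately simplified illustration: the toy rules are not the Standard’s rules, the helper names are hypothetical, and the check only looks at the first symbol of each alternative (it ignores optional or empty leading elements, which a real check would have to handle).

```python
def leftmost_refs(grammar, rule):
    """Rule names that appear as the first symbol of an alternative of `rule`."""
    return {alt[0] for alt in grammar[rule] if alt and alt[0] in grammar}


def mutual_left_recursive(grammar):
    """Rules that can reach themselves in leftmost position via another rule."""
    found = set()
    for start in grammar:
        seen = set()
        # Skip the direct self-reference: direct left recursion is not MLR.
        stack = [r for r in leftmost_refs(grammar, start) if r != start]
        while stack:
            rule = stack.pop()
            if rule == start:
                found.add(start)
                break
            if rule in seen:
                continue
            seen.add(rule)
            stack.extend(leftmost_refs(grammar, rule))
    return found


def inline(grammar, rule, target):
    """Replace each alternative of `rule` that is just `target` by target's body."""
    new_alts = []
    for alt in grammar[rule]:
        if alt == [target]:
            new_alts.extend(grammar[target])
        else:
            new_alts.append(alt)
    return {**grammar, rule: new_alts}


# Toy grammar: `primary` refers to rules which themselves start with `primary`.
toy = {
    "primary": [["ID"], ["member_access"], ["invocation"]],
    "member_access": [["primary", ".", "ID"]],
    "invocation": [["primary", "(", ")"]],
}
before = mutual_left_recursive(toy)

fixed = inline(inline(toy, "primary", "member_access"), "primary", "invocation")
after = mutual_left_recursive(fixed)
```

In the toy grammar all three rules form one MLR group. After inlining, `primary` refers only to itself in leftmost position, which is direct left recursion of the kind ANTLR 4 accepts, and the inlined non-terminals remain in the grammar but no longer take part in a cycle.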
| 134 | + |
| 135 | +### 12.7.1 General |
| 136 | + |
| 137 | +```ANTLR |
| 138 | +// [CHANGE] This removes a mutual left-recursion group which we have (currently?) |
| 139 | +// [CHANGE] decided to leave in the Standard. Without this change the grammar will fail. |
| 140 | +primary_no_array_creation_expression |
| 141 | + : literal |
| 142 | + | simple_name |
| 143 | + | parenthesized_expression |
| 144 | + // | member_access |
| 145 | + | primary_no_array_creation_expression '.' identifier type_argument_list? |
| 146 | + | array_creation_expression '.' identifier type_argument_list? |
| 147 | + | predefined_type '.' identifier type_argument_list? |
| 148 | + | qualified_alias_member '.' identifier type_argument_list? |
| 149 | + // | invocation_expression |
| 150 | + | primary_no_array_creation_expression '(' argument_list? ')' |
| 151 | + | array_creation_expression '(' argument_list? ')' |
| 152 | + // | element_access and pointer_element_access (unsafe code support) |
| 153 | + | primary_no_array_creation_expression '[' argument_list ']' |
| 154 | + | this_access |
| 155 | + | base_access |
| 156 | + // | post_increment_expression |
| 157 | + | primary_no_array_creation_expression '++' |
| 158 | + | array_creation_expression '++' |
| 159 | + // | post_decrement_expression |
| 160 | + | primary_no_array_creation_expression '--' |
| 161 | + | array_creation_expression '--' |
| 162 | + | object_creation_expression |
| 163 | + | delegate_creation_expression |
| 164 | + | anonymous_object_creation_expression |
| 165 | + | typeof_expression |
| 166 | + | sizeof_expression |
| 167 | + | checked_expression |
| 168 | + | unchecked_expression |
| 169 | + | default_value_expression |
| 170 | + | nameof_expression |
| 171 | + | anonymous_method_expression |
| 172 | + // | pointer_member_access // unsafe code support |
| 173 | + | primary_no_array_creation_expression '->' identifier type_argument_list? |
| 174 | + | array_creation_expression '->' identifier type_argument_list? |
| 175 | + // | pointer_element_access // unsafe code support |
| 176 | + // covered by element_access replacement above |
| 177 | + ; |
| 178 | +``` |