The Identifier grammar and its reference to keywords #260
Labels
status: dependency
This issue or PR depends on another (see comments)
Milestone
Comments
It is at least awkward, if not confusing – it would, I think, be less so if a better name were found for Identifier_Or_Keyword, the set of all <insert>! You might also wish to rename Identifier_Start_Character and Identifier_Part_Character to match the <insert>.
Another approach might be to rely on ANTLR predicates:
Define Keyword as a list of alternatives as now
Define, say, Available_Identifier, to be what Identifier_Or_Keyword is now plus a predicate to exclude keywords
Now define Identifier : Available_Identifier | '@' Available_Identifier | '@' Keyword; – which can be shortened to two alternatives if wished.
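To make the comparison concrete, here is a minimal Python sketch (not ANTLR) that models the current rules and the predicate-based rules above as plain string checks. The keyword set is a tiny illustrative subset, the regular expression is an ASCII-only stand-in for Identifier_Start_Character Identifier_Part_Character*, and all function names are hypothetical.

```python
import re

# Illustrative subset only; the real Keyword rule lists many more alternatives.
KEYWORDS = {"class", "if", "int", "while"}

# ASCII-only stand-in for: Identifier_Start_Character Identifier_Part_Character*
IDENTIFIER_SHAPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def has_identifier_shape(text: str) -> bool:
    """Models what Identifier_Or_Keyword matches today."""
    return IDENTIFIER_SHAPE.fullmatch(text) is not None


def is_keyword(text: str) -> bool:
    """Models the Keyword rule (a list of alternatives)."""
    return text in KEYWORDS


def is_available_identifier(text: str) -> bool:
    """Identifier shape plus a predicate that excludes keywords."""
    return has_identifier_shape(text) and not is_keyword(text)


def is_identifier_current(text: str) -> bool:
    """Identifier : Available_Identifier | '@' Identifier_Or_Keyword ;"""
    if text.startswith("@"):
        return has_identifier_shape(text[1:])
    return is_available_identifier(text)


def is_identifier_proposed(text: str) -> bool:
    """Identifier : Available_Identifier | '@' Available_Identifier | '@' Keyword ;"""
    if text.startswith("@"):
        rest = text[1:]
        return is_available_identifier(rest) or is_keyword(rest)
    return is_available_identifier(text)


if __name__ == "__main__":
    for sample in ["foo", "class", "@class", "@foo", "1abc", "@"]:
        print(sample, is_identifier_current(sample), is_identifier_proposed(sample))
```

Under these assumptions the two formulations accept the same strings, because every keyword already has the identifier shape; the proposal changes only how the rules are named and factored, not the language they describe.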
Picking good non-terminal names can sometimes be a challenge!
On 10/04/2021, at 3:04 am, Rex Jaeschke wrote:
[This issue arose from my work on replacing the lexer grammar rule alternatives of the form '<...>' in Proposal 3 of #37.]
Although the grammar syntax used below is ANTLR, this is not an ANTLR-specific issue.
In the lexical grammar, we currently have the following:
Identifier
: Available_Identifier
| '@' Identifier_Or_Keyword
;
Available_Identifier
: '<An Identifier_Or_Keyword that is not a Keyword>'
;
Identifier_Or_Keyword
: Identifier_Start_Character Identifier_Part_Character*
;
What is confusing me is the existence and use of the rule Identifier_Or_Keyword. It might just be the name it was given, but I'm not so sure.
Contributing significantly to that confusion is the text "An Identifier_Or_Keyword that is not a Keyword." It seems to me that an identifier or keyword that is not a keyword must be an identifier! If that is not the case, then the rule name implies something that isn't true.
BTW, it seems to me that the use of "keyword" refers to reserved words only and does not include contextual keywords, as they are always identifiers.
In Issue #259, I get close to this subject. Here, my concern is that if we already have a rule Keyword, which captures all and only keywords, then everything else that looks like an identifier is an Identifier, so why does the Identifier grammar have to mention keywords?
I looked at the latest C spec and its grammar for identifiers makes no mention of keywords. However, the Java spec says:
Identifier:
IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
IdentifierChars:
JavaLetter {JavaLetterOrDigit}
Note that the JLS is quite happy to describe in text that Identifier excludes three categories of things rather than expressing that using syntax.
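As a rough illustration of that "but not" wording, the JLS-style definition amounts to a shape check followed by set exclusion. The Python sketch below assumes an ASCII-only approximation of JavaLetter and JavaLetterOrDigit and an abbreviated keyword set; the names are hypothetical and nothing here is taken from the JLS beyond the quoted rule.

```python
import re

# Abbreviated, illustrative keyword set; the JLS lists many more.
JAVA_KEYWORDS = {"class", "if", "int", "while"}
BOOLEAN_LITERALS = {"true", "false"}
NULL_LITERALS = {"null"}

# ASCII-only approximation of: JavaLetter {JavaLetterOrDigit}
IDENTIFIER_CHARS = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_java_identifier(text: str) -> bool:
    """IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral."""
    if IDENTIFIER_CHARS.fullmatch(text) is None:
        return False
    excluded = JAVA_KEYWORDS | BOOLEAN_LITERALS | NULL_LITERALS
    return text not in excluded


if __name__ == "__main__":
    for sample in ["count", "$tmp", "true", "null", "class", "9lives"]:
        print(sample, is_java_identifier(sample))
```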
Does anyone else find this section of the grammar confusing?
This was referenced Jun 1, 2021
This has been addressed by PR #342.