Name syntax should align with XML #519

Closed
gibson042 opened this issue Nov 8, 2023 · 6 comments
Labels
LDML45, resolve-candidate, syntax

Comments

@gibson042
Collaborator

As noted elsewhere (e.g., #399 (comment)), it's not good to almost align with an external specification. name is already close to XML Name and even mentions it, and that gap should just be fully closed.

Doing so would have the following consequences:

  • unquoted would simplify to either 1*name-char (matching XML Nmtoken) or 1*unquoted-char (if expanding to e.g. include + for parity with -).
  • The function sigil would need to change from : to something that does not overlap with unquoted or other syntax (such as } for closing the expression, | for quoting, $ for variables, and any sigil indicating spannable open/close) and doesn't mislead developers—starting with !"#$%&'()*+,/;<=>?@[\]^`{|}~ for the first concern, I think that leaves something like *=@^~ in which @ is particularly attractive to me (e.g., {$count @number}) but a reasonable case could be made for any of the others.
    • ...or alternatively (but very unlikely), the function sigil could be | as in Jinja and literal quoting would use something else (e.g., {~0.40~ |number style=percent})
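The starting list of candidate sigils in the second bullet appears to be ASCII punctuation minus the four punctuation characters that XML permits inside a Name (`-`, `.`, `:`, `_`). A minimal Python sketch of that set arithmetic follows; the variable names are illustrative, and the further narrowing to `*=@^~` is a judgment call about what might mislead developers that the code does not try to reproduce.

```python
import string

# ASCII punctuation that XML permits inside a Name (NameChar): "-", ".", ":" and "_".
XML_NAME_PUNCTUATION = set("-.:_")

# Punctuation that can never appear in an XML Name, in ASCII order.
sigil_candidates = "".join(
    ch for ch in string.punctuation if ch not in XML_NAME_PUNCTUATION
)

# Characters the comment names as already taken by other syntax:
# "}" closes an expression, "|" quotes literals, "$" marks variables.
ALREADY_USED = set("}|$")
remaining = "".join(ch for ch in sigil_candidates if ch not in ALREADY_USED)

print(sigil_candidates)
print(remaining)
```

Because `:` is itself an XML name character, it drops out in the first step, which is why aligning with XML would force a different function sigil.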

Related issues:

@stasm
Collaborator

stasm commented Nov 8, 2023

Thanks for filing this. Historically, I was very interested in aligning with XML's name and nmtoken, but in the more recent past, I've relaxed my stance.

In particular, I think there's more benefit in aligning nmtoken than there is in aligning name, because nmtoken can be found in unquoted variant keys and unquoted option values, both of which can be sourced from CLDR/LDML.

name, on the other hand, can most likely be whatever we make it. Of course, if there aren't good reasons to diverge from XML's Name, then staying aligned simply to avoid gratuitous divergence is a good idea.

@gibson042
Collaborator Author

> In particular, I think there's more benefit in aligning nmtoken than there is in aligning name, because nmtoken can be found in unquoted variant keys and unquoted option values, both of which can be sourced from CLDR/LDML.

I don't think there's much value in allowing variable/function names like 6 or -40.
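The difference between the two XML productions can be sketched as follows. This is an ASCII-only approximation (the real productions also admit many non-ASCII ranges), and the helper names `is_name` and `is_nmtoken` are hypothetical. Nmtoken is one or more name characters, while Name additionally requires a name-start character in first position.

```python
import re

# ASCII-only approximations of the XML 1.0 character classes (assumption:
# the non-ASCII ranges of the real productions are omitted for brevity).
NAME_START = r"[A-Za-z_:]"
NAME_CHAR = r"[A-Za-z_:0-9.\-]"

XML_NAME = re.compile(rf"{NAME_START}{NAME_CHAR}*")  # Name    ::= NameStartChar NameChar*
XML_NMTOKEN = re.compile(rf"{NAME_CHAR}+")           # Nmtoken ::= NameChar+


def is_name(text: str) -> bool:
    """True if the whole string matches the ASCII subset of XML Name."""
    return XML_NAME.fullmatch(text) is not None


def is_nmtoken(text: str) -> bool:
    """True if the whole string matches the ASCII subset of XML Nmtoken."""
    return XML_NMTOKEN.fullmatch(text) is not None


for sample in ("count", "6", "-40"):
    print(sample, is_name(sample), is_nmtoken(sample))
```

Under this sketch, `6` and `-40` are valid Nmtokens but not valid Names, which is the distinction the comment relies on.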

@aphillips added the syntax, blocker-candidate, and Agenda+ labels on Nov 10, 2023
@aphillips added the resolve-candidate label on Nov 30, 2023
@aphillips
Member

Is this being addressed in the new syntax changes (which align name with NCName)?

@gibson042
Collaborator Author

It's in a much better spot now, but I would say not actually resolved until the definition of unquoted is fully stabilized—hopefully as something like unquoted = 1*(name-char / …) that doesn't have a first-character constraint (precluding use of : as an expression sigil if e.g. "2023-11-30T22:15Z" is allowed) and doesn't allow -1 without +1 (precluding use of + as an expression sigil).
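A rough illustration of the two concerns, assuming an ASCII-only subset of name-char and a hypothetical extended character set that also admits `:` and `+` (neither set is taken from the final spec):

```python
import string

# ASCII subset of NCName characters (assumption: non-ASCII ranges omitted).
NAME_CHAR = set(string.ascii_letters + string.digits + "_-.")

# Hypothetical extension that also admits ":" and "+" in unquoted literals.
EXTENDED_CHAR = NAME_CHAR | {":", "+"}


def is_unquoted(text: str, allowed: set) -> bool:
    """1*(allowed-char): non-empty, with no first-character constraint."""
    return len(text) > 0 and all(ch in allowed for ch in text)


def sigil_is_ambiguous(sigil: str, allowed: set) -> bool:
    """A sigil that is also a literal character could begin an unquoted literal."""
    return sigil in allowed


for sample in ("2023-11-30T22:15Z", "-1", "+1"):
    print(sample, is_unquoted(sample, NAME_CHAR), is_unquoted(sample, EXTENDED_CHAR))
```

With only name characters, `-1` is accepted while `+1` and the timestamp are rejected. Admitting `:` and `+` accepts all three, but then neither character can double as an expression sigil.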

@aphillips removed the resolve-candidate label on Dec 3, 2023
@aphillips added the LDML45 label on Jan 8, 2024
@aphillips added the resolve-candidate label on Jan 21, 2024
@aphillips
Member

The changes @gibson042 mentions have been landed in the ABNF and spec. I think this is ready to close?

@gibson042
Collaborator Author

I agree; this seems done to me with the current ABNF:

unquoted = name / number-literal
; number-literal matches JSON number
; https://www.rfc-editor.org/rfc/rfc8259#section-6
number-literal = ["-"] (0 / ([1-9] *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT]

; identifier matches https://www.w3.org/TR/REC-xml-names/#NT-QName
; name matches https://www.w3.org/TR/REC-xml-names/#NT-NCName
identifier = [namespace ":"] name
namespace  = name
name       = name-start *name-char
name-start = ALPHA / …
name-char  = name-start / DIGIT / "-" / "."
           / %xB7 / %x300-36F / %x203F-2040
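As a rough check of what `unquoted` accepts under this grammar, here is a minimal Python sketch. The number pattern follows the JSON number grammar of RFC 8259 section 6. The name-start ranges are an assumption taken from the XML NCName production that the ABNF comment cites, not from the rule text above. `classify_unquoted` is a hypothetical helper name.

```python
import re

# JSON number (RFC 8259, section 6): optional minus, integer part without
# leading zeros, optional fraction, optional exponent.
NUMBER_LITERAL = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?")

# Assumption: NCName start characters as listed in the XML Namespaces spec.
_NAME_START = (
    "A-Za-z_"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# name-char = name-start / DIGIT / "-" / "." / %xB7 / %x300-36F / %x203F-2040
_NAME_CHAR = _NAME_START + "0-9.\\-" + "\u00B7\u0300-\u036F\u203F-\u2040"

NAME = re.compile(f"[{_NAME_START}][{_NAME_CHAR}]*")


def classify_unquoted(text: str):
    """Return 'name', 'number-literal', or None if text is not a valid unquoted."""
    if NAME.fullmatch(text):
        return "name"
    if NUMBER_LITERAL.fullmatch(text):
        return "number-literal"
    return None


for sample in ("count", "-40", "6", "+1", "1.5e+3", "a:b"):
    print(repr(sample), classify_unquoted(sample))
```

In this sketch `-40` and `6` are accepted only as number literals, never as names, while `+1` is rejected outright, so `+` and `:` stay free for use as sigils or namespace separators.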

@aphillips removed the blocker-candidate and Agenda+ labels on Jan 22, 2024

3 participants