Allow colon in name-start, matching XML Name #483

Closed
14 changes: 8 additions & 6 deletions spec/message.abnf
@@ -39,8 +39,11 @@ quoted-char = %x0-5B ; omit \
; based on https://www.w3.org/TR/xml/#NT-Nmtoken,
; but cannot start with U+002D HYPHEN-MINUS or U+003A COLON ":"
unquoted = unquoted-start *name-char
-unquoted-start = name-start / DIGIT / "."
-/ %xB7 / %x300-36F / %x203F-2040
+unquoted-start = ALPHA / DIGIT / "." / "_"
+/ %xB7 / %xC0-D6 / %xD8-F6 / %xF8-37D
+/ %x37F-1FFF / %x200C-200D / %x203F-2040
+/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
+/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF


; reserve sigils for private-use by implementations
@@ -61,15 +64,14 @@ reserved-char = %x00-08 ; omit HTAB and LF
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

-; based on https://www.w3.org/TR/xml/#NT-Name,
-; but cannot start with U+003A COLON ":"
+; matches https://www.w3.org/TR/xml/#NT-Name
name = name-start *name-char
-name-start = ALPHA / "_"
+name-start = ALPHA / ":" / "_"
/ %xC0-D6 / %xD8-F6 / %xF8-2FF
/ %x370-37D / %x37F-1FFF / %x200C-200D
/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF
-name-char = name-start / DIGIT / "-" / "." / ":"
+name-char = name-start / DIGIT / "-" / "."
/ %xB7 / %x300-36F / %x203F-2040

text-escape = backslash ( backslash / "{" / "}" )
19 changes: 11 additions & 8 deletions spec/syntax.md
@@ -619,31 +619,34 @@ quoted-char = %x0-5B ; omit \
/ %xE000-10FFFF

unquoted = unquoted-start *name-char
-unquoted-start = name-start / DIGIT / "."
-/ %xB7 / %x300-36F / %x203F-2040
+unquoted-start = ALPHA / DIGIT / "." / "_"
+/ %xB7 / %xC0-D6 / %xD8-F6 / %xF8-37D
+/ %x37F-1FFF / %x200C-200D / %x203F-2040
+/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
+/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF
```

### Names

A **_<dfn>name</dfn>_** is an identifier for a _variable_ (prefixed with `$`),
for a _function_ (prefixed with `:`, `+` or `-`),
or for an _option_ (these have no prefix).
-The namespace for _names_ is based on XML's [Name](https://www.w3.org/TR/xml/#NT-Name),
-with the restriction that it MUST NOT start with `:`,
-as that would conflict with the _function_ start character.
-Otherwise, the set of characters allowed in names is large.
+The namespace for _name_ matches XML's [Name](https://www.w3.org/TR/xml/#NT-Name).
+
+As `:` is also used as the start sigil of _function_,
+using a _name_ with it as a first character is NOT RECOMMENDED.
Comment on lines +634 to +637
Member

I disagree with this. We have namespacing in another PR just now, and there's a reasonable solution: instead of XML Name, use NCName from Namespaces in XML as the basis. The definition of NCName is exactly "Name minus the `:` character".

Collaborator

Why should there be such an unnecessarily complex overlap between variable/function/option names and unquoted literals? If the function sigil were just replaced with something that does not appear in XML NameChar then the names could be exactly described by XML Name and the unquoted literals by either 1*name-char (i.e., XML Nmtoken) or by 1*(name-char / "+" / …), in either case forming a strict superset rather than a near-superset to exclude artificially-induced ambiguity w.r.t. colons.
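
To make the alternatives raised in these two comments concrete, here is a minimal Python sketch comparing three character-level definitions: XML `Name` (what this PR proposes for _name_), `NCName` from Namespaces in XML (`Name` with no `:` anywhere), and XML `Nmtoken` (one or more name characters). The character classes are transcribed from the ranges in this diff. The function names are made up for illustration and do not come from the spec or from any implementation.

```python
import re

# NameStartChar without ":" (ALPHA, "_", and the non-ASCII ranges from the diff).
_START_NO_COLON = (
    "A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F"
    "\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# What NameChar adds on top of NameStartChar: DIGIT, ".", U+00B7,
# U+0300-036F, U+203F-2040, and "-" (escaped, kept last in the class).
_EXTRA = "0-9.\u00B7\u0300-\u036F\u203F-\u2040\\-"

# XML Name: ":" is allowed anywhere, including the first position.
NAME = re.compile(f"[{_START_NO_COLON}:][{_START_NO_COLON}:{_EXTRA}]*")
# NCName: the same as Name, but with ":" removed entirely.
NCNAME = re.compile(f"[{_START_NO_COLON}][{_START_NO_COLON}{_EXTRA}]*")
# Nmtoken: one or more name characters, with no restriction on the first one.
NMTOKEN = re.compile(f"[{_START_NO_COLON}:{_EXTRA}]+")


def is_name(s: str) -> bool:
    return NAME.fullmatch(s) is not None


def is_ncname(s: str) -> bool:
    return NCNAME.fullmatch(s) is not None


def is_nmtoken(s: str) -> bool:
    return NMTOKEN.fullmatch(s) is not None
```

Under these definitions `ns:opt` and `:opt` are both a `Name` but neither is an `NCName`, while `-1.5` and `42` are an `Nmtoken` only. Every string accepted by `is_name` is also accepted by `is_nmtoken`, which is the strict-superset relation the second comment describes.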


```abnf
variable = "$" name
function = (":" / "+" / "-") name

name = name-start *name-char
-name-start = ALPHA / "_"
+name-start = ALPHA / ":" / "_"
/ %xC0-D6 / %xD8-F6 / %xF8-2FF
/ %x370-37D / %x37F-1FFF / %x200C-200D
/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF
-name-char = name-start / DIGIT / "-" / "." / ":"
+name-char = name-start / DIGIT / "-" / "."
/ %xB7 / %x300-36F / %x203F-2040
```
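
As a rough illustration of what the proposed rules accept, the Python sketch below transcribes `name-start` and `name-char` from this diff into code point ranges and splits a `variable` or `function` token into its sigil and name. The helpers `is_name` and `split_sigil` are hypothetical and are not part of the spec or any implementation. The sketch only aims to show that, once `:` is a valid first character, a token such as `::foo` is a function whose name is `:foo`.

```python
# Code point ranges transcribed from the name-start rule proposed in this diff.
NAME_START_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),      # ALPHA
    (0x3A, 0x3A), (0x5F, 0x5F),      # ":" and "_"
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Code points that name-char allows in addition to name-start.
NAME_CHAR_EXTRA_RANGES = [
    (0x30, 0x39),                    # DIGIT
    (0x2D, 0x2D), (0x2E, 0x2E),      # "-" and "."
    (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start(ch):
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start(ch) or _in_ranges(ch, NAME_CHAR_EXTRA_RANGES)


def is_name(s):
    """name = name-start *name-char"""
    return bool(s) and is_name_start(s[0]) and all(is_name_char(c) for c in s[1:])


def split_sigil(token):
    """Split a variable or function token into (kind, name), or return None."""
    kinds = {"$": "variable", ":": "function", "+": "function", "-": "function"}
    if token and token[0] in kinds and is_name(token[1:]):
        return kinds[token[0]], token[1:]
    return None
```

With these definitions, `split_sigil(":foo")` gives `("function", "foo")` and `split_sigil("::foo")` gives `("function", ":foo")`, which is the overlap between the function sigil and the first name character that the NOT RECOMMENDED wording above is meant to discourage.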
