Propose reserving additional sigils for future use #374

Merged
merged 13 commits into from
Apr 24, 2023
24 changes: 19 additions & 5 deletions spec/message.abnf
@@ -10,7 +10,8 @@ variant = when 1*(s key) [s] pattern
key = nmtoken / literal / "*"

expression = "{" [s] (((literal / variable) [s annotation]) / annotation) [s] "}"
annotation = function *(s option)
annotation = (function *(s option)) / reserved

option = name [s] "=" [s] (literal / nmtoken / variable)

; reserved keywords are always lowercase
@@ -25,7 +26,7 @@ text-char = %x0-5B ; omit \
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

literal = "|" *(literal-char / literal-escape) "|"
literal = "|" *(literal-char / literal-escape) "|"
literal-char = %x0-5B ; omit \
/ %x5D-7B ; omit |
/ %x7D-D7FF ; omit surrogates
@@ -34,6 +35,18 @@ literal-char = %x0-5B ; omit \
variable = "$" name
function = (":" / "+" / "-") name

; reserve additional sigils for future use
reserved = reserved-start reserved-body
reserved-start = "!" / "@" / "#" / "%" / "^" / "&" / "*" / "<" / ">" / "?" / "~"
reserved-body = *( [s] 1*(reserved-char / reserved-escape / literal))
reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-1F ; omit SP
/ %x21-5B ; omit \
/ %x5D-7A ; omit { | }
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

name = name-start *name-char ; matches XML https://www.w3.org/TR/xml/#NT-Name
nmtoken = 1*name-char ; matches XML https://www.w3.org/TR/xml/#NT-Nmtokens
name-start = ALPHA / "_"
@@ -44,8 +57,9 @@ name-start = ALPHA / "_"
name-char = name-start / DIGIT / "-" / "." / %xB7
/ %x0300-036F / %x203F-2040

text-escape = backslash ( backslash / "{" / "}" )
literal-escape = backslash ( backslash / "|" )
backslash = %x5C ; U+005C REVERSE SOLIDUS "\"
text-escape = backslash ( backslash / "{" / "}" )
literal-escape = backslash ( backslash / "|" )
reserved-escape = backslash ( backslash / "{" / "|" / "}" )
backslash = %x5C ; U+005C REVERSE SOLIDUS "\"

s = 1*( SP / HTAB / CR / LF )
34 changes: 20 additions & 14 deletions spec/syntax.md
@@ -7,7 +7,7 @@
1. [Design Restrictions](#design-restrictions)
1. [Overview & Examples](#overview--examples)
1. [Messages](#messages)
1. [Expressions](#expressions)
1. [Expressions](#expression)
1. [Formatting Functions](#formatting-functions)
1. [Selection](#selection)
1. [Local Variables](#local-variables)
@@ -19,6 +19,7 @@
1. [Variants](#variants)
1. [Patterns](#patterns)
1. [Expressions](#expressions)
1. [Reserved Sequences](#reserved)
1. [Tokens](#tokens)
1. [Keywords](#keywords)
1. [Text and Literals](#text-and-literals)
@@ -108,7 +109,7 @@ let hello = new MessageFormat('{Hello, world!}')
hello.format()
```

### Expressions
### Expression

An _expression_ represents a part of a message that will be determined
during the message's formatting.
@@ -317,7 +318,7 @@ A _well-formed_ message is considered _valid_ if the following requirements are
### Patterns

A **_pattern_** is a sequence of translatable elements.
Patterns MUST BE delimited with `{` at the start, and `}` at the end.
Patterns MUST be delimited with `{` at the start, and `}` at the end.
This serves 3 purposes:

- The message can be unambiguously embeddable in various container formats
@@ -345,20 +346,17 @@ Whitespace within a _pattern_ is meaningful and MUST be preserved.

### Expressions

**_Expressions_** can either start with an operand or a function call.
_Expressions_ ***must*** start with a _literal_, a _variable_, or an _annotation_. An _expression_ ***must not*** be empty.

A _literal_ or _variable_ ***may*** be followed by an _annotation_.

The operand is a literal or a variable name.
The operand can be optionally followed by an _annotation_:
a function and its named options.
Functions do not accept any positional arguments
other than the operand in front of them.
An _annotation_ consists of either a _function_ and its named _options_, or a _reserved_ sequence.

Function calls do not require an operand as an argument,
but an _expression_ must not be completely empty.
_Functions_ do not accept any positional arguments other than the _literal_ or _variable_ in front of them.

```abnf
expression = "{" [s] (((literal / variable) [s annotation]) / annotation) [s] "}"
annotation = function *(s option)
annotation = (function *(s option)) / reserved
option = name [s] "=" [s] (literal / nmtoken / variable)
```
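
As a rough illustration of the grammar above, the kind of an _annotation_ can be decided from its first character alone. The sketch below is written in Python and is not part of the specification: `classify_annotation` is a hypothetical helper, and the two sigil sets are copied from the `function` and `reserved-start` rules in `spec/message.abnf` as changed here.

```python
# Sigils taken from the `function` rule in spec/message.abnf.
FUNCTION_SIGILS = {":", "+", "-"}
# Sigils taken from the `reserved-start` rule in spec/message.abnf.
RESERVED_SIGILS = set("!@#%^&*<>?~")


def classify_annotation(annotation: str) -> str:
    """Return 'function', 'reserved', or 'invalid' from the leading sigil.

    Hypothetical helper: it inspects only the first character and does not
    validate the rest of the annotation.
    """
    if not annotation:
        return "invalid"
    first = annotation[0]
    if first in FUNCTION_SIGILS:
        return "function"
    if first in RESERVED_SIGILS:
        return "reserved"
    return "invalid"
```

Under these assumptions, `classify_annotation(":number")` yields `"function"` and `classify_annotation("@later")` yields `"reserved"`.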

@@ -398,6 +396,13 @@ Message examples:
{{+h1 name=above-and-beyond}Above And Beyond{-h1}}
```

#### Reserved

_Reserved_ sequences start with a reserved character and are intended for future standardization.
A reserved sequence can be empty or contain arbitrary text.
A reserved sequence does not include any trailing whitespace.
While a reserved sequence is technically "well-formed", unrecognized reserved sequences have no meaning and might result in errors during formatting.
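
To make the shape of a reserved sequence concrete, here is a minimal Python sketch of a matcher built from the `reserved`, `reserved-body`, `reserved-escape`, and `literal` rules in `spec/message.abnf`. It is an illustration under stated assumptions, not a reference implementation: `match_reserved` is a hypothetical name, surrogate code points are not excluded, and `reserved-char` is read as "any character other than whitespace, `\`, `{`, `|`, and `}`", which is the intent stated in the grammar's comments.

```python
import re
from typing import Optional

WS = r"[\t\n\r ]+"                        # s = 1*( SP / HTAB / CR / LF )
RESERVED_START = r"[!@#%^&*<>?~]"         # reserved-start
RESERVED_CHAR = r"[^\t\n\r \\{|}]"        # simplified: surrogates not excluded
RESERVED_ESCAPE = r"\\[\\{|}]"            # backslash ( backslash / "{" / "|" / "}" )
LITERAL = r"\|(?:[^\\|]|\\[\\|])*\|"      # "|" *(literal-char / literal-escape) "|"

# reserved-body = *( [s] 1*(item) ) accepts the same strings as *( [s] item ),
# so the inner repetition is flattened to avoid nested quantifiers.
RESERVED = re.compile(
    rf"{RESERVED_START}(?:(?:{WS})?(?:{RESERVED_CHAR}|{RESERVED_ESCAPE}|{LITERAL}))*"
)


def match_reserved(source: str, start: int = 0) -> Optional[str]:
    """Return the reserved sequence beginning at `start`, or None if absent."""
    m = RESERVED.match(source, start)
    return m.group(0) if m else None
```

Because whitespace is only consumed when another item follows it, the match never includes trailing whitespace: for the input `@foo bar }` the sketch returns `@foo bar`, and for `!` alone it returns `!` (an empty body).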

## Tokens

The grammar defines the following tokens for the purpose of the lexical analysis.
@@ -478,12 +483,12 @@ name-char = name-start / DIGIT / "-" / "." / %xB7

### Escape Sequences

Escape sequences are introduced by the backslash character (`\`).
They are allowed in translatable text as well as in literals.
Escape sequences are introduced by the backslash character (`\`). They allow lexically meaningful characters to appear in the body of `text`, `literal`, or `reserved` sequences, respectively:

```abnf
text-escape = backslash ( backslash / "{" / "}" )
literal-escape = backslash ( backslash / "|" )
reserved-escape = backslash ( backslash / "{" / "|" / "}" )
backslash = %x5C ; U+005C REVERSE SOLIDUS "\"
```
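
The three escape rules differ only in which characters may follow the backslash. The following Python sketch shows one way a parser might resolve them; `unescape` and the context names are hypothetical, and raising an error on an unsupported escape is an assumption of this sketch rather than behavior stated in the text above.

```python
# Characters that may follow a backslash in each context, per the ABNF above.
ESCAPABLE = {
    "text": set("\\{}"),       # text-escape
    "literal": set("\\|"),     # literal-escape
    "reserved": set("\\{|}"),  # reserved-escape
}


def unescape(body: str, context: str) -> str:
    """Resolve escape sequences in `body` for the given context.

    Assumption: a backslash followed by anything outside the allowed set
    (or by nothing) is reported as an error.
    """
    allowed = ESCAPABLE[context]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body) or body[i + 1] not in allowed:
                raise ValueError(f"invalid escape at index {i} in {context}")
            out.append(body[i + 1])  # keep the escaped character itself
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this sketch, `\|` resolves to `|` in a `literal` or `reserved` context but is rejected in `text`, where only `\\`, `\{`, and `\}` are defined.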

@@ -495,6 +500,7 @@ Inside _patterns_,
whitespace is part of the translatable content and is recorded and stored verbatim.
Whitespace is not significant outside translatable text, except where required by the syntax.


```abnf
s = 1*( SP / HTAB / CR / LF )
```