Drop separate syntax constructs for markup #371

Merged
merged 5 commits into from
Apr 10, 2023
18 changes: 9 additions & 9 deletions spec/formatting.md
@@ -40,7 +40,7 @@ These are divided into the following categories:
```

```
- {Unknown {#placeholder#}}
+ {Unknown {#expression#}}
```

```
@@ -186,8 +186,8 @@ or contains some error which leads to further errors,
an implementation which does not emit all of the errors
should prioritise Syntax and Data Model errors over others.

- When an error occurs in the resolution of an Expression or Markup Option,
- the Expression or Markup in question is processed as if the option were not defined.
+ When an error occurs in the resolution of an Expression,
+ the Expression in question is processed as if the option were not defined.
This may allow for the fallback handling described below to be avoided,
though an error must still be emitted.
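
A minimal Python sketch of this rule, assuming a hypothetical `resolve_options` helper in which every option value is a zero-argument callable that may raise. Neither the helper nor this representation comes from the spec. An option that fails to resolve is dropped, so processing continues as if it were not defined, and the error is still recorded.

```python
from typing import Any, Callable, Dict, List, Tuple


def resolve_options(
    options: Dict[str, Callable[[], Any]]
) -> Tuple[Dict[str, Any], List[str]]:
    """Resolve each option; drop the ones that fail but keep their errors.

    Hypothetical helper: the callable-per-option shape is an assumption
    made for illustration only.
    """
    resolved: Dict[str, Any] = {}
    errors: List[str] = []
    for name, resolve in options.items():
        try:
            resolved[name] = resolve()
        except Exception as exc:
            # The option is treated as not defined, but the error is emitted.
            errors.append(f"{name}: {exc}")
    return resolved, errors
```

A caller would format the expression with `resolved` only, then report everything collected in `errors`.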

@@ -205,8 +205,8 @@ If a fallback string is not defined,
the U+FFFD REPLACEMENT CHARACTER `�` character is used,
resulting in the string `{�}`.

- When an error occurs in a Placeholder that is being formatted,
- the fallback string representation of the Placeholder
+ When an error occurs in an Expression that is being formatted,
+ the fallback string representation of the Expression
always starts with U+007B LEFT CURLY BRACKET `{`
and ends with U+007D RIGHT CURLY BRACKET `}`.
Between the brackets, the following contents are used:
@@ -222,15 +222,15 @@ Between the brackets, the following contents are used:

Example: `{$user}`

- - Expression with no Operand: U+003A COLON `:` followed by the Expression Name
+ - Standalone expression with no Operand: U+003A COLON `:` followed by the Expression Name

Example: `{:platform}`

- - Markup start: U+002B PLUS SIGN `+` followed by the MarkupStart Name
+ - Opening expression with no Operand: U+002B PLUS SIGN `+` followed by the Expression Name

Example: `{+tag}`

- - Markup end: U+002D HYPHEN-MINUS `-` followed by the MarkupEnd Name
+ - Closing expression with no Operand: U+002D HYPHEN-MINUS `-` followed by the Expression Name

Example: `{-tag}`
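
The fallback rules visible in this hunk can be sketched in Python as follows. The `Expression` dataclass and `fallback_string` function are hypothetical names introduced for illustration. The variable case is inferred from the `{$user}` example, and literal operands are left out because their rule is not shown in this hunk.

```python
from dataclasses import dataclass
from typing import Optional

REPLACEMENT_CHARACTER = "\uFFFD"  # U+FFFD, used when no fallback is defined


@dataclass
class Expression:
    """Hypothetical, simplified model of an expression (no literals, no options)."""
    variable: Optional[str] = None  # variable name without the leading "$"
    sigil: Optional[str] = None     # ":", "+" or "-"
    name: Optional[str] = None      # expression name following the sigil


def fallback_string(expr: Expression) -> str:
    """Build the brace-delimited fallback string for a failed expression."""
    if expr.variable is not None:
        inner = "$" + expr.variable
    elif expr.sigil in (":", "+", "-") and expr.name:
        inner = expr.sigil + expr.name
    else:
        inner = REPLACEMENT_CHARACTER
    # Option names and values are never part of the fallback string.
    return "{" + inner + "}"
```

For instance, `fallback_string(Expression(sigil="+", name="tag"))` yields `{+tag}`, matching the opening-expression example.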

@@ -243,7 +243,7 @@ Option names and values are not included in the fallback string representations.
When an error occurs in an Expression with a Variable Operand
and the Variable refers to a local variable Declaration,
the fallback string is formatted based on the Expression of the Declaration,
- rather than the Expression of the Placeholder.
+ rather than the Expression in the Selector or Pattern.
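
One way to read this rule, sketched in Python: before building the fallback string, swap the failing expression for the expression of the local declaration it refers to. The `fallback_source` function and the dict-based expression shape are assumptions for illustration. Only a single level of indirection is followed, since the text above does not say how chained declarations behave.

```python
from typing import Dict

# Hypothetical shape: an expression is a dict with optional
# "variable" and "function" keys, e.g. {"variable": "var"}.
Expr = Dict[str, str]


def fallback_source(expr: Expr, declarations: Dict[str, Expr]) -> Expr:
    """Pick the expression on which the fallback string should be based."""
    name = expr.get("variable")
    if name is not None and name in declarations:
        # The variable is a local declaration: use the declaration's expression.
        return declarations[name]
    # Otherwise the expression in the selector or pattern is used as-is.
    return expr
```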

For example, attempting to format either of the following messages within a context that
does not provide for the function `:func` to be successfully resolved:
18 changes: 5 additions & 13 deletions spec/message.abnf
@@ -1,21 +1,15 @@
message = [s] *(declaration [s]) body [s]

- declaration = let s variable [s] "=" [s] "{" [s] expression [s] "}"
+ declaration = let s variable [s] "=" [s] expression
body = pattern
/ (selectors 1*([s] variant))

- pattern = "{" *(text / placeholder) "}"
- selectors = match 1*([s] selector)
- selector = "{" [s] expression [s] "}"
+ pattern = "{" *(text / expression) "}"
+ selectors = match 1*([s] expression)
variant = when 1*(s key) [s] pattern
key = nmtoken / literal / "*"

- placeholder = "{" [s] expression [s] "}"
- / "{" [s] markup-start *(s option) [s] "}"
- / "{" [s] markup-end [s] "}"

- expression = ((literal / variable) [s annotation])
- / annotation
+ expression = "{" [s] (((literal / variable) [s annotation]) / annotation) [s] "}"
Member

For my action item I am working on ABNF, but we might as well talk about it here. I was going to suggest:

expression = "{" [s] (((literal / variable) [s annotation])
             / annotation) [s] "}"
annotation = (function / reserved) *(s option)
function = ":" name
reserved = (%x21-26 / %x2A / %x2B / %x2D / %x2F
           / %x3B-40 / %x5E / %x7E) name ; reserved sigil characters, specifically these: !"#$%&*+-/^~

I'm reluctant to embrace + and - being "function introducers". I would rather just have a single starter for the "thing we've been calling a function" and have another name for other types of function. If these are different constructs, they should parse to another production name. That can be accommodated by pushing it down a level:

annotation = named *(s option)
named = function / reserved
function = ":" name
reserved = [!@#$%^&*+-<>?] name

This would allow something to become unreserved by adding a named production:

annotation = named *(s option)
named = function / reserved
function = ":" name
fragment = "#" name
reserved = [!@$%^&*+-<>?] name ; deleted # from here

Collaborator Author

I suspect that ultimately anything that may have an argument and/or a bag of options could be represented by some sort of "function". The proposed shape of reserved encodes this as the only possible direction for the extension of the spec, so ultimately anything that would be later un-reserved could effectively be described as a "function introducer".

If we don't want to limit ourselves this way, then really the reserved construct ought to be defined on a higher level, allowing for more freedom (using this PR's syntax as a baseline):

expression = "{" [s] (((literal / variable) [s annotation]) / annotation / reserved) [s] "}"
reserved = reserved-start reserved-body
reserved-start = "!" / "@" / "#" / "$" / "%" / "^" / "&" / "*" / "<" / ">" / "?"
reserved-body = *(reserved-char / reserved-escape / ("{" reserved-body "}") / literal)
reserved-char = %x0-5B         ; omit \
              / %x5D-7A        ; omit { | }
              / %x7E-D7FF      ; omit surrogates
              / %xE000-10FFFF
reserved-escape = backslash ( backslash / "|" / "{" / "}" )

Now, independently of how we might proceed with reserved, I do think that "open" and "close" placeholders are sufficiently common that they definitely should have their own top-level introducers + and -. This leaves us with at least 11 more to use for other concepts, should that prove desirable later on.
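
As an illustration of the introducer sets discussed in this thread (not part of either comment), the Python sketch below models the 11 `reserved-start` characters proposed above alongside the three function sigils from this PR. The names `RESERVED_START`, `FUNCTION_START`, and `introducer_kind` are hypothetical.

```python
# Characters listed in the proposed reserved-start production.
RESERVED_START = frozenset("!@#$%^&*<>?")
# Sigils accepted by the `function` production in this PR.
FUNCTION_START = frozenset(":+-")


def introducer_kind(ch: str) -> str:
    """Classify the first character of an annotation-like construct."""
    if ch in FUNCTION_START:
        return "function"
    if ch in RESERVED_START:
        return "reserved"
    return "other"
```

Keeping the two sets disjoint is what lets `+` and `-` act as introducers while the remaining characters stay available for later use.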

annotation = function *(s option)
option = name [s] "=" [s] (literal / nmtoken / variable)

@@ -38,9 +32,7 @@ literal-char = %x0-5B ; omit \
/ %xE000-10FFFF

variable = "$" name
- function = ":" name
- markup-start = "+" name
- markup-end = "-" name
+ function = (":" | "+" | "-") name

name = name-start *name-char ; matches XML https://www.w3.org/TR/xml/#NT-Name
nmtoken = 1*name-char ; matches XML https://www.w3.org/TR/xml/#NT-Nmtokens
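
To make the unified `expression` production concrete, here is a rough Python recognizer for a small subset of it: an optional variable operand, plus an optional annotation with no options. It is a sketch under stated assumptions, not a conforming parser. Literals and options are not handled, `name` is narrowed to ASCII rather than the XML `Name` production, and `\s` stands in for `s`. The `classify` function is a hypothetical name, and the labels "standalone", "opening", and "closing" follow the wording used in `spec/formatting.md`.

```python
import re
from typing import Optional, Tuple

# Simplified ASCII-only stand-in for the grammar's `name` production.
_NAME = r"[A-Za-z_][A-Za-z0-9_.-]*"

_EXPRESSION = re.compile(
    r"\{\s*(?:"
    # variable operand, optionally followed by whitespace and an annotation
    rf"\$(?P<var>{_NAME})(?:\s+(?P<sigil_a>[:+-])(?P<name_a>{_NAME}))?"
    # or an annotation on its own
    rf"|(?P<sigil_b>[:+-])(?P<name_b>{_NAME})"
    r")\s*\}"
)

_KIND = {":": "standalone", "+": "opening", "-": "closing"}


def classify(source: str) -> Optional[Tuple[Optional[str], Optional[str], Optional[str]]]:
    """Return (variable, kind, function name), or None if the input does not match."""
    match = _EXPRESSION.fullmatch(source)
    if match is None:
        return None
    sigil = match.group("sigil_a") or match.group("sigil_b")
    name = match.group("name_a") or match.group("name_b")
    kind = _KIND[sigil] if sigil else None
    return (match.group("var"), kind, name)
```

Because all three sigils go through one production, the caller tells markup-like expressions apart only by the returned kind, which is the simplification this PR aims for.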