Skip to content

Styling and structure changes to syntax.md #419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
Jul 24, 2023

Conversation

aphillips
Copy link
Member

This includes fixing the style of definitions and their references.

In the course of doing this change, I made some non-editorial changes:

  • Introduce terms and definitions to clarify selectors, match statement, and selector and concepts. Our text was difficult to read because there was no way to refer to the match clearly (because the production is called selectors) or to refer to a specific selector (or just the selector expressions minus the keyword match). Hopefully this is a step in the right direction. This transformation is incomplete, since there are several places that refer to match rather than match statement, etc.

  • I split private-use from reserved, although that really should be a separate PR.

  • I modified the structure and description of the quoted and unquoted literals to make their relationship clearer.

  • I reorganized and made normative text in several places. I believe that the normative changes are not a material change: they are just applying the SHOUTY KEYWORDS appropriately or making explicit what is only implied by the ABNF.

  • I added quoted to the list of places where whitespace is meaningful. I think this was previously an oversight.

I expect tons of comments/hate mail ;-)

This includes fixing the style of **_definitions_** and their _references_.

In the course of doing this change, I made some non-editorial changes:

* Introduce terms and definitions to clarify _selectors_, _match statement_, and _selector_ and concepts. Our text was difficult to read because there was no way to refer to the `match` clearly (because the production is called `selectors`) or to refer to a specific _selector_ (or just the selector expressions minus the keyword `match`). Hopefully this is a step in the right direction.

* I split `private-use` from `reserved`, although that really should be a separate PR.

* I reorganized and made normative text in several places. I believe that the normative changes are not a material change: they are just applying the SHOUTY KEYWORDS appropriately or making explicit what is only implied by the ABNF.

I expect tons of comments ;-)
@aphillips aphillips requested review from stasm and eemeli July 12, 2023 00:21
@aphillips aphillips requested review from echeran and mihnita July 12, 2023 00:25
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave out the private-use changes from this PR.

aphillips and others added 7 commits July 12, 2023 07:21
Co-authored-by: Eemeli Aro <[email protected]>
Co-authored-by: Eemeli Aro <[email protected]>
Co-authored-by: Eemeli Aro <[email protected]>
Co-authored-by: Eemeli Aro <[email protected]>
Co-authored-by: Eemeli Aro <[email protected]>
Co-authored-by: Eemeli Aro <[email protected]>
- fix the `operand` definition
- remove private-use text for a separate PR
@aphillips aphillips requested a review from eemeli July 12, 2023 15:18
Co-authored-by: Eemeli Aro <[email protected]>
spec/syntax.md Outdated

The same message defined in a `.properties` file:
All _messages_, including simple ones, begin with U+007B LEFT CURLY BRACKET `{`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this accurate though?

A message can contain declarations, that start with let $foo = {...},
or can contain a with a selector, starting with match.
In these cases the message does not start with {

The text below in the document seems to also consider these elements (declarations & selector) as part of the message:

A message containing a declaration defining a local variable $whom which is then used twice inside the pattern:

let $whom = {$monster :noun case=accusative}
...

and

A message can have multiple selectors.
A message with a single selector:

match {$count :number}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct: we start in code mode. Note that this is not text that I originated in this PR, so we've had a spec bug for a while.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See if 7bd78fa fixes it.

@aphillips
Copy link
Member Author

I observe that there is a lot of repetition between the sections in syntax.md, which leads to forward definitions and the temptation to do stuff in two places.


{Hello, {$userObj :person firstName=long}!}
A **_standalone_** _function_ is not expected to be paired with another _function_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standalone, open, close are properties of the expression, not the function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expression or placeholder? That is, does it have meaning to say that this is an "open expression":

let $foo = {$bar +open}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... also we should say that standalone is the default for expressions such as {$foo} that have only an operand.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expression or placeholder? That is, does it have meaning to say that this is an "open expression":

let $foo = {$bar +open}

We have not explicitly discussed this (yet!), but yes, that's what we have right now.

... also we should say that standalone is the default for expressions such as {$foo} that have only an operand.

I'm pretty sure that such a default is not currently defined, and that this is probably best considered within the context of discussing our function registry and our implementations' behaviour assigning function handlers to expressions that do not include an annotattion.

This PR should not make any such statements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eemeli I agree that the PR shouldn't probably do that, but we should pay attention to these kinds of questions.

In particular, the current state of our spec is that these are all the same and refer to a discrete function open:

let $foo = {$bar :open}
let $foo = {$bar +open}
let $foo = {$bar -open}

Obviously, though, +open and -open need to have a different syntactical "meaning" if the open and close expressions are to do different things (such as write open vs. close tags). The interaction with the function registry is interesting here.

I will create a tracking issue.

Comment on lines +174 to +175
An **_opening element_** is a _function_ that SHOULD be paired with a _closing function_.
A **_closing element_** is a _function_ that SHOULD be paired with an _opening function_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're including this in the spec, should these SHOULD statements also mention the expected order of the expressions, i.e. opening before closing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, but we also allow unpaired appearance of these, so {{-foo}something{+foo}} might just be a fragment where the missing {+foo} and {-foo} are external to the message (externality is the argument for allowing unpaired open/close functions).

There is nothing in the syntax that actually causes pairing to happen--and currently there is nothing in the registry that would lead to pairing either. Increasingly I think that the open/close syntax is either (a) merely a naming convention that carries no meaning or (b) woefully underspecified.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine to consider changes to this separately from this PR.

@aphillips aphillips requested review from eemeli and mihnita July 13, 2023 16:04
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few remaining line comments left from me; see also the previous unresolved ones.

spec/syntax.md Outdated
@@ -412,10 +477,11 @@ option = name [s] "=" [s] (literal / variable)

#### Reserved

**_Reserved_** annotations start with a reserved character
and are intended for future standardization
as well as private implementation use.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should not be dropped by this PR.

aphillips and others added 2 commits July 16, 2023 09:15
Copy link
Collaborator

@catamorphism catamorphism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of my comments are stylistic and it's up to your judgment whether to incorporate them. I explicitly marked the comments where I think there might be an error in the text with "error?"


All messages, including simple ones, are enclosed in `{…}` delimiters:
All _messages_ MUST contain a _body_.
The _body_ of a _message_ consists of either a _pattern_ or of _selectors_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would read better to me as "either a pattern or a selectors"; but then "a selectors" sounds weird. I think it would read best if there was a name for the thing referred to here as selectors that's not a plural noun.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment is exactly my proposal to change the ABNF (see the discussion of the refactoring of this document in today's call)! The current syntax doesn't support talking about a selector cleanly.

The same message defined in a `.properties` file:
A **_pattern_** is a combination of _text_ and _placeholders_
to be formatted as a unit.
All _patterns_, including simple ones, begin with U+007B LEFT CURLY BRACKET `{`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what "simple" means here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the original text, modified slightly. Basically a "simple" pattern has no placeholders:

{Hello, world}


### Expression

An _expression_ represents a part of a message that will be determined
during the message's formatting.
An **_expression_** is a part of a _message_ that will be determined
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Determined" seems really vague, though I'm not sure what other word is better (I've objected to "resolved" before because it's also vague.)

An _expression_ MUST NOT be empty.

An **_operand_** is either a _literal_ or a _variable_.
An **_operand_** is a _literal_ or a _variable_ to be evaluated in an _expression_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
An **_operand_** is a _literal_ or a _variable_ to be evaluated in an _expression_.
An **_operand_** MUST appear in an _expression_. An _operand_ is either a _literal_ or a _variable_.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your proposed change loses the "definition form" that this PR is adding, that is, that a definition say what something is before imposing anything on it.

An operand is whatever it is.

It can then go on to say that it MUST (only) appear in an expression (note that your phrase is misleading, since an expression is not required to have an operand)

Comment on lines 436 to 437
_Functions_ do not accept any positional arguments
other than the _operand_ in front of them.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_Functions_ do not accept any positional arguments
other than the _operand_ in front of them.
A _function_ has at most one positional argument
(its _operand_, if present).

Comment on lines +570 to +571
A **_literal_** is a character sequence that appears outside
of _text_ in various parts of a _message_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This phrasing seems backwards, since text can only appear at the top level of a pattern and _literal_s can appear in many places, but I'm not sure how to rephrase it.

Comment on lines +583 to +584
An **_unquoted_** literal is a _literal_ that does not require the `|`
quotes around it to be distinct from the rest of the _message_ syntax.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this language makes sense to someone writing a parser, but not to a user necessarily. Perhaps something like "a literal that does not include any of the following characters..." and then refer to where the list of forbidden characters is defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree, but... an unquoted literal is not just a literal that doesn't include specific characters. What marks it as unquoted is that it is missing its quotes! That is, |42| is a quoted literal that doesn't need to be quoted.

@aphillips aphillips merged commit 9d1c027 into main Jul 24, 2023
@aphillips aphillips deleted the aphillips-style-update-syntax branch July 24, 2023 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants