From 6a9a2ebaf34e852e856a56aa498ef57b5162946e Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Wed, 13 Sep 2023 05:32:21 +0200 Subject: [PATCH 01/24] Draft message parse mode (code vs text) design doc --- exploration/0474-text-vs-code.md | 201 +++++++++++++++++++++++++++++++ 1 file changed, 201 insertions(+) create mode 100644 exploration/0474-text-vs-code.md diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md new file mode 100644 index 0000000000..acad2dde74 --- /dev/null +++ b/exploration/0474-text-vs-code.md @@ -0,0 +1,201 @@ +# Message Parse Mode + +Status: **Proposed** + +
+ Metadata
+ Contributors: @eemeli
+ First proposed: 2023-09-13
+ Pull Request: #474
+ +## Objective + +Decide whether text patterns or code statements should be enclosed in MF2. + +## Background + +Existing message and template formatting languages tend to start in "text" mode, +and require special syntax like `{{` or `{%` to enter "code" mode. + +ICU MessageFormat and Fluent both support inline selectors +separated from the text using `{ ... }` for multi-variant messages. + +[Mustache templates](https://mustache.github.io/mustache.5.html) +and related languages wrap "code" in `{{ ... }}`. +In addition to placeholders that are replaced by their interpolated value during formatting, +this also includes conditional blocks using `{{#...}}`/`{{/...}}` wrappers. + +[Handlebars](https://handlebarsjs.com/guide/) extends Mustache expressions +with operators such as `{{#if ...}}` and `{{#each ...}}`, +as well as custom formatting functions that become available as e.g. `{{bold ...}}`. + +[Jinja templates](https://jinja.palletsprojects.com/en/3.1.x/templates/) separate +`{% statements %}` and `{{ expressions }}` from the base text. +The former may define tests that determine the inclusion of subsequent text blocks in the output. + +A cost that the message formatting and templating languages mentioned above need to rely on +is some rule or behaviour that governs how to deal with whitespace at the beginning and end of a pattern, +as statements may be separated from each other by newlines or other constructs for legibility. + +Other formats supporting multiple message variants tend to rely on a surrounding resource format to define variants, +such as [Rails internationalization](https://guides.rubyonrails.org/i18n.html#pluralization) in Ruby or YAML +and [Android String Resources](https://developer.android.com/guide/topics/resources/string-resource.html#Plurals) in XML. +These formats rely on the resource format providing clear delineation of the beginning and end of a pattern. 
+ +## Use-Cases + +Most messages in any localization system do not contain any expressions, statements or variants. +These should be expressible as easily as possible. + +Many messages include expressions that are to be interpolated during formatting. +For example, a greeting like "Hello, user!" may be formatted in many locales with the `user` +being directly set by an input variable. + +Sometimes, interpolated values need explicit formatting within a message. +For example, formatting a message like "You have eaten 3.2 apples" +may require the input numerical value +to be formatted with an explicit `minimumFractionDigits` option. + +Some messages require multiple variants. +This is often related to plural cases, such as "You have 3 new messages", +where the value `3` is an input and the "messages" needs to correspond with its plural category. + +Rarely, messages need to include leading or trailing whitespace due to +e.g. how they will be concatenated with other text, +or as a result of being segmented from some larger volume of text. + +## Requirements + +Easy things should be easy, and hard things should be possible. + +Developers and translators should be able to read and write the syntax easily in a text editor. + +As MessageFormat 2 will be at best a secondary language to all its users, +it should conform to user expectations and require as little learning as possible. + +The syntax should avoid footguns, +in particular as it's passed through various tools during formatting. + +## Constraints + +Limiting the range of characters that need to be escaped in plain text is important. +Following past precedent, +this design doc will only consider encapsulation styles which +start with `{` and end with `}`. + +The current syntax includes some plain-ASCII keywords: +`input`, `local`, `match`, and `when`. + +The current syntax and active proposals include some sigil + name combinations, +such as `:number`, `$var`, `|literal|`, `+bold`, and `@attr`.
+ +The current syntax supports unquoted literal values as operands. + +## Proposed Design + +TBD + +## Alternatives Considered + +### Start in code, encapsulate text + +This approach treats messages as something like a resource format for pattern values. +Keywords are declared directly at the top level of a message, +and patterns are surrounded by `{...}`. + +Whitespace in patterns is never trimmed. + +Some code statements (variable declarations and match statements) +also use `{...}` to surround values at the top level, +so counting `{` instances is not sufficient to identify if a value is "code" or "text". + +The `{...}` are required for all messages, +including ones that only consist of text. +Delimiters of the resource format are required in addition to this, +so messages may appear wrapped as e.g. `"{...}"`. + +Examples: + +``` +{Hello world} +``` + +``` +{Hello {$user}} +``` + +``` +input {$count :number minimumFractionDigits=1} +{You have eaten {$count} apples} +``` + +``` +input {$count :number} +match {$count} +when 0 {You have no new message} +when one {You have {$count} new message} +when * {You have {$count} new messages} +``` + +``` +{ and some more} +``` + +### Start in text, encapsulate code + +The approach treats messages as template strings, +which may include statements and expressions surrounded by `{...}`. +Multi-variant messages require `match` and `when` statements that are followed by text at the top level. + +Whitespace around statements may need to be trimmed +as e.g. `input` statements may be more readable when placed on a separate line, +where they would be followed by a newline. +At least the following trimming strategies may be considered: + +1. Do not trim any whitespace. +1. Trim a minimal set of defined spaces: + - All spaces before and between variable statements. + - For single-variant messages, one newline after the last variable statement. 
+ - For multivariant messages, + one space after a `when` statement and + one newline followed by any spaces before a subsequent `when` statement. +1. Trim all leading and trailing whitespace. + +All "code" statements are surrounded by `{...}`, +and all "text" is outside them. + +Simple messages are not surrounded by any delimiters +other than what may be required by the resource format. + +Examples using either "minimal" or "all" trimming: + +``` +Hello world +``` + +``` +Hello {$user} +``` + +``` +{input $count :number minimumFractionDigits=1} +You have eaten {$count} apples +``` + +``` +{input $count :number} +{match {$count}} +{when 0} You have no new message +{when one} You have {$count} new message +{when *} You have {$count} new messages +``` + +``` +{| |}and some more +``` From 2109334f5aa49087eceedd11d228c43f1108d84a Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Wed, 13 Sep 2023 05:51:03 +0200 Subject: [PATCH 02/24] Note potential conflict with unquoted string literals --- exploration/0474-text-vs-code.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index acad2dde74..ad7c413713 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -173,6 +173,9 @@ and all "text" is outside them. Simple messages are not surrounded by any delimiters other than what may be required by the resource format. +Depending on the details of the syntax of code inside the `{...}`, +unquoted non-numeric literals may need to be removed from the syntax.
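As a rough illustration of the three trimming strategies listed for the start-in-text alternative, the Python sketch below applies each one to the text that follows the last declaration of a single-variant message. The function name `trim_after_declarations` and the strategy labels `"none"`, `"minimal"` and `"all"` are hypothetical and not part of any MF2 implementation. The "minimal" case is simplified to dropping a single newline and does not cover the `when` rules.

```python
def trim_after_declarations(pattern: str, strategy: str) -> str:
    """Trim the text following the last declaration of a single-variant message.

    Hypothetical helper for illustration only; not an MF2 API.
    """
    if strategy == "none":
        # Strategy 1: keep every whitespace character as written.
        return pattern
    if strategy == "minimal":
        # Strategy 2 (simplified): drop exactly one newline directly after
        # the last declaration and leave all other whitespace alone.
        return pattern[1:] if pattern.startswith("\n") else pattern
    if strategy == "all":
        # Strategy 3: drop all leading and trailing whitespace.
        return pattern.strip()
    raise ValueError(f"unknown strategy: {strategy}")


# Text that would follow "{input $count :number minimumFractionDigits=1}".
raw = "\n  You have eaten {$count} apples \n"
for name in ("none", "minimal", "all"):
    print(name, repr(trim_after_declarations(raw, name)))
```

Only the "none" strategy preserves the two leading spaces together with the newline, which is why a pattern that genuinely needs leading whitespace has to quote it (as in `{| |}and some more`) under the other two strategies.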
+ Examples using either "minimal" or "all" trimming: ``` From 77409a51619bddf1f41c08e6cd2b8e1dde215317 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Wed, 13 Sep 2023 11:41:50 +0200 Subject: [PATCH 03/24] Update 0474-text-vs-code.md --- exploration/0474-text-vs-code.md | 130 +++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index ad7c413713..caaedefd81 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -97,6 +97,10 @@ such as `:number`, `$var`, `|literal|`, `+bold`, and `@attr`. The current syntax supports unquoted literal values as operands. +Messages themselves are "simple strings" and must be considered to be a single +line of text. In many containing formats, newlines will be represented as the local +equivalent of `\n`. + ## Proposed Design TBD @@ -202,3 +206,129 @@ You have eaten {$count} apples ``` {| |}and some more ``` + +### Start with text, formalize for code + +_(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. +This section is highly experimental, was produced with the help of beer and tapas, +and is preserving a conversation from Slack)_ + +This approach assumes that most users want any string message to be +a valid message format pattern with the minimal amount of special decoration. +"Code" elements can be accessed with a minimum of special decoration. 
+ +**Make the keywords start with a distinct character** +``` +#input $count :number +#local $date1 = $date :datetime dateStyle=long +#match $count :number minFracDigits=2, $gender +#when 1, masculine {You received one message on {$date}} +#when *, masculine {You received {$count} messages on {$date}} +``` + +**Make the block-start keywords start with a distinct character** +``` +#input $count :number minFracDigits=2 #local $date1 = $date :datetime dateStyle=long +#match $count :number minFracDigits=2, $gender +when 1, masculine {You received one message on {$date}} +when *, masculine {You received {$count} messages on {$date}} +``` + +**Have a "message starts in code mode" sigil** +``` +#input {$count :number} +local $date1 = {$date :datetime dateStyle=long} +match {$count :number minFracDigits=2} {$gender} +when 1 masculine {You received one message on {$date}} +when * masculine {You received {$count} messages on {$date}} +``` + +_Permuations_ +``` +#input {$count :number} +#local $date1 = {$date :datetime dateStyle=long} +#match {$count :number minFracDigits=2} $gender $foo +when 1 masculine {You received one message on {$date}} +when * masculine {You received {$count} messages on {$date}} + +#input {$count :number dateStyle=long foo=bar} +#local $date1 = {$date :datetime dateStyle=long} +#match {$count :number minFracDigits=2} {$gender} +1 masculine {You received one message on {$date}} +* masculine {You received {$count} messages on {$date}} + +#input {$count :number dateStyle=long foo=bar} +#local $date1 = {$date :datetime dateStyle=long} +#match {$count :number minFracDigits=2} +#match {$gender} +1 masculine {You received one message on {$date}} +* masculine {You received {$count} messages on {$date}} + +#input $count:number(dateStyle=long foo=bar) +#local $date1 = $date:datetime(dateStyle=long) +#match [$count:number(minFracDigits=2) $gender] +[1 masculine] {You received one message on {$date}} +[* masculine] {You received {$count:number()} messages on 
{$date}} +``` +**Avoid keywords, use the sigils to signal code mode** +``` +$count:number(dateStyle=long, foo=bar,) +$count :number +$date1 = {$date :datetime dateStyle=long} +?? {$count :number minFracDigits=2} $gender $foo +[1 masculine] {You received one message on {$date}} +[* masculine] {You received {$count} messages on {$date}} +``` + +**Exploration of options side-by-side** +...or really _one above the other_... + +-- current syntax +``` +{Hello, {$username}!} +``` +-- start in text mode +``` +Hello, {$username}! +{$username}, welcome! +``` +-- current syntax +``` +input {$dist :number unit=km} +{The distance is {$dist}.} +``` +-- start in text mode, switch to code, stay until end of input +``` +#input {$dist :number unit=km} +{The distance is {$dist}.} +``` +-- or, start in text mode, switch to code, exit back into text (makes newline meaningful) +``` +#input {$dist :number unit=km}The distance is {$dist}. +``` + + +-- start in text mode +``` +#input {$count :number minFracDigits=2} +#match {$count} +1 {One apple.} +* {{$count} apples.} +``` +-- current syntax +input {$count :number minFracDigits=2} +match {$count} +when 1 {One apple.} +when * {{$count} apples.} + +=============================================================================== + +#input {$item :noun case=accusative} +#input {$color :adjective accord=$item} +{You bought a {$color} {$item}.} + +{You bought a {$color :adjective gender=$item.gender case=accusative} {$item :noun case=accusative}.} + +#input {$item :noun case=accusative} +{You bought a {$color :adjective agree=$item} {$item}.} +``` From b4b965485339d482754bd316a14a703dc1feed01 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed, 13 Sep 2023 09:42:14 +0000 Subject: [PATCH 04/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/exploration/0474-text-vs-code.md 
b/exploration/0474-text-vs-code.md index caaedefd81..c6ba550b38 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -209,7 +209,7 @@ You have eaten {$count} apples ### Start with text, formalize for code -_(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. +_(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. This section is highly experimental, was produced with the help of beer and tapas, and is preserving a conversation from Slack)_ @@ -218,6 +218,7 @@ a valid message format pattern with the minimal amount of special decoration. "Code" elements can be accessed with a minimum of special decoration. **Make the keywords start with a distinct character** + ``` #input $count :number #local $date1 = $date :datetime dateStyle=long @@ -227,6 +228,7 @@ a valid message format pattern with the minimal amount of special decoration. ``` **Make the block-start keywords start with a distinct character** + ``` #input $count :number minFracDigits=2 #local $date1 = $date :datetime dateStyle=long #match $count :number minFracDigits=2, $gender @@ -235,6 +237,7 @@ when *, masculine {You received {$count} messages on {$date}} ``` **Have a "message starts in code mode" sigil** + ``` #input {$count :number} local $date1 = {$date :datetime dateStyle=long} @@ -244,6 +247,7 @@ when * masculine {You received {$count} messages on {$date}} ``` _Permuations_ + ``` #input {$count :number} #local $date1 = {$date :datetime dateStyle=long} @@ -270,7 +274,9 @@ when * masculine {You received {$count} messages on {$date}} [1 masculine] {You received one message on {$date}} [* masculine] {You received {$count:number()} messages on {$date}} ``` + **Avoid keywords, use the sigils to signal code mode** + ``` $count:number(dateStyle=long, foo=bar,) $count :number @@ -284,42 +290,52 @@ $date1 = {$date :datetime dateStyle=long} ...or really _one above the other_... 
-- current syntax + ``` {Hello, {$username}!} ``` + -- start in text mode + ``` Hello, {$username}! {$username}, welcome! ``` + -- current syntax + ``` input {$dist :number unit=km} {The distance is {$dist}.} ``` + -- start in text mode, switch to code, stay until end of input + ``` #input {$dist :number unit=km} {The distance is {$dist}.} ``` + -- or, start in text mode, switch to code, exit back into text (makes newline meaningful) + ``` #input {$dist :number unit=km}The distance is {$dist}. ``` - -- start in text mode + ``` #input {$count :number minFracDigits=2} #match {$count} 1 {One apple.} * {{$count} apples.} ``` + -- current syntax input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} -when * {{$count} apples.} +when \* {{$count} apples.} =============================================================================== @@ -331,4 +347,7 @@ when * {{$count} apples.} #input {$item :noun case=accusative} {You bought a {$color :adjective agree=$item} {$item}.} + +``` + ``` From 7670caf7b3913ca88d14463732d76e68003b1aec Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Wed, 13 Sep 2023 11:50:50 +0200 Subject: [PATCH 05/24] Update 0474-text-vs-code.md --- exploration/0474-text-vs-code.md | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index c6ba550b38..2c2f240501 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -246,8 +246,7 @@ when 1 masculine {You received one message on {$date}} when * masculine {You received {$count} messages on {$date}} ``` -_Permuations_ - +**_Permuations_** ``` #input {$count :number} #local $date1 = {$date :datetime dateStyle=long} @@ -267,7 +266,10 @@ when * masculine {You received {$count} messages on {$date}} #match {$gender} 1 masculine {You received one message on {$date}} * masculine {You received {$count} messages on {$date}} +``` +**Remove most {} except to delimit 
placeholders and patterns** +``` #input $count:number(dateStyle=long foo=bar) #local $date1 = $date:datetime(dateStyle=long) #match [$count:number(minFracDigits=2) $gender] @@ -332,13 +334,15 @@ input {$dist :number unit=km} ``` -- current syntax +``` input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} +``` =============================================================================== - +``` #input {$item :noun case=accusative} #input {$color :adjective accord=$item} {You bought a {$color} {$item}.} @@ -347,7 +351,12 @@ when \* {{$count} apples.} #input {$item :noun case=accusative} {You bought a {$color :adjective agree=$item} {$item}.} - ``` -``` +While editing, notice the "single line" format of the above: + +> #input {$item :noun case=accusative}#input {$color :adjective accord=$item}{You bought a {$color} {$item}.} + +> #input {$item :noun case=accusative}{You bought a {$color :adjective agree=$item} {$item}.} + +> input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} From b85c0a2286fdca532befd331b31f87a5894f1380 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed, 13 Sep 2023 09:51:11 +0000 Subject: [PATCH 06/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 2c2f240501..0f0082402b 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -247,6 +247,7 @@ when * masculine {You received {$count} messages on {$date}} ``` **_Permuations_** + ``` #input {$count :number} #local $date1 = {$date :datetime dateStyle=long} @@ -269,6 +270,7 @@ when * masculine {You received {$count} messages on {$date}} ``` **Remove most {} except to delimit placeholders and patterns** + ``` #input $count:number(dateStyle=long foo=bar) #local $date1 = 
$date:datetime(dateStyle=long) @@ -334,6 +336,7 @@ input {$dist :number unit=km} ``` -- current syntax + ``` input {$count :number minFracDigits=2} match {$count} @@ -342,6 +345,7 @@ when \* {{$count} apples.} ``` =============================================================================== + ``` #input {$item :noun case=accusative} #input {$color :adjective accord=$item} From c2481488758a3c36895c2fe8364ba9ac7b0bed2a Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Wed, 13 Sep 2023 16:38:11 +0200 Subject: [PATCH 07/24] scratch pad use of this design document for purely evil reasons --- exploration/0474-text-vs-code.md | 49 ++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 0f0082402b..b96a7c6aed 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -325,6 +325,33 @@ input {$dist :number unit=km} ``` #input {$dist :number unit=km}The distance is {$dist}. ``` +``` +#input {$dist :number unit=km} +{$dist} is the distance. +``` +``` +#input {$dist :number unit=km} +{:number foo=bar} is the distance. +``` +``` +#input {$dist :number unit=km} +{42 :number} is the distance. 
+``` +quote the pattern to get starting whitespace: +``` +#input {$dist :number unit=km} +{ {42 :number} is the distance.} +``` +**Evil experiments with literal quoting** +``` +This horse is a {fast camel digits=foo} +This horse is a {fast :camel} +This horse is a {|fast | camel} +This horse is a {MY_BUNDLE_KEY :camel} +This horse is a {'fast' camel} +You can't have a {fast camel title="can\'t have a fast camel"} +You can\'t have a {fast camel aria-foo=$foo title=\"can\\'t have fast camel\"} +``` -- start in text mode @@ -364,3 +391,25 @@ While editing, notice the "single line" format of the above: > #input {$item :noun case=accusative}{You bought a {$color :adjective agree=$item} {$item}.} > input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} + + +--- +[x] spannable and standalone non-placeholders + [?] proposed syntax with three sigils +/-/# +[x] non mutable shared namespace using input and local keywords +[x] start in text mode for messages with no declarations and match + [x] need to write set of core example messages +[x] format to parts + [?] design for shape of formatted parts for embedded +[x] expression attributes use cases + [ ] design +[X!!] logo +[ ] Nmtoken +[ ] Overriding functions, extending functions, potentially namespacing +[x] have a stability policy + [?] 
actual stability policy (in progress) +[x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager + - annotations are available post declaration +[x] TAG review is a goal for ~November +[x] Received valuable external input and actually listened to it + From e8c1a8f323a17b4ada6c7b369a5c767095209c1b Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed, 13 Sep 2023 14:38:36 +0000 Subject: [PATCH 08/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index b96a7c6aed..d916f8b7ae 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -325,24 +325,31 @@ input {$dist :number unit=km} ``` #input {$dist :number unit=km}The distance is {$dist}. ``` + ``` #input {$dist :number unit=km} {$dist} is the distance. ``` + ``` #input {$dist :number unit=km} {:number foo=bar} is the distance. ``` + ``` #input {$dist :number unit=km} {42 :number} is the distance. ``` + quote the pattern to get starting whitespace: + ``` #input {$dist :number unit=km} { {42 :number} is the distance.} ``` + **Evil experiments with literal quoting** + ``` This horse is a {fast camel digits=foo} This horse is a {fast :camel} @@ -392,24 +399,22 @@ While editing, notice the "single line" format of the above: > input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} - --- + [x] spannable and standalone non-placeholders - [?] proposed syntax with three sigils +/-/# +[?] proposed syntax with three sigils +/-/# [x] non mutable shared namespace using input and local keywords [x] start in text mode for messages with no declarations and match - [x] need to write set of core example messages +[x] need to write set of core example messages [x] format to parts - [?] 
design for shape of formatted parts for embedded +[?] design for shape of formatted parts for embedded [x] expression attributes use cases - [ ] design +[ ] design [X!!] logo [ ] Nmtoken [ ] Overriding functions, extending functions, potentially namespacing [x] have a stability policy - [?] actual stability policy (in progress) -[x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - - annotations are available post declaration +[?] actual stability policy (in progress) +[x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - annotations are available post declaration [x] TAG review is a goal for ~November [x] Received valuable external input and actually listened to it - From 5d3d37c998e83a053e738470072f0233c26bbc39 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Wed, 13 Sep 2023 16:51:22 +0200 Subject: [PATCH 09/24] fixing the evil checklist --- exploration/0474-text-vs-code.md | 34 ++++++++++++++++---------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index d916f8b7ae..88cae66799 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -401,20 +401,20 @@ While editing, notice the "single line" format of the above: --- -[x] spannable and standalone non-placeholders -[?] proposed syntax with three sigils +/-/# -[x] non mutable shared namespace using input and local keywords -[x] start in text mode for messages with no declarations and match -[x] need to write set of core example messages -[x] format to parts -[?] design for shape of formatted parts for embedded -[x] expression attributes use cases -[ ] design -[X!!] logo -[ ] Nmtoken -[ ] Overriding functions, extending functions, potentially namespacing -[x] have a stability policy -[?] 
actual stability policy (in progress) -[x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - annotations are available post declaration -[x] TAG review is a goal for ~November -[x] Received valuable external input and actually listened to it +- [x] spannable and standalone non-placeholders +- [?] proposed syntax with three sigils +/-/# +- [x] non mutable shared namespace using input and local keywords +- [x] start in text mode for messages with no declarations and match +- [x] need to write set of core example messages +- [x] format to parts + - [?] design for shape of formatted parts for embedded +- [x] expression attributes use cases + - [ ] design +- [X!!] logo +- [ ] Nmtoken +- [ ] Overriding functions, extending functions, potentially namespacing +- [x] have a stability policy + - [?] actual stability policy (in progress) +- [x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - annotations are available post declaration +- [x] TAG review is a goal for ~November +- [x] Received valuable external input and actually listened to it From 71bfd8e4a498d10285b7d3ac1f2df3ae89bb7b9b Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed, 13 Sep 2023 14:51:44 +0000 Subject: [PATCH 10/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 88cae66799..b88fa74ec3 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -407,14 +407,14 @@ While editing, notice the "single line" format of the above: - [x] start in text mode for messages with no declarations and match - [x] need to write set of core example messages - [x] format to parts - - [?] design for shape of formatted parts for embedded + - [?] 
design for shape of formatted parts for embedded - [x] expression attributes use cases - - [ ] design + - [ ] design - [X!!] logo - [ ] Nmtoken - [ ] Overriding functions, extending functions, potentially namespacing - [x] have a stability policy - - [?] actual stability policy (in progress) + - [?] actual stability policy (in progress) - [x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - annotations are available post declaration - [x] TAG review is a goal for ~November - [x] Received valuable external input and actually listened to it From cc208545783196c6e4fb62e565680f4da306db59 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Wed, 13 Sep 2023 23:30:30 +0200 Subject: [PATCH 11/24] Remove checklist to the wiki --- exploration/0474-text-vs-code.md | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index b88fa74ec3..d7755ebde6 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -401,20 +401,3 @@ While editing, notice the "single line" format of the above: --- -- [x] spannable and standalone non-placeholders -- [?] proposed syntax with three sigils +/-/# -- [x] non mutable shared namespace using input and local keywords -- [x] start in text mode for messages with no declarations and match -- [x] need to write set of core example messages -- [x] format to parts - - [?] design for shape of formatted parts for embedded -- [x] expression attributes use cases - - [ ] design -- [X!!] logo -- [ ] Nmtoken -- [ ] Overriding functions, extending functions, potentially namespacing -- [x] have a stability policy - - [?] 
actual stability policy (in progress) -- [x] lazy/eager evaluation - we will not prescribe it and will attempt to avoid forcing eager - annotations are available post declaration -- [x] TAG review is a goal for ~November -- [x] Received valuable external input and actually listened to it From e48e3405985ae1aa3c89626c362efcb99ecd41b0 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed, 13 Sep 2023 21:30:49 +0000 Subject: [PATCH 12/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 1 - 1 file changed, 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index d7755ebde6..58d9aea221 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -400,4 +400,3 @@ While editing, notice the "single line" format of the above: > input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} --- - From 64a66f2ca44968f962e2215dc36ecf4ab3a6c2f8 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Fri, 22 Sep 2023 13:28:52 -0700 Subject: [PATCH 13/24] Tweak requirements and use cases --- exploration/0474-text-vs-code.md | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 58d9aea221..3fb0658ecf 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -7,6 +7,7 @@ Status: **Proposed**
Contributors: @eemeli
+ @aphillips
First proposed: 2023-09-13
Pull Request
 @@ -70,18 +71,44 @@ Rarely, messages need to include leading or trailing whitespace due to e.g. how they will be concatenated with other text, or as a result of being segmented from some larger volume of text. +--- + +Users editing a simple message and who wish to add an `input` or `local` annotation +to the message do not wish to reformat the message extensively. + +Users who have messages that include leading or trailing whitespace +want to ensure that this whitespace is included in the translatable +text portion of the message. Which whitespace characters are displayed at runtime +should not be surprising. + ## Requirements Easy things should be easy, and hard things should be possible. Developers and translators should be able to read and write the syntax easily in a text editor. +Translators (and their tools) are not software engineers, so we want our syntax +to be as simple, robust, and non-fussy as possible. +Multiple levels of complex nesting should be avoided, +along with any constructs that require an excessive +level of precision on the part of non-technical users. + As MessageFormat 2 will be at best a secondary language to all its users, it should conform to user expectations and require as little learning as possible. The syntax should avoid footguns, in particular as it's passed through various tools during formatting. +ASCII-compatible syntax. While support for non-ASCII characters for variable names, +values, literals, options, and the like is important, the syntax itself should +be restricted to ASCII characters. This allows the message to be parsed +visually by humans even when embedded in a syntax that requires escaping. + +Whitespace is forgiving. We _require_ the minimum amount of whitespace and allow +users to format or change unimportant whitespace as much as they want. +This avoids the need for translators or tools to be super pedantic about +formatting.
+ ## Constraints Limiting the range of characters that need to be escaped in plain text is important. @@ -206,7 +233,7 @@ You have eaten {$count} apples ``` {| |}and some more ``` - + ### Start with text, formalize for code _(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. From d4757b067cb1cb846f31562a763d839d4db10335 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Fri, 22 Sep 2023 20:29:13 +0000 Subject: [PATCH 14/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 3fb0658ecf..7324b130fe 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -73,11 +73,11 @@ or as a result of being segmented from some larger volume of text. --- -Users editing a simple message and who wish to add an `input` or `local` annotiation +Users editing a simple message and who wish to add an `input` or `local` annotiation to the message do not wish to reformat the message extensively. Users who have messages that include leading or trailing whitespace -want to ensure that this whitespace is included in the translatable +want to ensure that this whitespace is included in the translatable text portion of the message. Which whitespace characters are displayed at runtime should not be surprising. @@ -88,8 +88,8 @@ Easy things should be easy, and hard things should be possible. Developers and translators should be able to read and write the syntax easily in a text editor. Translators (and their tools) are not software engineers, so we want our syntax -to be as simple, robust, and non-fussy as possible. -Multiple levels of complex nesting should be avoided, +to be as simple, robust, and non-fussy as possible. 
+Multiple levels of complex nesting should be avoided, along with any constructs that require an excessive level of precision on the part of non-technical users. @@ -233,7 +233,7 @@ You have eaten {$count} apples ``` {| |}and some more ``` - + ### Start with text, formalize for code _(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. From 96e3d7e4fdabc8662fd4e4a2d2bf3958466bff47 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Fri, 22 Sep 2023 14:08:22 -0700 Subject: [PATCH 15/24] Proposing a design ... not expecting us to adopt it, but we need to make progress in deciding the specific issues here. --- exploration/0474-text-vs-code.md | 43 +++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 7324b130fe..07add6e0c9 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -130,7 +130,48 @@ equivalent of `\n`. ## Proposed Design -TBD +**Start in text mode** + +In this option, whitespace is trimmed from `pattern` constructs unless +the pattern is quoted. + +``` +Hello world! + +Hello {$user}! + +{input $now :datetime dateStyle=long} +Hello {$user}. Today is {$now} + +{local $now = {:systemGetCurrentTime :datetime dateStyle=medium}} +Hello {$user}. Today is {$now} + +{match {$count :number integer=true}} +{when 0} Hello {$user}. Today is {$now} and you have no geese. +{when one} Hello {$user}. Today is {$now} and you have {$count} goose. +{when few} { Hello {$user}, this message has spaces on the front and end. } +{when *} Hello {$user}. Today is {$now} and you have {$count} geese. +``` + +Bear in mind that whitespace has no meaning in our syntax, +so some of the above messages are actually: +``` +{input $now :datetime dateStyle=long}Hello {$user}. 
Today is {$now} + +{match {$count :number option=value}{when 0} Hello {$user}{when one} Hello {$user}{when *}{ Hello {$user} } +``` + +Key choices we should consider as a working group: + +- Do we like the double-brackets that `local` and `match` require? +- Is the whitespace story clear? Will tools generate the extra `{}` + around patterns "just to be sure"? +- Is the single-line representation (most users will actually see this!) + a good one? Is this noise good noise? + +I have put this as the proposed design because no one has stood up for starting +in code mode and this is the most conservative change. +This does not make it the right syntax. ## Alternatives Considered From ebc801640cf525d35e8d0927112d64109615adbb Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Fri, 22 Sep 2023 21:08:43 +0000 Subject: [PATCH 16/24] style: Apply Prettier --- exploration/0474-text-vs-code.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 07add6e0c9..1aa1b267e6 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -153,8 +153,9 @@ Hello {$user}. Today is {$now} {when *} Hello {$user}. Today is {$now} and you have {$count} geese. ``` -Bear in mind that whitespace has no meaning in our syntax, +Bear in mind that whitespace has no meaning in our syntax, so some of the above messages are actually: + ``` {input $now :datetime dateStyle=long}Hello {$user}. Today is {$now} From cd0a12e6bdfd9aa2cb6aef188805464160732683 Mon Sep 17 00:00:00 2001 From: Addison Phillips Date: Fri, 22 Sep 2023 14:09:38 -0700 Subject: [PATCH 17/24] Typo in `match` ... which is perhaps indicative of an answer to one of the questions about double-bracketing `match`... 
--- exploration/0474-text-vs-code.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index 1aa1b267e6..ddd2c51845 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -159,7 +159,7 @@ so some of the above messages are actually: ``` {input $now :datetime dateStyle=long}Hello {$user}. Today is {$now} -{match {$count :number option=value}{when 0} Hello {$user}{when one} Hello {$user}{when *}{ Hello {$user} } +{match {$count :number option=value}}{when 0} Hello {$user}{when one} Hello {$user}{when *}{ Hello {$user} } ``` Key choices we should consider as a working group: From 9fc96f5700accf26c750bfbf4e02b52c399b53a8 Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Sat, 23 Sep 2023 05:31:41 +0300 Subject: [PATCH 18/24] Update exploration/0474-text-vs-code.md --- exploration/0474-text-vs-code.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exploration/0474-text-vs-code.md b/exploration/0474-text-vs-code.md index ddd2c51845..290c2f8698 100644 --- a/exploration/0474-text-vs-code.md +++ b/exploration/0474-text-vs-code.md @@ -83,7 +83,7 @@ should not be surprising. ## Requirements -Easy things should be easy, and hard things should be possible. +Common things should be easy, uncommon things should be possible. Developers and translators should be able to read and write the syntax easily in a text editor. 
From cba607a6b92b194b34aae603df5743505dec0779 Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 14:55:35 +0300 Subject: [PATCH 19/24] Rename exploration/0474-text-vs-code.md -> exploration/text-vs-code.md --- exploration/{0474-text-vs-code.md => text-vs-code.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename exploration/{0474-text-vs-code.md => text-vs-code.md} (100%) diff --git a/exploration/0474-text-vs-code.md b/exploration/text-vs-code.md similarity index 100% rename from exploration/0474-text-vs-code.md rename to exploration/text-vs-code.md From 84879bb5e7c18c2e3348fce426655a7874b2bc01 Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 15:08:15 +0300 Subject: [PATCH 20/24] Drop the "highly experimental" section documenting a Slack conversation --- exploration/text-vs-code.md | 194 ------------------------------------ 1 file changed, 194 deletions(-) diff --git a/exploration/text-vs-code.md b/exploration/text-vs-code.md index 290c2f8698..1220b6e7e5 100644 --- a/exploration/text-vs-code.md +++ b/exploration/text-vs-code.md @@ -275,197 +275,3 @@ You have eaten {$count} apples ``` {| |}and some more ``` - -### Start with text, formalize for code - -_(From an exercise we did 2023-09-12 with @stasm, @mihnita, @aphillips. -This section is highly experimental, was produced with the help of beer and tapas, -and is preserving a conversation from Slack)_ - -This approach assumes that most users want any string message to be -a valid message format pattern with the minimal amount of special decoration. -"Code" elements can be accessed with a minimum of special decoration. 
- -**Make the keywords start with a distinct character** - -``` -#input $count :number -#local $date1 = $date :datetime dateStyle=long -#match $count :number minFracDigits=2, $gender -#when 1, masculine {You received one message on {$date}} -#when *, masculine {You received {$count} messages on {$date}} -``` - -**Make the block-start keywords start with a distinct character** - -``` -#input $count :number minFracDigits=2 #local $date1 = $date :datetime dateStyle=long -#match $count :number minFracDigits=2, $gender -when 1, masculine {You received one message on {$date}} -when *, masculine {You received {$count} messages on {$date}} -``` - -**Have a "message starts in code mode" sigil** - -``` -#input {$count :number} -local $date1 = {$date :datetime dateStyle=long} -match {$count :number minFracDigits=2} {$gender} -when 1 masculine {You received one message on {$date}} -when * masculine {You received {$count} messages on {$date}} -``` - -**_Permuations_** - -``` -#input {$count :number} -#local $date1 = {$date :datetime dateStyle=long} -#match {$count :number minFracDigits=2} $gender $foo -when 1 masculine {You received one message on {$date}} -when * masculine {You received {$count} messages on {$date}} - -#input {$count :number dateStyle=long foo=bar} -#local $date1 = {$date :datetime dateStyle=long} -#match {$count :number minFracDigits=2} {$gender} -1 masculine {You received one message on {$date}} -* masculine {You received {$count} messages on {$date}} - -#input {$count :number dateStyle=long foo=bar} -#local $date1 = {$date :datetime dateStyle=long} -#match {$count :number minFracDigits=2} -#match {$gender} -1 masculine {You received one message on {$date}} -* masculine {You received {$count} messages on {$date}} -``` - -**Remove most {} except to delimit placeholders and patterns** - -``` -#input $count:number(dateStyle=long foo=bar) -#local $date1 = $date:datetime(dateStyle=long) -#match [$count:number(minFracDigits=2) $gender] -[1 masculine] {You received 
one message on {$date}} -[* masculine] {You received {$count:number()} messages on {$date}} -``` - -**Avoid keywords, use the sigils to signal code mode** - -``` -$count:number(dateStyle=long, foo=bar,) -$count :number -$date1 = {$date :datetime dateStyle=long} -?? {$count :number minFracDigits=2} $gender $foo -[1 masculine] {You received one message on {$date}} -[* masculine] {You received {$count} messages on {$date}} -``` - -**Exploration of options side-by-side** -...or really _one above the other_... - --- current syntax - -``` -{Hello, {$username}!} -``` - --- start in text mode - -``` -Hello, {$username}! -{$username}, welcome! -``` - --- current syntax - -``` -input {$dist :number unit=km} -{The distance is {$dist}.} -``` - --- start in text mode, switch to code, stay until end of input - -``` -#input {$dist :number unit=km} -{The distance is {$dist}.} -``` - --- or, start in text mode, switch to code, exit back into text (makes newline meaningful) - -``` -#input {$dist :number unit=km}The distance is {$dist}. -``` - -``` -#input {$dist :number unit=km} -{$dist} is the distance. -``` - -``` -#input {$dist :number unit=km} -{:number foo=bar} is the distance. -``` - -``` -#input {$dist :number unit=km} -{42 :number} is the distance. 
-``` - -quote the pattern to get starting whitespace: - -``` -#input {$dist :number unit=km} -{ {42 :number} is the distance.} -``` - -**Evil experiments with literal quoting** - -``` -This horse is a {fast camel digits=foo} -This horse is a {fast :camel} -This horse is a {|fast | camel} -This horse is a {MY_BUNDLE_KEY :camel} -This horse is a {'fast' camel} -You can't have a {fast camel title="can\'t have a fast camel"} -You can\'t have a {fast camel aria-foo=$foo title=\"can\\'t have fast camel\"} -``` - --- start in text mode - -``` -#input {$count :number minFracDigits=2} -#match {$count} -1 {One apple.} -* {{$count} apples.} -``` - --- current syntax - -``` -input {$count :number minFracDigits=2} -match {$count} -when 1 {One apple.} -when \* {{$count} apples.} -``` - -=============================================================================== - -``` -#input {$item :noun case=accusative} -#input {$color :adjective accord=$item} -{You bought a {$color} {$item}.} - -{You bought a {$color :adjective gender=$item.gender case=accusative} {$item :noun case=accusative}.} - -#input {$item :noun case=accusative} -{You bought a {$color :adjective agree=$item} {$item}.} -``` - -While editing, notice the "single line" format of the above: - -> #input {$item :noun case=accusative}#input {$color :adjective accord=$item}{You bought a {$color} {$item}.} - -> #input {$item :noun case=accusative}{You bought a {$color :adjective agree=$item} {$item}.} - -> input {$count :number minFracDigits=2} match {$count} when 1 {One apple.} when \* {{$count} apples.} - ---- From 28fdb040031ef71ce881f208b8f6145c1bc146aa Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 16:00:39 +0300 Subject: [PATCH 21/24] Update alternatives, dropping explicit syntaxes --- exploration/text-vs-code.md | 203 +++++++++++++----------------------- 1 file changed, 71 insertions(+), 132 deletions(-) diff --git a/exploration/text-vs-code.md b/exploration/text-vs-code.md index 
1220b6e7e5..3a0b259795 100644 --- a/exploration/text-vs-code.md +++ b/exploration/text-vs-code.md @@ -25,16 +25,17 @@ Existing message and template formatting languages tend to start in "text" mode, and require special syntax like `{{` or `{%` to enter "code" mode. ICU MessageFormat and Fluent both support inline selectors -separated from the text using `{ ... }` for multi-variant messages. +separated from the text using `{…}` for multi-variant messages. +ICU MessageFormat is the only known format that uses `{…}` to also delimit text. [Mustache templates](https://mustache.github.io/mustache.5.html) -and related languages wrap "code" in `{{ ... }}`. +and related languages wrap "code" in `{{…}}`. In addition to placeholders that are replaced by their interpolated value during formatting, -this also includes conditional blocks using `{{#...}}`/`{{/...}}` wrappers. +this also includes conditional blocks using `{{#…}}`/`{{/…}}` wrappers. [Handlebars](https://handlebarsjs.com/guide/) extends Mustache expressions -with operators such as `{{#if ...}}` and `{{#each ...}}`, -as well as custom formatting functions that become available as e.g. `{{bold ...}}`. +with operators such as `{{#if …}}` and `{{#each …}}`, +as well as custom formatting functions that become available as e.g. `{{bold …}}`. [Jinja templates](https://jinja.palletsprojects.com/en/3.1.x/templates/) separate `{% statements %}` and `{{ expressions }}` from the base text. @@ -49,6 +50,12 @@ such as [Rails internationalization](https://guides.rubyonrails.org/i18n.html#pl and [Android String Resources](https://developer.android.com/guide/topics/resources/string-resource.html#Plurals) in XML. These formats rely on the resource format providing clear delineation of the beginning and end of a pattern. +Based on available data, +no more than 0.3% of all messages and no more than 0.1% of messages with variants +contain leading or trailing whitespace. 
+No more than one third of this whitespace is localizable, +and most commonly it's due to improper segmentation or other internationalization bugs. + ## Use-Cases Most messages in any localization system do not contain any expressions, statements or variants. @@ -130,148 +137,80 @@ equivalent of `\n`. ## Proposed Design -**Start in text mode** +### Start in text, encapsulate code, trim around statements -In this option, whitespace is trimmed from `pattern` constructs unless -the pattern is quoted. +Allow for message patterns to not be quoted. -``` -Hello world! +Encapsulate with `{…}` or otherwise distinguishing statements from +the primarily unquoted translatable message contents. -Hello {$user}! +For messages with multiple variants, +separate the variants using `when` statements. -{input $now :datetime dateStyle=long} -Hello {$user}. Today is {$now} +Trim whitespace between and around statements such as `input` and `when`, +but do not otherwise trim any leading or trailing whitespace from a message. +This allows for whitespace such as spaces and newlines to be used outside patterns +to make a message more readable. -{local $now = {:systemGetCurrentTime :datetime dateStyle=medium}} -Hello {$user}. Today is {$now} +Allow for a pattern to be `{{…}}` quoted +such that it preserves its leading and/or trailing whitespace +even when preceded or followed by statements. -{match {$count :number integer=true}} -{when 0} Hello {$user}. Today is {$now} and you have no geese. -{when one} Hello {$user}. Today is {$now} and you have {$count} goose. -{when few} { Hello {$user}, this message has spaces on the front and end. } -{when *} Hello {$user}. Today is {$now} and you have {$count} geese. -``` +## Alternatives Considered -Bear in mind that whitespace has no meaning in our syntax, -so some of the above messages are actually: +### Start in code, encapsulate text -``` -{input $now :datetime dateStyle=long}Hello {$user}. 
Today is {$now} +This approach treats messages as something like a resource format for pattern values. +Keywords are declared directly at the top level of a message, +and patterns are always surrounded by `{{…}}` or some other delimiters. -{match {$count :number option=value}}{when 0} Hello {$user}{when one} Hello {$user}{when *}{ Hello {$user} } -``` +Whitespace in patterns is never trimmed. -Key choices we should consider as a working group: +The `{{…}}` are required for all messages, +including ones that only consist of text. +Delimiters of the resource format are required in addition to this, +so messages may appear wrapped as e.g. `"{{…}}"`. -- Do we like the double-brackets that `local` and `match` require? -- Is the whitespace story clear? Will tools generate the extra `{}` - around patterns "just to be sure"? -- Is the single-line representation (most users will actually see this!) - a good one? Is this noise good noise? +This option is not chosen due to adding an excessive +quoting burden on all messages. -I have put this as the proposed design because no one has stood up for starting -in code mode and this is the most conservative change. -This does not make it the right syntax. +### Start in text, encapsulate code, re-encapsulate text within code -## Alternatives Considered +As in the proposed design, simple patterns are unquoted. +Patterns in messages with statements, however, +are required to always be surrounded by `{{…}}` or some other delimiters. -### Start in code, encapsulate text +This effectively means that some syntax will "enable" code mode for a message, +and that patterns in such a message need delimiters. -This approach treats messages as something like a resource format for pattern values. -Keywords are declared directly at the top level of a message, -and patterns are surrounded by `{...}`. 
+This option is not chosen due to adding an excessive +quoting burden on all multi-variant messages, +as well as introducing an unnecessary additional conceptual layer to the syntax. -Whitespace in patterns is never trimmed. +### Start in text, encapsulate code, trim minimally -Some code statements (variable declarations and match statements) -also use `{...}` to surround values at the top level, -so counting `{` instances is not sufficient to identify if a value is "code" or "text". +This is the same as the proposed design, +but with a different trimming rule: -The `{...}` are required for all messages, -including ones that only consist of text. -Delimiters of the resource format are required in addition to this, -so messages may appear wrapped as e.g. `"{...}"`. - -Examples: - -``` -{Hello world} -``` - -``` -{Hello {$user}} -``` - -``` -input {$count :number minimumFractionDigits=1} -{You have eaten {$count} apples} -``` - -``` -input {$count :number} -match {$count} -when 0 {You have no new message} -when one {You have {$count} new message} -when * {You have {$count} new messages} -``` - -``` -{ and some more} -``` - -### Start in text, encapsulate code - -The approach treats messages as template strings, -which may include statements and expressions surrounded by `{...}`. -Multi-variant messages require `match` and `when` statements that are followed by text at the top level. - -Whitespace around statements may need to be trimmed -as e.g. `input` statements may be more readable when placed on a separate line, -where they would be followed by a newline. -At least the following trimming strategies may be considered: - -1. Do not trim any whitespace. -1. Trim a minimal set of defined spaces: - - All spaces before and between variable statements. - - For single-variant messages, one newline after the last variable statement. 
- - For multivariant messages, - one space after a `when` statement and - one newline followed by any spaces before a subsequent `when` statement. -1. Trim all leading and trailing whitespace. - -All "code" statements are surrounded by `{...}`, -and all "text" is outside them. - -Simple messages are not surrounded by any delimiters -other that what may be required by the resource format. - -Depending on the details of the syntax of code inside the `{...}`, -unquoted non-numeric literals may need to be removed from the syntax. - -Examples using either "minimal" or "all" trimming: - -``` -Hello world -``` - -``` -Hello {$user} -``` - -``` -{input $count :number minimumFractionDigits=1} -You have eaten {$count} apples -``` - -``` -{input $count :number} -{match {$count}} -{when 0} You have no new message -{when one} You have {$count} new message -{when *} You have {$count} new messages -``` - -``` -{| |}and some more -``` +- Trim all spaces before and between declarations. +- For single-variant messages, trim one newline after the last declaration. +- For multivariant messages, + trim one space after a `when` statement and + one newline followed by any spaces before a subsequent `when` statement. + +This option is not chosen due to the quoting being too magical. +Even though this allows for all patterns with whitespace to not need quotes, +the cost in complexity is too great. + +### Start in text, encapsulate code, trim maximally + +This is the same as the proposed design, +but with a different trimming rule: + +- Trim all leading and trailing whitespace for each pattern. + +Expressing the trimming on patterns rather than statements +means that leading and trailing spaces are also trimmed from simple messages. +This option is not chosen due to this being somewhat surprising, +especially when messages are embedded in host formats that have predefined means +of escaping and/or trimming leading and trailing spaces from a value. 
From ff50c23b59511f8b3feef656d4372108c8d0a0a6 Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 19:28:24 +0300 Subject: [PATCH 22/24] Apply suggestions from code review Co-authored-by: Addison Phillips --- exploration/text-vs-code.md | 38 ++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/exploration/text-vs-code.md b/exploration/text-vs-code.md index 3a0b259795..d1785360b8 100644 --- a/exploration/text-vs-code.md +++ b/exploration/text-vs-code.md @@ -61,18 +61,29 @@ and most commonly it's due to improper segmentation or other internationalizatio Most messages in any localization system do not contain any expressions, statements or variants. These should be expressible as easily as possible. -Many messages include expressions that are to be interpolated during formatting. -For example, a greeting like "Hello, user!" may be formatted in many locales with the `user` -being directly set by an input variable. - -Sometimes, interpolated values need explicit formatting within a message. -For example, formatting a message like "You have eaten 3.2 apples" -may require the input numerical value -to be formatted with an explicit `minimumFractionDigits` option. - -Some messages require multiple variants. -This is often related to plural cases, such as "You have 3 new messages", -where the value `3` is an input and the "messages" needs to correspond with its plural category. +Many messages include expressions that are meant to be replaced during formatting. +For example, a greeting like "Hello, {$username}!" would be formatted with the variable +`$username` being replaced by an input variable. + +In some rare cases, replacement variables might be added (or removed) in one particular +locale versus messages in other locales. + +Sometimes, the replacement variables need to be formatted. 
+For example, formatting a message like `You have {$distance} kilometers to go` +requires that the numeric value `{$distance}` be formatted as a number according to the locale: `You have 1,234 kilometers to go`. + +Formatting of replacement variables might also require tailoring. +For example, if the user wants to show fractions of a kilometer in the above example +they might include a `minimumFractionDigits` option to get a result like +`You have 1,234.5 kilometers to go`. + +Some messages need to choose between multiple patterns (called "variants"). +For example, this is often related to the handling of numeric values, +in which the pattern used for formatting depends on one of the data values +according to its plural category +(see [CLDR Plural Rules](https://cldr.unicode.org/index/cldr-spec/plural-rules) for more information). +So, in American English, the formatter might need to choose between formatting +`You have 1 kilometer to go` and `You have 2 kilometers to go`. Rarely, messages needs to include leading or trailing whitespace due to e.g. how they will be concatenated with other text, @@ -119,9 +130,6 @@ formatting. ## Constraints Limiting the range of characters that need to be escaped in plain text is important. -Following past precedent, -this design doc will only consider encapsulation styles which -start with `{` and end with `}`. The current syntax includes some plain-ascii keywords: `input`, `local`, `match`, and `when`. 
From f898db61ca2bdebb76641dd9d22be67e306829df Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 19:39:54 +0300 Subject: [PATCH 23/24] Add "Start in text, encapsulate code, do not trim" --- exploration/text-vs-code.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/exploration/text-vs-code.md b/exploration/text-vs-code.md index d1785360b8..5e13cd0c05 100644 --- a/exploration/text-vs-code.md +++ b/exploration/text-vs-code.md @@ -222,3 +222,16 @@ means that leading and trailing spaces are also trimmed from simple messages. This option is not chosen due to this being somewhat surprising, especially when messages are embedded in host formats that have predefined means of escaping and/or trimming leading and trailing spaces from a value. + +### Start in text, encapsulate code, do not trim + +This is the same as the proposed design, +but with two simplifications: + +- No whitespace is ever trimmed. +- Quoting a pattern with `{{…}}` is dropped as unnecessary. + +With these changes, +all whitespace would need to be explicitly within the "code" part of the syntax, +and patterns could never be separated from statements +without adding whitespace to the pattern. From 692c4660cd42c4420a1346d665fcf053bdbb642d Mon Sep 17 00:00:00 2001 From: Eemeli Aro Date: Tue, 24 Oct 2023 20:09:02 +0300 Subject: [PATCH 24/24] Refer to authors/developers/translators, not "users" --- exploration/text-vs-code.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/exploration/text-vs-code.md b/exploration/text-vs-code.md index 5e13cd0c05..816d12d9a3 100644 --- a/exploration/text-vs-code.md +++ b/exploration/text-vs-code.md @@ -73,7 +73,7 @@ For example, formatting a message like `You have {$distance} kilometers to go` requires that the numeric value `{$distance}` be formatted as a number according to the locale: `You have 1,234 kilometers to go`. Formatting of replacement variables might also require tailoring. 
-For example, if the user wants to show fractions of a kilometer in the above example +For example, if the author wants to show fractions of a kilometer in the above example they might include a `minimumFractionDigits` option to get a result like `You have 1,234.5 kilometers to go`. @@ -91,13 +91,13 @@ or as a result of being segmented from some larger volume of text. --- -Users editing a simple message and who wish to add an `input` or `local` annotiation +Developers editing a simple message and who wish to add an `input` or `local` annotiation to the message do not wish to reformat the message extensively. -Users who have messages that include leading or trailing whitespace +Developers who have messages that include leading or trailing whitespace want to ensure that this whitespace is included in the translatable -text portion of the message. Which whitespace characters are displayed at runtime -should not be surprising. +text portion of the message. +Which whitespace characters are displayed at runtime should not be surprising. ## Requirements @@ -109,9 +109,9 @@ Translators (and their tools) are not software engineers, so we want our syntax to be as simple, robust, and non-fussy as possible. Multiple levels of complex nesting should be avoided, along with any constructs that require an excessive -level of precision on the part of non-technical users. +level of precision on the part of non-technical authors. -As MessageFormat 2 will be at best a secondary language to all its users, +As MessageFormat 2 will be at best a secondary language to all its authors and editors, it should conform to user expectations and require as little learning as possible. The syntax should avoid footguns, @@ -122,8 +122,9 @@ values, literals, options, and the like are important, the syntax itself should be restricted to ASCII characters. This allows the message to be parsed visually by humans even when embedded in a syntax that requires escaping. -Whitespace is forgiving. 
We _require_ the minimum amount of whitespace and allow -users to format or change unimportant whitespace as much as they want. +Whitespace is forgiving. +We _require_ the minimum amount of whitespace and allow +authors to format or change unimportant whitespace as much as they want. This avoids the need for translators or tools to be super pedantic about formatting.
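
The trimming rule from the proposed design in this series ("Start in text, encapsulate code, trim around statements") can be illustrated with a rough sketch. This is not the MessageFormat 2 grammar or any real parser: the names `KEYWORDS`, `_scan` and `trim_message` are hypothetical, braces are matched by simple depth counting, escapes and `|…|` literals are ignored, and any top-level block that starts with `{{` and ends with `}}` is assumed to be a quoted pattern. Treating whitespace next to a quoted pattern as trimmable is also an assumption made here, not something the doc states.

```python
# Sketch only: hypothetical helper names, simplified brace matching,
# no handling of escapes or |...| literals.
KEYWORDS = {"input", "local", "match", "when"}


def _scan(src):
    """Split a message into top-level ("statement" | "quoted" | "text") segments."""
    segments, buf, i = [], [], 0
    while i < len(src):
        if src[i] != "{":
            buf.append(src[i])
            i += 1
            continue
        # Find the brace that closes the block opened at position i.
        depth, j = 0, i
        while j < len(src):
            if src[j] == "{":
                depth += 1
            elif src[j] == "}":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        if j == len(src):
            raise ValueError("unbalanced braces")
        block = src[i : j + 1]
        i = j + 1
        words = block[1:-1].split(None, 1)
        if block.startswith("{{") and block.endswith("}}"):
            kind = "quoted"  # assumption: {{...}} always quotes a whole pattern
        elif words and words[0] in KEYWORDS:
            kind = "statement"
        else:
            buf.append(block)  # a placeholder such as {$user} stays in the text
            continue
        if buf:
            segments.append(("text", "".join(buf)))
            buf = []
        segments.append((kind, block))
    if buf:
        segments.append(("text", "".join(buf)))
    return segments


def trim_message(src):
    """Trim whitespace only where text touches a statement or a quoted pattern."""
    segments = _scan(src)
    out = []
    for k, (kind, text) in enumerate(segments):
        if kind == "statement":
            out.append(("statement", text))
        elif kind == "quoted":
            # Quoted patterns keep their leading and trailing whitespace.
            out.append(("pattern", text[2:-2]))
        else:
            prev_kind = segments[k - 1][0] if k > 0 else None
            next_kind = segments[k + 1][0] if k + 1 < len(segments) else None
            if prev_kind in ("statement", "quoted"):
                text = text.lstrip()
            if next_kind in ("statement", "quoted"):
                text = text.rstrip()
            if text:  # whitespace-only gaps between statements disappear
                out.append(("pattern", text))
    return out
```

Under these assumptions, a simple message such as `Hello {$user}! ` passes through unchanged, including its trailing space, while newlines and spaces placed between `{input …}`, `{match …}` and `{when …}` statements are dropped, and a `{{ … }}` quoted variant keeps the spaces inside the quotes.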