Add consensus decisions #154
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# Consensus Decisions | ||
|
||
During its proceedings, the working group has reached internal consensus on a number of issues. | ||
This document enumerates those, and provides a reference for later actions. | ||
|
||
### Sources | ||
|
||
For more details on the process that led to these decisions, please refer to the following: | ||
|
||
- **Consensus 1 & 2:** | ||
Identified as prerequisites for maintaining backwards-compatibility with MessageFormat 1 once Consensus 3 & 4 are agreed upon. | ||
Reached during the meetings of the [issue #103](https://github.com/unicode-org/message-format-wg/issues/103) task-force, and codified during the [October 2020 task-force meeting](https://github.com/unicode-org/message-format-wg/blob/HEAD/meetings/task-force/%23103-2020-10-26.md). | ||
Accepted at the [November 2020 meeting](https://github.com/unicode-org/message-format-wg/blob/HEAD/meetings/2020/notes-2020-11-16.md) of the working group. | ||
- **Consensus 3 & 4:** | ||
The core result of the [issue #103](https://github.com/unicode-org/message-format-wg/issues/103) task-force ([minutes](https://github.com/unicode-org/message-format-wg/tree/master/meetings/task-force)). | ||
Reached in principle during the [December 2020 meeting](https://github.com/unicode-org/message-format-wg/blob/HEAD/meetings/2020/notes-2020-12-14.md) of the working group. | ||
Codified in [issue #137](https://github.com/unicode-org/message-format-wg/issues/137). | ||
Discussed and accepted at the [January 2021](https://github.com/unicode-org/message-format-wg/issues/146) and [February 2021](https://github.com/unicode-org/message-format-wg/blob/HEAD/meetings/2021/notes-2021-02-15.md) meetings of the working group. | ||
- **Consensus 5 & 6:** | ||
The solution for [issue #127](https://github.com/unicode-org/message-format-wg/issues/127). | ||
Codified in [issue #137](https://github.com/unicode-org/message-format-wg/issues/137) during the [January 2021 meeting](https://github.com/unicode-org/message-format-wg/issues/146) of the working group. | ||
Discussed and accepted at the [February 2021 meeting](https://github.com/unicode-org/message-format-wg/blob/HEAD/meetings/2021/notes-2021-02-15.md) of the working group. | ||
|
||
## 1: Include message references in the data model. | ||
|
||
**Discussion:** | ||
Implementers would find a way to include references anyway, but including them in the data model (the standard) makes them subject to best practices. | ||
It will unfortunately still be possible, but much more difficult, for users to do “the wrong thing” by concatenating strings or messages. | ||
|
||
One of the drawbacks of message references is that referenced messages effectively have a public API (names of parameters, variables, variants, etc.) which must be consistent across all callsites. | ||
This leads us to consensus 2. | ||
|
||
## 2: Allow message references to include parameters in a form that enables their validation. | ||
|
||
**Discussion:** | ||
The variables/fields passed should not be completely untyped and unchecked. | ||
We want a validation mechanism that can allow providing early error feedback to the translators and developers. | ||
We need to decide when the validation can and should happen, including the meaning of “build time” and “run time” with regard to validation. | ||
|
||
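As a rough illustration of the kind of early validation described in Consensus 2, the following sketch checks the arguments of a message reference against the parameters declared by the referenced message. Everything in it is hypothetical: the `Message` and `MessageRef` classes, the `validate_ref` function, and the idea of declaring parameters as a name-to-type mapping are assumptions made for this example, not part of any data model the group has agreed on.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical message: a name plus the parameters it declares."""
    name: str
    params: dict[str, type] = field(default_factory=dict)


@dataclass
class MessageRef:
    """Hypothetical reference to another message, with the arguments it passes."""
    target: str
    args: dict[str, object] = field(default_factory=dict)


def validate_ref(ref: MessageRef, registry: dict[str, Message]) -> list[str]:
    """Return a list of problems; an empty list means the reference is valid."""
    target = registry.get(ref.target)
    if target is None:
        return [f"unknown message: {ref.target}"]
    errors = []
    # Parameters the target declares but the reference does not supply.
    for name in sorted(target.params.keys() - ref.args.keys()):
        errors.append(f"missing parameter: {name}")
    # Arguments the reference supplies but the target does not declare.
    for name in sorted(ref.args.keys() - target.params.keys()):
        errors.append(f"unexpected parameter: {name}")
    # Supplied arguments whose values do not match the declared type.
    for name in sorted(target.params.keys() & ref.args.keys()):
        expected = target.params[name]
        if not isinstance(ref.args[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    return errors
```

Because the check only needs the declarations and the reference, a function like this could in principle run either in a build step or at run time; which of those applies is exactly the open question noted above.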
## 3: Allow for selectors to select a case depending on the value of one or more input arguments. | ||
|
||
**Discussion:** | ||
This is a prerequisite for top-level selectors to be able to represent complex messages, without requiring those messages to be split up in an unergonomic manner. | ||
This is an extension or relaxation of what's allowed in MessageFormat 1. | ||
|
||
While message references make it technically possible for the data model to represent multi-argument selectors otherwise, this requires the use of n²-1 artificial "messages", where n is the number of arguments. This is not desirable. | ||
|
||
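To make the idea of a selector over more than one input argument concrete, here is a minimal sketch. It is not one of the data model candidates under discussion: the case list, the `select_case` and `format_message` functions, the toy English-only `plural_category` helper, and the use of `"other"` as a per-position wildcard are all assumptions made for illustration.

```python
def plural_category(n: int) -> str:
    """Toy English-only plural rule (assumption; real rules come from CLDR)."""
    return "one" if n == 1 else "other"


def select_case(cases, keys):
    """Return the pattern of the first case whose key tuple matches `keys`.

    A case key of "other" matches any value in that position.
    """
    for case_key, pattern in cases:
        if len(case_key) == len(keys) and all(
            c == k or c == "other" for c, k in zip(case_key, keys)
        ):
            return pattern
    raise LookupError(f"no case matches {keys!r}")


# One top-level selector over two arguments: (plural category, message type).
CASES = [
    (("one", "email"), "You have one unread email."),
    (("one", "other"), "You have one unread message."),
    (("other", "email"), "You have {count} unread emails."),
    (("other", "other"), "You have {count} unread messages."),
]


def format_message(count: int, msg_type: str) -> str:
    """Select a case using both arguments, then fill in the placeholder."""
    pattern = select_case(CASES, (plural_category(count), msg_type))
    return pattern.format(count=count)
```

Each case is a complete sentence, so a translator never sees a fragment, and no auxiliary messages are needed to combine the two arguments.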
## 4: Only allow for selectors at the top level of a message. | ||
|
||
**Discussion:** | ||
Requiring selectors to only be available at the top level is a good way of helping to maintain the translatability of messages, as well as otherwise guiding MessageFormat 2 users towards good practices. | ||
|
||
After an in-depth exploration of the problem space, we have determined that while selectors are a necessary feature of MessageFormat, it is not necessary for them to be available within the body of a message, or directly within a case of a parent selector. | ||
|
||
All identified use cases of such constructions may be cleanly represented using a top-level selector that may use more than one input argument to select among a set of messages. | ||
Furthermore, we may enable complete reversibility of message transformations to and from languages such as MessageFormat 1 and Fluent by using message references. | ||
|
||
## 5: Top-level selectors together with message references provide the same value as nested selectors at a lower cost. | ||
|
||
**Discussion:** | ||
Nested selectors provide capabilities that may be useful in avoiding variant permutation explosion in edge cases, but their use has not been evaluated in production localization systems to date. | ||
The group believes that the known value of this feature can be sufficiently covered by the combination of message references and top-level selection features, which together provide a sufficient feature set at a lower cost to the ecosystem than nested selectors would. | ||
|
||
## 6: The model will be designed to leave the door open for nested selectors being potentially reconsidered in the future. | ||
|
||
**Discussion:** | ||
The cost analysis of the nested selectors feature was performed in the absence of sufficient in-field experience of use in production systems. | ||
In result, the group's decision to not currently incorporate the feature is based on the lack of sufficient known value that would require them, which the group recognizes may change in the future. | ||
Both this paragraph and the following one say "in result". This is also oddly worded. I'd suggest:

@aphillips Would you be open to having the current text be accepted provisionally in this PR, and then improving the language of Consensus 6 in a separate PR? I'm getting the sense that this one is a bit more divisive than the rest, and that it might be good to have a more focused discussion on its particulars.

Go for it, although these suggestions are more about English prosody than content and I think we've about wrung any additional value out of this thread. |
||
In result, it is the intent of the group to design MessageFormat 2 in a way that wouldn't prevent future revisions of the standard to be extended with nested selectors feature. | ||
This reads oddly. I would suggest saying this in a positive way:
Or, if we prefer to talk about group intentions:
|
The artificiality/undesirability of the n^2-1 messages is in the eye of the beholder. While it is the case that nested select/plural structures beyond a couple of levels produce a lot of rather repetitious (nearly identical but often subtly different--particularly in a forgiving language like English) messages, this is not necessarily undesirable. When the localization system can participate actively, the burden on translators can be reduced.
I am not arguing for folks to have 10-level nested structures, please note. The size of the data is a problem and consistency management becomes a chore. But I often see teams writing 2-3 levels of nest. The best part is, we can make tools that understand this stuff and make it easier to author and it is something I can teach to developers. If we didn't provide this, developers would go back to their bad old ways of writing switches in code or doing substring replacement (grammatical consistency forgotten).
I don't have an alternate suggestion for wording just now, but this is something I have my eye on.
It also occurs to me that I may not understand the term "nested" in this context. For me, this is a "nested" structure (in this case a select in a plural--I copied this from a code review and I'd have written it with the select on the outside, but still...):
Is this what we mean by nested? Or would nested mean the more classical MessageFormat 1 like:

The current aim is not to allow for one selector to be inside another selector in the same message, instead enabling the representation of messages such as your example by having that top-level selector be able to use more than one variable (e.g. both `unreadCount` and `msgType`) to select one of the cases. Here's what your message would look like with one of the data model candidates we're developing:

The n^2 -1 reference in the text is for one of the alternative representations of this message, where we avoid using more than one argument for a selector by extracting variants into separate messages, and one of those is then picked by a parent message. Kind of like this:
The "actual" message is still the last one, but with only one argument being used for the top-level selectors, the three others are also required.
@eemeli thanks for this. I had followed the discussion of matrixed selectors elsewhere, which is why I hadn't come out of the woodwork with comments earlier. Your top example and mine are just "misspellings" of each other--functionally equivalent while being syntactically different. As long as I was mentally able to map the two functionally, I wasn't concerned. When I saw and paused to read the proposed consensus, though, I went "huh? must have missed something here..."
I think the more relevant detail here is: "This allows selectors to represent complex messages while avoiding linguistically problematic constructs that can occur when selectors operate on only part of a message" (... such as found in MF1)
I don't find this sentence helpful. This doesn't relax any constraints of MF1 (since it's a separate syntax altogether). I suppose it could be "an extension". But really it's just different from the philosophy of MF1.
The latter example is a logical outgrowth of message references. But I don't see how the proposed text leads here: this should be reserved for a discussion of references, as it doesn't have anything to do with number 3?