New syntax for meta-data #7
Comments
One of the big wins of FTL was that everything about the message was in the value part of the syntax. There is some level of beauty that you start the message with the ID. Right now, only comments break that. For tooling that is a boost. Also, does the
create challenges in error recovery? The ' ' would also be a typo that would be really hard to debug, as
would be totally legal, right? |
I think my vote would go for semantic comments for meta-data. |
I think we used the I'd love to discuss about semantic comments more. JSDoc-style |
Hi @Pike, thanks for the heads-up. I'll check. |
@Pike suggested that we separate the meta-information from semantic comments. The reason for this is that he sees semantic comments as relating to the toolchain and the process ( He suggested the following syntax:
The reason to use the brackets is that it closely resembles the way this information will be used. This in turn improves the copy&paste-ability of the syntax:
Pike also likes the idea that everything defined below the |
I like @Pike's proposal and I think it's sound. I'd like to hear @zbraniecki's thoughts, of course. I have some reservations, too: I was hoping we could piggy-back on #16 to implement this. Meta-information should be rare enough that maybe it shouldn't get its own syntax. OTOH, it is also what makes Fluent and FTL very powerful. On the note of being rare enough, @Pike and I also discussed not allowing meta-information on messages which have attributes. Such messages are meant to localize UI widgets and should not carry grammatical information. In fact, maybe we should rename meta-information to grammatical data or something similar. |
My initial reaction is that this syntax seems confusing.
This is my thinking too. As we said in the beginning - in all our work with L20n/FTL so far, we failed to find another example of the use case beyond For that reason I find the idea of adding a specific syntax excessive. It adds a new source of potential bugs and errors in malformed content, in order to serve a single use case. It will work, and as we said it's more important how users will retrieve that bit because it'll be way more common, but I'm not sure if we should be adding a whole new data type on On the other hand, functionally, I agree that semantic comments as we're thinking about them are functionally different from meta information like So, I'll probably be reluctantly ok with this proposal, but I have another:
I recognize that it doesn't play with @Pike 's "all localizable info below identifier", but I guess I just don't share this concern. |
The idea behind keeping the localizer data beneath the ID is one of incremental tool support: It allows l10n tools to have the most rudimentary support, like we currently do for pontoon. You get a text area, and anything in that area is to be edited by localizers. The other part about using the [] mark-up (to avoid the word syntax) is that [] denotes the option definition and reference for variants. Meta is the same thing in the reverse direction, and there's beauty in keeping [] as an easy to copy-n-paste markup on both source and target of the reference. I can see us explaining [] as references between messages, and you never need to translate one markup into another. |
Wouldn't it be easier for a tool to gracefully downgrade to a text editor if the whole message, including the comments can be parsed and serialized?
I'm still not completely sold on this reverse direction thing. Grammatical information defined as meta-data has little to do with variants, doesn't it?
There's some beauty in using I don't have any better ideas right now and I see the values of everything previously suggested here. I'm tempted to postpone this issue until a later milestone. |
This sentence describes my sentiment very well. |
I don't want to rush a design decision here. Let's move this out of the scope of 0.2. This means that temporarily the syntax will not give any dedicated way of defining language-specific grammatical data. (As a workaround, it's still possible to create entirely new local messages containing that data and refer to them, e.g. |
Do we get a good baseline to, say, ship L20n on Android without coming to a conclusion here? To the actual conversation, let me try to depict my thinking: brand.ftl:
updates.ftl:
(omg, butchering some other language's grammar here) My point is that when I resolve the variants of brandName, I use I think it's a good idea for the reverse direction to also use |
I believe we should reach a solution here before we release L20n on Android. |
+1 to that. I just don't want to lower the quality of 0.2 by rushing this decision right now. |
@Pike, did you mean
I see what you mean: in a selector-less list of variants, we also use brackets to define variants and we match them from the outside. In case of variants, however, the symmetry is between the definition and the reference. Both use
You'll never find yourself trying to match This is not true for grammatical information. Once it's defined, it's meant to be matched in other
Furthermore, you must not assume that you can reference ...unless it doesn't. What if we used variants for all grammatical information? Variants are private and can be accessed from other messages. Grammatical information will be only added to messages which already may have other grammatical variants. We wouldn't be adding any new syntax. The word "variant" may not be the best one here, but in general, the construct seems to lend itself well to the use-case. In English:
In French:
In Polish:
Semantically, gender isn't a facet of the string value of |
Groundhog Day. That's the train of thought that led us to traits. |
On a less snarky note, putting meta data into the variants would
|
Yes, I know. I'm looking for solutions everywhere I can find them :) |
Also, I feel like this is related but not accurate. We've always had three types of data: variants, grammatical descriptors and attributes. With traits, we lumped all of them together. Previously (L20n 1.0) descriptors and attributes were expressed with the same syntax. Even earlier (your designs from a long time ago) attributes and variants were together, while descriptors were separate. I feel like we're going in circles. |
Probably bad, or at least unintended. That would only happen if the meta data variant has the
That would be possible with nested What I really dislike about my proposal is that it forces localizers to find names for the meta-data: gender, animacy, etc. I'd much prefer a solution with binary descriptors, like "masculine". I'll come back to this issue next week and try to get some perspective this week. |
After a short break the idea of putting the grammatical information into variants seems bad, I admit. Perhaps it was a necessary step back for me to consider other options :) Over the weekend I did some small-scale user-testing. I presented two FTL files, one in English and another one in Polish to a few friends and asked them to complete the Polish translation. The only thing they knew about FTL beforehand was that translations had unique identifiers. The Polish file also already featured some grammar-sensitive syntax. After completing the task (which went very well) I asked a few follow-up questions. Below is a bullet-point summary of the conclusions:
Based on that, here is my newest proposal:
|
I wonder if "classes" would be confusing as name for these definitions (e.g. gender).
How do you associate two or more classes to a string?
vs
There is one thing that I find confusing though:
|
(I'm going to use the
I'd like to think of them as tags, and actually just call them that:
I understand the rationale. I think there are two ways to go forward and they're not mutually exclusive:
We could start with the first one and add the second one as syntax sugar later on. |
What would
do? I'm concerned that adding two variants with a subtle difference would cause more confusion than it would help. |
It would try to match |
After a lot more thinking: I like @Pike's proposal in #7 (comment) the most. I realized that I don't see a use-case for matching against the values of messages. Doing so would make the translation not portable. If a language has special rules for nouns starting with a vowel, it's much better to match a hashtag |
Tags are binary values attached to messages. They are language-specific and can be used to describe grammatical characteristics of the message.

    brand-name = Firefox #masculine
    brand-name = Aurora #feminine #vowel

Tags can be used in select expressions by matching a hashtag name to the message:

    has-updated = { brand-name ->
        [masculine] …
        [feminine] …
       *[other] …
    }

Tags can only be defined on messages which have a value and don't have any attributes.
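To make the matching rule above concrete, here is a minimal Python sketch of how a select expression could pick a variant from a message's tags. It is only an illustration: the `Message` class, the `select` function and the variant texts are hypothetical names made up for this example, not part of any Fluent implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical model of a message: a value plus a set of binary tags."""
    value: str
    tags: frozenset = frozenset()


def select(selector: Message, variants: dict, default: str) -> str:
    """Return the first variant whose key is one of the selector's tags.

    Variants are checked in definition order. `default` names the variant
    used when no tag matches (the role played by `*[other]`).
    """
    for key, text in variants.items():
        if key in selector.tags:
            return text
    return variants[default]


# Models: brand-name = Aurora #feminine #vowel
brand_name = Message("Aurora", frozenset({"feminine", "vowel"}))

# Models the has-updated select expression; the texts are placeholders.
has_updated = {
    "masculine": "variant for masculine names",
    "feminine": "variant for feminine names",
    "other": "fallback variant",
}

print(select(brand_name, has_updated, default="other"))
```

Because the tags are binary, the variant keys never need to name the property they describe (gender, here); they only name the tag.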
Just as a mental check, does it mean that we're 100% sure that we will never want to match against the value? It seems to me like we won't, but I want us all to think it through explicitly because if we implement what :stas is proposing we will never have an intuitive way to do that :) |
Thanks, @zbraniecki, for asking. If we ever want to change our mind, we can implement a new approach inside of how variant keys match against the selector. If it's a Message, we can first look into its tags and then fall back onto its value. Or we can provide functions that allow the user to be more specific: That said, I doubt that we'll want or need to do that. Famous last words? |
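The "tags first, then value" fallback mentioned above could look roughly like the Python sketch below. It assumes a two-pass lookup: all variant keys are tried against the tags before any key is compared to the value. That ordering is one possible reading of the idea, and `Message` and `select_smart` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical model of a message: a value plus a set of binary tags."""
    value: str
    tags: frozenset = frozenset()


def select_smart(selector: Message, variants: dict, default: str) -> str:
    """Match variant keys against tags first, then fall back to the value."""
    # First pass: a variant key that names one of the message's tags wins.
    for key, text in variants.items():
        if key in selector.tags:
            return text
    # Second pass: no tag matched, so compare keys to the message's value.
    for key, text in variants.items():
        if key == selector.value:
            return text
    # Nothing matched: use the default variant.
    return variants[default]


variants = {
    "Firefox": "matched by value",
    "masculine": "matched by tag",
    "other": "fallback",
}

tagged = Message("Firefox", frozenset({"masculine"}))
untagged = Message("Firefox")

print(select_smart(tagged, variants, default="other"))
print(select_smart(untagged, variants, default="other"))
```

With this ordering, adding a tag to a message later changes which variant is chosen even when a value-based variant is listed first, which is the kind of subtlety the discussion is weighing.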
The last proposal is my concern. If we end up having a use case, and if that use case ends up being more common than this one, we'll end up with an API that makes the wrong thing easy. If we try to have a smart API (check for tags, check for values), then it sounds like it'll work well. I assume we won't allow attributes and tags on the same Message, right? |
I see what you mean. I think we could make the no-syntax variant be a smart one, and then expose
+1
Yes, correct. The rationale is that messages with tags are supposed to be interpolated into other messages. If they need to be displayed in the UI which requires an attribute, a new message can be created for that purpose and it can reference the message with tags. |
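The restriction discussed here (tags only on messages that have a value and no attributes) can be pictured as a small validation step. The Python sketch below is an assumption-laden illustration: the `Message` fields, the `validate_tags` function and the error strings are all invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    """Hypothetical model of a message with optional value, tags, attributes."""
    value: Optional[str] = None
    tags: frozenset = frozenset()
    attributes: dict = field(default_factory=dict)


def validate_tags(message: Message) -> list:
    """Return a list of rule violations; an empty list means the message is fine."""
    errors = []
    if message.tags and message.value is None:
        errors.append("tags require a value")
    if message.tags and message.attributes:
        errors.append("tags and attributes cannot be combined")
    return errors


# A tagged message meant to be interpolated into other messages.
brand_name = Message(value="Firefox", tags=frozenset({"masculine"}))

# A separate, untagged message carries the attribute needed by the UI.
# In a real file its value would reference brand_name; here it is a placeholder.
brand_label = Message(value="placeholder referencing brand_name",
                      attributes={"title": "placeholder title"})

# Combining both on one message breaks the rule.
invalid = Message(value="Firefox", tags=frozenset({"masculine"}),
                  attributes={"title": "placeholder title"})

print(validate_tags(brand_name))
print(validate_tags(brand_label))
print(validate_tags(invalid))
```

The second message mirrors the workaround described above: the UI-facing message holds the attribute and refers to the tagged one, so neither needs both.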
Goal
Provide a simple means for defining private meta-data for messages.
Description
Currently, meta-data can be added to messages by using traits. Traits without namespaces are considered private.
#5 and #6 will simplify traits and we'll need a new way to encode meta-data.
The proposal is to use binary tags attached to the value:
The benefit of the binary approach is that there's usually no need to name the property in question (gender).
Discussion
https://groups.google.com/forum/#!topic/mozilla.tools.l10n/dhWfBXHzuZI