[Discussion] {{Spannables}} #537
(chair hat on) There is a proposal to use option A4 "Hash and Slash" as the design, based on a tepid "lack of opposition" consensus in the 2023-11-20 call, so perhaps pay close attention to this option. The foregoing is not a finding of consensus. Our goal will be to choose an approach in the 2023-12-04 call. The more and better we discuss ahead of that, the better. |
I'm ok with the currently proposed syntax. My personal preference order for these choices is:
I don't have a strong stance on standalone markup, except to note that it's much less common in practice than open-close pairs, and that its use cases can be accounted for by either a purpose-built As far as I can tell, the only place where using the same syntax for open & standalone adds some friction is for source message validators that do not access a registry and which do want to require open-close pairing within each single message. In all other cases, we can rely on the registry, the source message, or the implementation to tell us whether the element is open or standalone. So for me the cost-benefit analysis of standalone markup makes it a pretty expensive addition providing rather little gain. |
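The friction described above for registry-free validators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PLACEHOLDER` pattern is a simplified stand-in for the real grammar (`{#name ...}` opens, `{/name ...}` closes, options are ignored), and `unpaired_markup` is a hypothetical helper, not part of any MessageFormat implementation.

```python
import re

# Assumption: simplified placeholder shape, not the actual MF2 grammar.
PLACEHOLDER = re.compile(r"\{([#/])([A-Za-z_][\w.-]*)[^{}]*\}")


def unpaired_markup(message: str) -> list[str]:
    """Report opens without a close and closes without a matching open."""
    stack: list[str] = []
    problems: list[str] = []
    for match in PLACEHOLDER.finditer(message):
        sigil, name = match.group(1), match.group(2)
        if sigil == "#":
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected close: {name}")
    # Anything still on the stack was opened but never closed.
    problems.extend(f"unclosed open: {name}" for name in stack)
    return problems
```

With no registry to consult, such a validator cannot tell an intentionally standalone `{#br}` from a forgotten close, so it flags both the same way.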
@eemeli mentions:
Note that this "works" in our current syntax without any changes, since nothing prevents the literal part of a pattern from containing markup. However, the markup doesn't participate in formatting in any way. Making it participate in formatting would require recognizing sigils I generally agree with your other comments.
I think it would be useful to add an example, such as "If a tool preparing a message for translation adds XLIFF around placeholders, it might need to know whether the placeholder is paired or not, as this affects which tags are generated, even if the tool doesn't know the tag set being marked up" |
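A rough sketch of that situation, assuming the same simplified placeholder shape (`{#name}` opens, `{/name}` closes) and a tool that knows nothing about the tag set. The element names `sc`, `ec`, and `ph` are XLIFF 2.0 inline codes (start code, end code, standalone placeholder); the `xliff_kinds` helper and its guessing heuristic are hypothetical.

```python
import re

# Assumption: simplified placeholder shape, not the actual MF2 grammar.
PLACEHOLDER = re.compile(r"\{([#/])([A-Za-z_][\w.-]*)[^{}]*\}")


def xliff_kinds(message: str) -> list[tuple[str, str]]:
    """Guess an XLIFF 2.0 inline element for each markup placeholder."""
    tokens = [(m.group(1), m.group(2)) for m in PLACEHOLDER.finditer(message)]
    kinds: list[tuple[str, str]] = []
    for index, (sigil, name) in enumerate(tokens):
        if sigil == "/":
            kinds.append((name, "ec"))  # close becomes an end code
        elif ("/", name) in tokens[index + 1:]:
            kinds.append((name, "sc"))  # open with a later close: start code
        else:
            kinds.append((name, "ph"))  # open with no close: guessed standalone
    return kinds
```

The last branch is the guess in question: without a distinct standalone syntax, the tool only infers "standalone" from the absence of a close in this one message.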
Some observations. In the design doc, we currently have:
markup-open = "#" name ; should be identifier
markup-close = "/" name ; should be identifier
The primary "disagreement" we have is about the fate of standalone. Hash-and-slash allows open (or close) placeholders to appear unpaired and @eemeli proposes that we just let |
The proposed “hash and slash” solution is acceptable to me precisely due to making room for standalone syntax which doesn’t need a third sigil. So it’s not “just one more sigil” for me; it’s still three. #542 proposed three solutions to how we can support standalone markup without adding another sigil. |
(thinking out loud) Looking at the use cases in the spannables design this morning with an eye towards the discussion about requirements for the selected design, I see a class of cases where what translators want is:
That is, translators want tools to produce XLIFF's placeholders) for them. We could code that in our syntax, I suppose:
This has the benefit that it allows unpaired open or close code while allowing validation that the translation tooling markup is paired and syntactically correct. Formatting to parts can produce single-pass non-reparsed results. This is different from what developers want, since it is a PITA to type and difficult to look at--and adds no value to developers (except the deferred benefit of non-borken translations). CAT tools have to process messages anyway and would be better at inserting and removing (and maintaining) this protection than developers. Some developers won't mind learning a message-specific variation on their code syntax and will want direct participation in rendering (that is, single-step format-and-process). This is mostly what we've been talking about as spannables. The above example could then look like this (using @stasm's #/ markup for standalone):
This doesn't quite satisfy what translators want, since it loses a number of checks they'd like to have (and which they get from raw XLIFF processing of HTML or other markup languages). Specifically, the open and close can get out of order without producing an error. To that end, we might want to introduce non-option expression attributes to help tooling, e.g.:
|
I spy with my little eye another concern that we've somewhat implicitly chosen not to address in the 2.0 release: sub-flows, to use the XLIFF term. In essence, in a message
the As for the message in question, my expectation would be that in the real world it ends up either as
or as
In the first case, the developer is formatting to a string and just presumes that HTML will be fine, and that translators will know how to deal with localizable attributes. XSS is a concern that's dealt with Elsewhere™. In the second case, the developer is formatting to parts and separately merging in the In neither case do I believe that strings which may include HTML will use an With the latter case, the "MF2-awareness" of the tools may well be encoded in an MF2-XLIFF transformer, so the translator's view of this string could be something completely different. And for localizable attributes, it might even be able to extract the sub-flow from the parent message. |
One could solve that using a I am thinking that we should keep our eyes on the XLIFF transform. Curious what you think about using attributes here.
I'm also curious why you think so? A namespace would make visible the type of markup to tooling as well as to the formatter runtime. I know you're mostly thinking about the case in which a data model or "format-to" part is handled by the formatter's caller (rather than as part of formatting), but even there I can see how users will want to plug-in and differentiate different markup regimes. Having a namespace prefix tells me if |
If we are comfortable requiring that a single namespace be used for the spannables, that could be in the 'preface' section: In pseudocode: .namespace=html5 or -namespace=html5 scope=spannables |
Also, I don't think we need the id=x. The only case where that would be necessary is with 2 identically named items. But even there, I don't think the tooling needs anything. The IDs can be purely internal, derived from the original message: x{#b stuff1}y{/b stuff2}z{#b stuff3}w{/b stuff4} The tooling would require that the a/b pairs be in order in the translation, but the id numbers can occur in any order. |
Are you thinking of some custom sub-syntax-formatter function? Like this:
With the way we're now going, that'll be a pretty likely outcome.
Can you clarify which attributes you're thinking of here?
Because in most cases it's not necessary as systems which may include HTML in their messages will only use HTML for markup. And when something else is needed as well, then namespaces like Developers are lazy, and they'll go with |
The point of That is, I'm thinking about the problem "how do we enable CAT tools to generate the XLIFF markup the developer intends?" while simultaneously letting developers put markup into messages.
Maybe even less specific than that:
Yes, that's true. But we should keep an eye out to enabling (not requiring) ways to do more complex things. I've been including namespacing in examples not because I don't think folks will use I also remain concerned about "two syntaxes in the same message"--I have multiple examples of places where this has bothered me in the past. |
That won't work, because the implicit (custom)
The latter is what I intended to communicate.
Indeed. Which is why I started to wonder whether we should effectively reserve enough space in the syntax for a |
In HTML, the lack of syntactic distinction between "open" and "standalone" causes problems and hardcoded lists of elements that can be one or the other. Let's not start a new standard with these problems and hacks. I don't feel strongly about the particular syntax, whether Do I understand correctly that "markup" is not going to be in the registry? That makes me nervous. It seems like different organizations will invent different sets of things and how to process them, making messages with markup not-interoperable. |
This is the discussion thread for spannables. Keeping it open in spite of merging the design doc. |
I intend to close this thread after the 2024-01-15 call. |
Per the 2023-11-27 teleconference, this issue is for discussing the design of spannables (also known as open/close/standalone markup).
The design document lives here and should be used as a reference in this discussion.