Integrate small HTML subset into fluent syntax #237
Comments
Interesting idea, though I guess the small subset is the interesting part to nail down. The way that would make sense to me is a small subset of SGML or XML parsing, so something that's completely opaque to tag names. I could see a benefit from doing I'd go as far as to say that we should only allow start and end tags in the same pattern. Like, currently, one can do

    omg = This is <b>{ $num ->
       *[other] bad</b>, <em>I
    }{ $num ->
        [one] guess this would be bad?
       *[other]</em> think
    }. Don't you?

The exact semantics depend on a few questions, I think:
If all of these questions had "No" as an answer, this could quite nicely be implemented on the
Making sure @zbraniecki and @spookylukey are on this.
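The rule of only allowing start and end tags in the same pattern can be illustrated with a small check. This is a minimal sketch, not part of any Fluent implementation: the `Select` class, the `Pattern` alias and `is_balanced` are hypothetical names, a pattern is modelled as a plain list of text pieces and select expressions, and only bare `<name>` and `</name>` tags are recognised.

```python
import re
from dataclasses import dataclass
from typing import List, Union

# Matches <name> and </name>; tag names are treated as opaque identifiers.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9-]*)>")


@dataclass
class Select:
    """Hypothetical stand-in for a select expression: one pattern per variant."""
    variants: List["Pattern"]


# Hypothetical model of a pattern: text pieces interleaved with selects.
Pattern = List[Union[str, Select]]


def is_balanced(pattern: Pattern) -> bool:
    """True if every tag opened in this pattern is closed in this same pattern,
    and the same holds for each variant pattern nested inside it."""
    stack: List[str] = []
    for element in pattern:
        if isinstance(element, Select):
            # Each variant is its own pattern and must balance by itself.
            if not all(is_balanced(variant) for variant in element.variants):
                return False
            continue
        for closing, name in TAG_RE.findall(element):
            if not closing:
                stack.append(name)
            elif not stack or stack.pop() != name:
                return False
    return not stack


# The message from the comment above: tags open and close across variants.
bad: Pattern = [
    "This is <b>",
    Select([[" bad</b>, <em>I"]]),
    Select([["guess this would be bad?"], ["</em> think"]]),
    ". Don't you?",
]

# A message whose tags stay within one pattern each.
good: Pattern = [
    "This is <b>",
    Select([["one thing"], ["<em>many</em> things"]]),
    "</b>.",
]
```

Under these assumptions `is_balanced(bad)` is false and `is_balanced(good)` is true, because a variant may contain markup only if it opens and closes it itself.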
What I had in mind by small subset was:
So essentially some XML/JSX-like syntax, completely opaque to tag names so that it plays nicely with the React Overlays feature. The parser should make sure that the nesting is well-formed. I am not sure whether attributes should support nesting of other elements. It might be a good idea if the target is non-browser component systems, such as React, but it might be a bad idea when the target is the web.
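As a rough illustration of a parser that is opaque to tag names and enforces well-formed nesting, here is a minimal sketch. `Element` and `parse_markup` are hypothetical names, and attributes are deliberately left out since their design is the open question above; anything that is not a bare `<name>` or `</name>` tag is kept as text.

```python
import re
from dataclasses import dataclass, field
from typing import List, Union

# Bare start and end tags only; the parser never interprets the tag name.
TOKEN_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)>")


@dataclass
class Element:
    """Hypothetical tree node: an opaque tag name plus text/element children."""
    name: str
    children: List[Union[str, "Element"]] = field(default_factory=list)


def parse_markup(text: str) -> List[Union[str, Element]]:
    """Parse text into a tree, raising ValueError if nesting is not well-formed."""
    root = Element("#root")
    stack = [root]
    pos = 0
    for match in TOKEN_RE.finditer(text):
        if match.start() > pos:
            # Plain text between the previous tag and this one.
            stack[-1].children.append(text[pos:match.start()])
        closing, name = match.groups()
        if not closing:
            node = Element(name)
            stack[-1].children.append(node)
            stack.append(node)
        else:
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unexpected closing tag </{name}>")
            stack.pop()
        pos = match.end()
    if pos < len(text):
        stack[-1].children.append(text[pos:])
    if len(stack) > 1:
        raise ValueError(f"unclosed tag <{stack[-1].name}>")
    return root.children
```

A consumer such as a component system could then map each `Element.name` to whatever it likes (a DOM element, a React component, a native view) without the parser knowing about any of them.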
Thanks @Pike for letting me know about this. I'm using Fluent in two main contexts - django-ftl and elm-fluent (both of which I wrote/am writing).
and nothing funny should happen to it - it is just normal text. That text should be escaped if we happen to be using it in an HTML context, but that is not the business of Fluent.

On the other hand, I sometimes want to output HTML, and in those cases I need access to the full range of possible HTML constructs, not a limited subset. For me, I imagine having a limited amount of builtin HTML support would most likely just make these two things more complicated.

For django-ftl, to support these two types of output, I'm using my own branch of python-fluent, using a more 'escaping' mechanism (see django-ftl docs, outdated PR, discussion).

For elm-fluent, I'm using a very different strategy. This package compiles FTL to Elm files, so it may have some relevance for your intl-codegen work @Swatinem. Like for django-ftl, messages are assumed to be plain text by default, and need

Doing this involves elm-fluent being able to parse the messages as HTML after they have been parsed as FTL, but before rendering out. This is implemented here for elm-fluent, and it is a bit tricky/hacky, but it does work.

Overview for this method: For messages that are marked as HTML, we take the FTL

This relies on some things, e.g. well-formedness and matching opening and closing tags within placeables. Since I'm doing this at compile time in elm-fluent, there isn't a runtime overhead associated with HTML parsing etc. The output of the code I linked is not the final rendered message, but Elm code that will generate the final rendered message.

There is obviously a bit more to it than that. I don't have time to write more now, but I'll happily answer any questions about it.
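To make the idea of parsing a message as HTML after it has been parsed as FTL more concrete, here is one possible sketch. It is an assumption for illustration only and does not describe elm-fluent's actual code: the part representation, `html_events` and `_Collector` are hypothetical, placeables are swapped for private-use sentinel characters before the text goes through Python's standard `html.parser`, and placeables inside attribute values are not handled.

```python
import re
from html.parser import HTMLParser
from typing import List, Tuple, Union

# Hypothetical pattern part: literal text, or ("var", name) for a placeable.
Part = Union[str, Tuple[str, str]]

# Sentinel wrapping a placeable index, built from private-use code points.
SENTINEL_RE = re.compile("\uE000(\\d+)\uE001")


class _Collector(HTMLParser):
    """Records start/end/text events and restores placeables found in text."""

    def __init__(self, placeables: List[str]) -> None:
        super().__init__(convert_charrefs=True)
        self.placeables = placeables
        self.events: List[tuple] = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        pos = 0
        for match in SENTINEL_RE.finditer(data):
            if match.start() > pos:
                self.events.append(("text", data[pos:match.start()]))
            index = int(match.group(1))
            self.events.append(("placeable", self.placeables[index]))
            pos = match.end()
        if pos < len(data):
            self.events.append(("text", data[pos:]))


def html_events(parts: List[Part]) -> List[tuple]:
    """Flatten parts to one string, parse it as HTML, and return the events."""
    placeables: List[str] = []
    chunks: List[str] = []
    for part in parts:
        if isinstance(part, str):
            chunks.append(part)
        else:
            chunks.append(f"\uE000{len(placeables)}\uE001")
            placeables.append(part[1])
    collector = _Collector(placeables)
    collector.feed("".join(chunks))
    collector.close()  # flush any buffered trailing text
    return collector.events
```

A compile-time tool could walk such an event list and emit target code (Elm, JavaScript, and so on) instead of rendering a string, so the HTML parsing cost is paid once at build time.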
Well I had something similar in mind: parse the string with both fluent / messageformat (as it currently stands, I only support messageformat in intl-codegen, but I would love to support both syntaxes) and an HTML parser, and then somehow combine both ASTs into one… And since I want to do all this at compile time anyway, the overhead would be minimal.

Apart from the implementation details themselves, I think another big benefit would be that you have one official documentation on how to write your fluent files, instead of diverging and incompatible implementations in

I already see this problem with messageformat, where each library (such as my ideas for intl-codegen 2) bolts on its own extensions to the syntax, so the syntax the translators can actually use depends on the library that you use in your code, which gets even worse when you use different implementations in different parts of your code (server, web, native mobile).
To add briefly to my last comment - an FTL parser that was able to ensure the HTML well-formedness rules that Pike mentioned would be useful for all my use cases too, but only optionally, because of the need for plain text cases, and so it feels like doing this at the Fluent grammar level would be very complicated.
Fluent has escapes/string literals for that, like
First off, I really love the fluent syntax so far (even though I have not used it in production yet) compared to MessageFormat.
I also really like the idea of DOM Overlays, but I would like to integrate this more deeply into the syntax itself.
But IMO the way that React Overlays are currently implemented is a bit strange, since it parses the final translated message on each usage of <Localized/>.
I think having a small subset of HTML integrated into the syntax can also make this a lot easier to integrate into component systems that are not browser based, or other use cases such as pre-compilation (I am the maintainer of intl-codegen, for which I would love to integrate a DOM Overlay-like feature).
Maybe related to #96 or #175?