Clarity about embedded HTML and escaping #96
Comments
Hi! Thanks for the writeup! Before we dive in - are you familiar with DOM Overlays? Here's the documentation of the first version - https://github.com/projectfluent/fluent.js/wiki/DOM-Overlays - and today we released v2. DOM Overlays is how we approach DOM Fragment localization with safety and flexibility. Version 0.2 adds the ability for developers to provide elements in the source HTML that get merged with the translation. We'll have to document the new features in v2 :)
Thanks so much for that link, I think I had seen it before but had forgotten about it. I'm still at the stage of investigating Fluent and seeing whether it fits my needs. I'm currently not thinking of using fluent-dom, because I've got use cases where it won't work (e.g. plain text emails), and because in some cases I really want server-side rendering (for all the usual reasons). I guess fluent-dom may be the way to go in some cases though, or I might need to implement similar functionality if I were to go with server-side rendering. I have questions like - what happens if I'm generating a plain text email and there is a message like
Yep, semantic comments are meant to help with that!
Some examples use HTML snippets in the message, e.g. http://projectfluent.org/fluent/guide/text.html
The question then is what happens when this is used. I would not expect Fluent to do any HTML escaping. It is therefore up to the bindings to always HTML-escape the entire returned string when it is inserted into the DOM (client-side) or into a chunk of HTML (server-side). If the message contains any interpolated user-supplied input, this is vital for correctness and security (XSS etc.), but in any case we should not be expecting translators to know HTML syntax and manually escape ampersands etc.
However, with the above message, the HTML tags would end up escaped as `&lt;i&gt;HTML5&lt;/i&gt;`, which would be rendered as the literal text `<i>HTML5</i>` rather than as HTML5 in italics - this is not what the example implies to me.
Looking around in this repo, it seems the current consensus agrees with what I've outlined above (see projectfluent/play#2 for example), and therefore it is the examples that are misleading or confusing.
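To make the escaping behaviour described above concrete, here is a minimal Python sketch. It is not part of any Fluent binding: `render_into_html` is a hypothetical helper, and the message text is made up for illustration. It only assumes that the binding receives an already formatted string and escapes all of it with the standard library's `html.escape`.

```python
from html import escape


def render_into_html(message: str) -> str:
    """Hypothetical binding helper: escape the whole formatted message
    before it is placed into a chunk of HTML."""
    return escape(message)


# Made-up message text containing markup, standing in for the guide's example.
formatted = "Fluent works well with <i>HTML5</i> & friends"
safe = render_into_html(formatted)

# The tags are now entities, so a browser shows them as literal text
# instead of rendering HTML5 in italics.
print(safe)
```

With this approach the translator never has to escape ampersands by hand, but any markup written in the message is shown verbatim rather than interpreted.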
This leaves the problem of what happens when a translated string actually needs to embed HTML. This seems to be one solution: #16 (comment). A more lightweight but less robust solution I had been thinking about was a naming convention (e.g. any message id that ends in `-html` is treated as HTML, anything else is not).
It is vital for this to be really well defined (and simple to implement), otherwise you end up with XSS, or double escaping, or being unable to embed HTML in translated messages. I'm considering an implementation in Elm, and the only practical way it would work would be to compile FTL messages to Elm functions. For this to work, we'd need to know for every message what type of output (text or HTML) it returns, so that the function can have the correct type signature. I'm also considering a Python implementation that would integrate into a Django project, and we'd again need to know very explicitly whether something returns HTML or plain text.
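As a rough illustration of the `-html` naming convention and of why the output type needs to be known per message, here is a small Python sketch. Everything in it is an assumption made for the example: `Html`, `format_message` and `to_html` are hypothetical names, and `str.format` merely stands in for real Fluent formatting.

```python
from dataclasses import dataclass
from html import escape
from typing import Union


@dataclass(frozen=True)
class Html:
    """Hypothetical marker for a string that is already safe to insert into HTML."""
    markup: str


def format_message(message_id: str, pattern: str, **args: object) -> Union[str, Html]:
    # str.format stands in for real Fluent formatting (an assumption for brevity).
    if message_id.endswith("-html"):
        # The translator wrote the markup; only interpolated values get escaped.
        escaped = {name: escape(str(value)) for name, value in args.items()}
        return Html(pattern.format(**escaped))
    return pattern.format(**args)


def to_html(value: Union[str, Html]) -> str:
    # Plain text is escaped exactly once; Html is passed through untouched.
    return value.markup if isinstance(value, Html) else escape(value)


rich = format_message("welcome-html", "Welcome, <b>{name}</b>!", name="<script>")
plain = format_message("welcome", "Welcome, {name}!", name="Tom & Jerry")

print(to_html(rich))   # markup kept, interpolated value escaped
print(to_html(plain))  # whole string escaped for HTML output
print(plain)           # unescaped, suitable for a plain text email
```

Because the return type differs by message id, double escaping and missed escaping both become type-level mistakes rather than runtime surprises, which is the property a compiled Elm or a Django integration would rely on.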