Clarity about embedded HTML and escaping #96

spookylukey · 2018-04-12T06:46:14Z

Some examples use HTML snippets in the message e.g. http://projectfluent.org/fluent/guide/text.html

description =
    Loki is a simple micro-blogging
    app written entirely in <i>HTML5</i>.
    It uses FTL to implement localization.

The question then is what happens when this is used. I would expect fluent to not do any HTML escaping. It is therefore up to the bindings to always HTML-escape the entire returned string when it is inserted into the DOM (client-side) or into a chunk of HTML (server-side). If the message contains any interpolated user-supplied input, this is vital for correctness and security (XSS etc.), but in any case we should not expect translators to know HTML syntax and manually escape ampersands etc.

However, with the above message, the HTML tags would end up as &lt;i&gt;HTML5&lt;/i&gt;, which would be rendered as the literal text <i>HTML5</i> rather than as HTML5 in italics - this is not what the example implies to me.
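A minimal Python sketch of the escaping behaviour described above, assuming the binding escapes the whole formatted string with the standard library's html.escape. The variable names are illustrative and not part of any Fluent API.

```python
import html

# The formatted message as the translator wrote it in the FTL file
# (line breaks collapsed for brevity).
formatted = "Loki is a simple micro-blogging app written entirely in <i>HTML5</i>."

# Assumption: the binding escapes the entire string before inserting it into HTML.
escaped = html.escape(formatted)

# The <i> tags are now entities, so a browser shows them as literal text.
print(escaped)
```

With this approach the markup in the example can never reach the page as markup, which is why the guide example reads as misleading.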

Looking around in this repo, it seems the current consensus is in agreement with what I've outlined above (see projectfluent/play#2 for example), and therefore it is the examples that are misleading/confusing.

This leaves the problem of what happens when a translated string actually needs to embed HTML. This seems to be one solution: #16 (comment). A more lightweight but less robust solution I had been thinking about was a naming convention (e.g. any message id that ends in -html is treated as HTML, anything else is not).
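The naming convention could look roughly like this in Python. This is only a sketch: render_for_html is a hypothetical helper, the -html suffix is the convention floated above rather than anything Fluent defines, and str.format placeholders stand in for Fluent placeables.

```python
import html


def render_for_html(message_id: str, pattern: str, **args: str) -> str:
    """Return a string that is safe to insert into an HTML document.

    `pattern` uses str.format placeholders as a stand-in for Fluent placeables.
    """
    if message_id.endswith("-html"):
        # Hypothetical convention: the translator wrote markup on purpose,
        # so only the interpolated arguments are escaped.
        safe_args = {name: html.escape(value) for name, value in args.items()}
        return pattern.format(**safe_args)
    # Plain-text message: interpolate raw values, then escape the result once.
    return html.escape(pattern.format(**args))


print(render_for_html("welcome-html", "<b>Hello, {name}</b>", name="<script>"))
print(render_for_html("welcome", "Tom & {name}", name="<Jerry>"))
```

Either branch escapes user-supplied values exactly once, which avoids both XSS and double escaping. The weakness is that a misnamed message id silently changes the behaviour.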

It is vital for this to be really well defined (and simple to implement), otherwise you end up with XSS, or double escaping, or being unable to embed HTML in translated messages. I'm considering an implementation in Elm, and the only practical way it would work would be to compile FTL messages to Elm functions. For this to work, we'd need to know for every message what type of output (text/HTML) it was returning so that it can have the correct type signature. I'm also considering a Python implementation that would integrate into a Django project, and we'd again need to know very explicitly whether something is returning HTML or plain text.
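To illustrate why the output type needs to be known per message, here is a small Python sketch with two distinct result types. PlainText, Html and to_html are hypothetical names for what a compiler from FTL might emit. They are not part of Fluent or Django, although the idea is similar in spirit to Django's mark_safe.

```python
import html
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PlainText:
    """Output of a message known to contain no markup."""
    value: str


@dataclass(frozen=True)
class Html:
    """Output of a message known to contain trusted, already-escaped markup."""
    value: str


def to_html(output: Union[PlainText, Html]) -> str:
    """Convert any message output to markup, escaping plain text exactly once."""
    if isinstance(output, Html):
        return output.value
    return html.escape(output.value)


print(to_html(PlainText("a < b")))
print(to_html(Html("<i>HTML5</i>")))
```

A plain-text email renderer could then accept only PlainText, so passing an Html message to it would be flagged by a type checker instead of leaking tags into the email.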


zbraniecki · 2018-04-12T06:56:34Z

Hi! Thanks for the writeup! Before we dive in - are you familiar with DOM Overlays?

Here's documentation of the first version - https://github.com/projectfluent/fluent.js/wiki/DOM-Overlays and today we released v2 in fluent-dom 0.2.0.

DOM Overlays is how we approach DOM Fragment localization with safety and flexibility. Version 0.2 adds the ability for developers to provide elements in the source HTML that get merged with the translation. We'll have to document the new features in v2 :)

spookylukey · 2018-04-12T11:07:05Z

Thanks so much for that link, I think I had seen it before but had forgotten about it. I'm still at the stage of investigating fluent and seeing whether it fits my needs. I'm currently not thinking of using fluent-dom, because I've got use cases where it won't work (e.g. plain text emails), and because in some cases I really want server-side rendering (for all the usual reasons).

I guess fluent-dom may be the way to go in some cases though, or I might need to implement similar functionality if I were to go with server-side rendering. I have questions like - what happens if I'm generating a plain text email and there is a message like Mme { $surname }? It feels like there needs to be a way to communicate the context to the translator so that this kind of thing can be avoided (as per the proposal in #16).

zbraniecki · 2018-04-19T23:32:09Z

yep, semantic comments are meant to help with that!

stasm added the docs label Oct 16, 2018

Swatinem mentioned this issue Feb 10, 2019

Integrate small HTML subset into fluent syntax #237

Open

rugk mentioned this issue Jan 28, 2020

Do not double-encode HTML in i18n PrivateBin/PrivateBin#560

Merged

2 tasks
