|
| 1 | +Escaping and markup |
| 2 | +------------------- |
| 3 | + |
| 4 | +In some cases it is common to have other kinds of markup mixed into |
| 5 | +translatable text, especially for things like HTML/web outputs. Handling these |
| 6 | +requires extra functionality to ensure that everything is escaped properly, |
| 7 | +especially external arguments that are passed in. |
| 8 | + |
| 9 | +For example, suppose you need embedded HTML in your translated text:: |
| 10 | + |
| 11 | + happy-birthday = |
| 12 | + Hello { $name }, <b>happy birthday!</b> |
| 13 | + |
| 14 | +In this situation, it is important that ``$name`` is HTML-escaped. The rest of |
| 15 | +the text needs to be treated as already escaped (i.e. it is HTML markup), so |
| 16 | +that ``<b>`` is not changed to ``&lt;b&gt;``. |
| 17 | + |
| 18 | +python-fluent supports this use case by allowing a list of ``escapers`` to be |
| 19 | +passed to the ``FluentBundle`` constructor: |
| 20 | + |
| 21 | +.. code-block:: python |
| 22 | +
|
| 23 | + bundle = FluentBundle(['en'], escapers=[my_escaper]) |
| 24 | +
|
| 25 | +An ``escaper`` is an object that defines the following set of attributes. The |
| 26 | +object could be a module, or a simple namespace object you could create using |
| 27 | +``types.SimpleNamespace`` (or ``fluent.runtime.utils.SimpleNamespace`` on Python 2), or |
| 28 | +an instance of a class with appropriate methods defined. The attributes are: |
| 29 | + |
| 30 | +- ``name`` - a simple text value that is used in error messages. |
| 31 | + |
| 32 | +- ``select(**hints)`` |
| 33 | + |
| 34 | + A callable that is used to decide whether or not to use this escaper for a |
| 35 | + given message (or message attribute). It is passed a number of hints as |
| 36 | + keyword arguments, currently only the following: |
| 37 | + |
| 38 | + - ``message_id`` - a string that is the name of the message or term. For terms |
| 39 | + it is a string with a leading dash - e.g. ``-brand-name``. For message |
| 40 | + attributes, it is a string in the form ``message-name.attribute-name``. |
| 41 | + |
| 42 | +More hints will probably be passed in the future (for example, comments |
| 43 | +attached to the message), so for future compatibility this callable should use |
| 44 | +the ``**hints`` syntax to collect remaining keyword arguments. |
| 45 | + |
| 46 | +The callable should return ``True`` if the escaper should be used for that |
| 47 | +message, ``False`` otherwise. For every message and message attribute, the |
| 48 | +``select`` callable of each escaper in the list of escapers is tried in turn, |
| 49 | +and the first to return ``True`` is used. |
| 50 | + |
| 51 | +- ``output_type`` - the type of values that are returned by ``escape``, |
| 52 | + ``mark_escaped``, and ``join``, and therefore by the whole message. |
| 53 | + |
| 54 | +- ``escape(text_to_be_escaped)`` |
| 55 | + |
| 56 | + A callable that will escape the passed in text. It must return a value that is |
| 57 | + an instance of ``output_type`` (or a subclass). |
| 58 | + |
| 59 | + ``escape`` must also be able to handle values that have already been escaped |
| 60 | + without escaping a second time. |
| 61 | + |
| 62 | +- ``mark_escaped(markup)`` |
| 63 | + |
| 64 | + A callable that marks the passed in text as markup i.e. already escaped. It |
| 65 | + must return a value that is an instance of ``output_type`` (or a subclass). |
| 66 | + |
| 67 | +- ``join(parts)`` |
| 68 | + |
| 69 | + A callable that accepts an iterable of components, each of type |
| 70 | + ``output_type``, and combines them into a larger value of the same type. |
| 71 | + |
| 72 | +- ``use_isolating`` |
| 73 | + |
| 74 | + A boolean that determines whether the normal bidi isolating characters should |
| 75 | + be inserted. If it is ``None`` the value from the ``FluentBundle`` will be |
| 76 | + used, otherwise use ``True`` or ``False`` to override. |
| 77 | + |
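The selection rule described for ``select`` (each escaper is tried in turn and the first to return ``True`` is used) can be sketched as follows. This is only an illustration of the documented behaviour, not the library's internal code: ``pick_escaper``, ``markdown_escaper`` and the ``-md`` suffix are hypothetical names introduced for the example, and the escapers here define only the attributes needed for selection.

```python
from types import SimpleNamespace

# Two illustrative escapers; only the attributes needed for selection are set.
html_escaper = SimpleNamespace(
    name='html_escaper',
    select=lambda message_id=None, **hints: message_id.endswith('-html'),
)
markdown_escaper = SimpleNamespace(  # hypothetical second escaper
    name='markdown_escaper',
    select=lambda message_id=None, **hints: message_id.endswith('-md'),
)


def pick_escaper(escapers, message_id):
    """Hypothetical helper: return the first escaper whose select() is true."""
    for escaper in escapers:
        # Hints are passed as keyword arguments, as described above.
        if escaper.select(message_id=message_id):
            return escaper
    return None  # no escaper applies to this message


escapers = [html_escaper, markdown_escaper]
chosen = pick_escaper(escapers, 'welcome-html')
```

Because the first match wins, the order of the list passed as ``escapers`` matters when more than one ``select`` callable could accept the same message ID.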
| 78 | +The escaping functions need to obey some rules: |
| 79 | + |
| 80 | +- ``escape`` must be idempotent: |
| 81 | + |
| 82 | + ``escape(escape(text)) == escape(text)`` |
| 83 | + |
| 84 | +- ``escape`` must be a no-op on the output of ``mark_escaped``: |
| 85 | + |
| 86 | + ``escape(mark_escaped(text)) == mark_escaped(text)`` |
| 87 | + |
| 88 | +- ``mark_escaped`` should be distributive with string |
| 89 | + concatenation: |
| 90 | + |
| 91 | + ``join([mark_escaped(a), mark_escaped(b)]) == mark_escaped(a + b)`` |
| 92 | + |
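To make the three rules concrete, here is a minimal sketch that uses only the standard library. ``SafeHtml`` is a hypothetical stand-in for a markup type such as MarkupSafe's ``Markup``; it is an assumption made for this example and is not part of python-fluent.

```python
import html


class SafeHtml(str):
    """Hypothetical marker type for text that is already HTML-safe."""


def mark_escaped(markup):
    # Trust the text as markup without changing it.
    return SafeHtml(markup)


def escape(text):
    # Already-escaped values pass through untouched, so escaping twice
    # gives the same result as escaping once.
    if isinstance(text, SafeHtml):
        return text
    return SafeHtml(html.escape(text))


def join(parts):
    # Combine already-safe parts into one value of the same type.
    return SafeHtml(''.join(parts))


# Rule 1: escape is idempotent.
assert escape(escape('<b>')) == escape('<b>')
# Rule 2: escape is a no-op on the output of mark_escaped.
assert escape(mark_escaped('<b>')) == mark_escaped('<b>')
# Rule 3: mark_escaped distributes over string concatenation.
assert join([mark_escaped('<b>'), mark_escaped('hi')]) == mark_escaped('<b>' + 'hi')
```

The type check in ``escape`` is what makes the first two rules hold: a plain ``html.escape`` call on its own would escape ``&lt;`` a second time.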
| 93 | +Example |
| 94 | +~~~~~~~ |
| 95 | + |
| 96 | +This example is for |
| 97 | +`MarkupSafe <https://pypi.org/project/MarkupSafe/>`__: |
| 98 | + |
| 99 | +.. code-block:: python |
| 100 | +
|
| 101 | + from fluent.runtime.utils import SimpleNamespace |
| 102 | + from markupsafe import Markup, escape |
| 103 | +
|
| 104 | + empty_markup = Markup('') |
| 105 | +
|
| 106 | + html_escaper = SimpleNamespace( |
| 107 | + select=lambda message_id=None, **hints: message_id.endswith('-html'), |
| 108 | + output_type=Markup, |
| 109 | + mark_escaped=Markup, |
| 110 | + escape=escape, |
| 111 | + join=empty_markup.join, |
| 112 | + name='html_escaper', |
| 113 | + use_isolating=False, |
| 114 | + ) |
| 115 | +
|
| 116 | +This escaper uses the convention that message IDs that end with |
| 117 | +``-html`` are selected by this escaper. This will match |
| 118 | +``message-html``, ``message.attr-html``, and ``-term-html``, for |
| 119 | +example, but not ``message-html.attr``. |
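The matching behaviour of this convention can be checked directly. The sketch below writes the same test as a named function; ``select_html`` is a hypothetical name, and unlike the lambda above it also guards against a missing ``message_id``, which is an addition made for this example.

```python
def select_html(message_id=None, **hints):
    # Extra hints are accepted and ignored, for forward compatibility.
    return message_id is not None and message_id.endswith('-html')


# Evaluate the convention against the IDs mentioned above.
results = {
    message_id: select_html(message_id=message_id)
    for message_id in [
        'message-html',       # plain message
        'message.attr-html',  # message attribute
        '-term-html',         # term
        'message-html.attr',  # attribute name does not end in -html
    ]
}
```

Only the last ID is rejected, because the suffix test applies to the whole ``message-name.attribute-name`` string rather than to the message name alone.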
| 120 | + |
| 121 | +We have set ``use_isolating=False`` here because isolation characters |
| 122 | +can cause problems in various HTML contexts - for example: |
| 123 | + |
| 124 | +:: |
| 125 | + |
| 126 | + signup-message-html = |
| 127 | + Hello guest - please remember to |
| 128 | + <a href="{ $signup_url}">make an account.</a> |
| 129 | + |
| 130 | +Isolation characters around ``$signup_url`` will break the link. For HTML, you |
| 131 | +should instead use the `bdi element |
| 132 | +<https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdi>`__ in the FTL |
| 133 | +messages when necessary. |
| 134 | + |
| 135 | +Escaper compatibility |
| 136 | +~~~~~~~~~~~~~~~~~~~~~ |
| 137 | + |
| 138 | +When using escapers with messages that include other messages or terms, |
| 139 | +some rules apply: |
| 140 | + |
| 141 | +- A message or term with an escaper applied can include another message or term |
| 142 | + with no escaper applied (the included message will have ``escape`` called on |
| 143 | + its output). |
| 144 | + |
| 145 | +- A message with an escaper applied can include a message or term with the same |
| 146 | + escaper applied. |
| 147 | + |
| 148 | +- A message with an escaper applied cannot include a message or term with a |
| 149 | + different escaper applied - this will generate a ``TypeError`` in the list of |
| 150 | + errors returned. |
| 151 | + |
| 152 | +- A message with no escaper applied cannot include a message with an escaper |
| 153 | + applied. |
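The four rules above can be summarised as a small decision function. This is a hypothetical model written for illustration, not code from python-fluent: ``can_include`` is an invented name, ``None`` stands for "no escaper applied", escapers are compared by identity, and the case where neither side has an escaper is assumed to be allowed.

```python
def can_include(outer_escaper, inner_escaper):
    """Hypothetical model: may a message using outer_escaper include
    a message or term using inner_escaper? None means no escaper."""
    if inner_escaper is None:
        # Plain inner output is always acceptable; an outer escaper,
        # if present, escapes it.
        return True
    # An escaped inner value is only acceptable to the same escaper.
    return outer_escaper is inner_escaper


# Stand-in objects for two distinct escapers.
html_escaper = object()
markdown_escaper = object()

assert can_include(html_escaper, None)                   # rule 1
assert can_include(html_escaper, html_escaper)           # rule 2
assert not can_include(html_escaper, markdown_escaper)   # rule 3
assert not can_include(None, html_escaper)               # rule 4
```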