Implement MessageContext.format #67

Merged
merged 76 commits into from
Jan 18, 2019
a540993
Initial implementation of MessageContext
spookylukey May 11, 2018
3059ec2
Beginnings of implementing resolve
spookylukey May 12, 2018
ea68596
Beginnings of resolving external arguments
spookylukey May 12, 2018
97da936
Fixed term/message mixup
spookylukey May 12, 2018
5fd424b
format: Implemented attribute lookup
spookylukey May 13, 2018
f76c965
format: another test
spookylukey May 13, 2018
b280a9d
MessageContext: removed/changed methods that exposed implementation d…
spookylukey May 14, 2018
b5d0a2f
Avoid name clash with Python builtin ReferenceError
spookylukey May 14, 2018
b78f33f
format: Tests for missing attributes
spookylukey May 14, 2018
c77ff6f
format: support for accessing attributes directly
spookylukey May 14, 2018
89ef658
format: initial support for variant forms
spookylukey May 14, 2018
8d879ea
format: implemented select expressions
spookylukey May 14, 2018
cef0734
format: select expression with numbers
spookylukey May 14, 2018
5cb1066
format: implemented function calls
spookylukey May 14, 2018
a011eae
format: improved handling of numbers
spookylukey May 14, 2018
d38d932
utils: added 'cachedproperty' decorator
spookylukey May 14, 2018
fea2742
format: implemented plural rule forms, plus consistent handling of nu…
spookylukey May 14, 2018
8c1b02a
resolver: doc string plus better argument order
spookylukey May 14, 2018
84fca75
format: handle named/keyword arguments to functions
spookylukey May 15, 2018
5f8d8ec
format: support for Term
spookylukey May 15, 2018
5c36e5c
format: fixed handling of missing messages/terms
spookylukey May 15, 2018
0e30a76
format: bulked out some tests
spookylukey May 15, 2018
fc7ad8b
format: Bulked out tests for numbers
spookylukey May 15, 2018
0188bea
format: handling floating point numbers
spookylukey May 15, 2018
104da18
fluent: report missing variants
spookylukey May 15, 2018
bbde8e5
format: implemented NUMBER builtin, with partial application
spookylukey May 15, 2018
e9c9e66
format: test addition
spookylukey May 15, 2018
edf121b
MessageContext: added convenience add_messages_from_file
spookylukey May 15, 2018
874c999
format: made 'args' optional.
spookylukey May 15, 2018
87fc3c3
format: there is no need to support bytestrings, we keep them out at …
spookylukey May 15, 2018
46dea8b
format: Added some docs
spookylukey May 15, 2018
3f028ad
Fixed some errors in README
spookylukey May 15, 2018
c5c5246
MessageContext: removed method that did IO, as per fluent philosophy
spookylukey May 22, 2018
34bc74a
More obvious and convenient way to run tests.
spookylukey May 22, 2018
b42d5a4
format: partial support for NUMBER(currencyDisplay=)
spookylukey May 25, 2018
7f6345a
format: support for remaining NUMBER options
spookylukey May 25, 2018
9f77d70
Updated README
spookylukey May 25, 2018
712d0fb
format: cyclic reference detection
spookylukey May 26, 2018
46a6d48
format: implemented `use_isolating`
spookylukey May 27, 2018
fb91884
Changes forgotten in previous commit 46a6d48 :-(
spookylukey May 28, 2018
5f41120
tox - test against pypy and pypy3
spookylukey May 28, 2018
05e3dbc
Travis - test against pypy and pypy3
spookylukey May 28, 2018
02044b5
types: support for decimal.Decimal as a number type
spookylukey Jun 13, 2018
4d2f711
types: Fixed bug with non-existant options to FluentNumber not raisin…
spookylukey Jun 14, 2018
9343548
Decimal fixup
spookylukey Jun 14, 2018
49c8e3c
format: initial DATETIME implementation
spookylukey Jun 14, 2018
7a2780f
types: check operations on FluentNumber work as expected
spookylukey Jun 15, 2018
f77147c
Fixed some missing tests
spookylukey Jun 15, 2018
eb4fa5e
format: memory and CPU DOS protection
spookylukey Jun 15, 2018
09cdc9a
flake8 and isort fixes
spookylukey Jun 15, 2018
ab24993
MessageContext: documented `use_isolating`
spookylukey Jun 17, 2018
7c914cf
format: documented DATETIME and fluent_date
spookylukey Jun 17, 2018
4c5a0c0
Docs update
spookylukey Jun 17, 2018
313e9b7
types: rewrote FluentNumber to use same options system as FluentDateType
spookylukey Jun 17, 2018
38f7b3a
types: validation for FluentDate dateStyle and timeStyle
spookylukey Jun 17, 2018
8c4120b
Removed some dead code
spookylukey Jun 17, 2018
7311fc6
format: defined and fixed behaviour for missing message body
spookylukey Jul 6, 2018
28d2a3d
format: better handling of FluentNone/missing args
spookylukey Jul 6, 2018
e3dc515
format: better tests for isolating chars
spookylukey Jul 6, 2018
34820d2
MessageContext: added test for missing message condition
spookylukey Jun 25, 2018
cb22168
docs: more docs for custom functions
spookylukey Jul 7, 2018
7e7b907
Moved some things out of resolver.py for re-usabilty
spookylukey Jul 10, 2018
f383425
context: consistent name for functions attribute
spookylukey Jul 24, 2018
b859efb
Merge branch 'master' into implement_format
spookylukey Jul 25, 2018
8d825c1
resolver: initial fixes for spec v0.6 changes
spookylukey Jul 25, 2018
dc6406a
Fixed failing tests on Python 2.7
spookylukey Jul 27, 2018
f0f719c
Fixed unused import - flake8 warning
spookylukey Jul 27, 2018
e2cb337
Merge branch 'master' into implement_format
spookylukey Oct 30, 2018
e59724e
Fixes for Fluent syntax 0.7
spookylukey Oct 30, 2018
3945ad4
Fixed failing test
spookylukey Nov 26, 2018
084d7ce
Docs fixes/improvements as suggested by @zbraniecki
spookylukey Jan 11, 2019
a505be0
Removed unneeded MessageContext.message_ids API
spookylukey Jan 11, 2019
e44543c
MessageContext.add_messages - Don't overwrite existing items
spookylukey Jan 11, 2019
b97f537
MessageContext: cleaned up unnecessary uses of cachedproperty
spookylukey Jan 11, 2019
b600eb2
MessageContext: combined messages and terms dicts
spookylukey Jan 11, 2019
9d84579
exceptions.py -> errors.py for consistency with other Fluent implemen…
spookylukey Jan 11, 2019
2 changes: 2 additions & 0 deletions .travis.yml
@@ -4,6 +4,8 @@ python:
- "2.7"
- "3.5"
- "3.6"
- "pypy"
- "pypy3"
- "nightly"
install: pip install tox-travis
script: tox
224 changes: 224 additions & 0 deletions README.md
@@ -30,6 +30,230 @@ you're a tool author you may be interested in the formal [EBNF grammar][].
[EBNF grammar]: https://github.com/projectfluent/fluent/tree/master/spec


Installation
------------

pip install fluent

Usage
-----

To generate translations from this Python library, you start with the
`MessageContext` class:

>>> from fluent.context import MessageContext

You pass a list of locales to the constructor - the first being the desired
locale, with fallbacks after that:

>>> context = MessageContext(["en-US"])


You must then add messages. These would normally come from a `.ftl` file stored
on disk; here we will just add them directly:

>>> context.add_messages("""
... welcome = Welcome to this great app!
... greet-by-name = Hello, { $name }!
... """)

To generate translations, use the `format` method, passing a message ID and an
optional dictionary of substitution parameters. If the message ID is not
found, a `LookupError` is raised. Otherwise, as per the Fluent philosophy, the
implementation tries hard to recover from any formatting errors and generate the
most human-readable representation of the value. The `format` method therefore
returns a tuple containing `(translated string, errors)`, as below.

>>> translated, errs = context.format('welcome')
>>> translated
'Welcome to this great app!'
>>> errs
[]

>>> translated, errs = context.format('greet-by-name', {'name': 'Jane'})
>>> translated
'Hello, \u2068Jane\u2069!'

>>> translated, errs = context.format('greet-by-name', {})
>>> translated
'Hello, \u2068name\u2069!'
>>> errs
[FluentReferenceError('Unknown external: name')]
Review comment (collaborator):

It's a side note, mostly for @stasm, but every time I put a reader's hat on I see it as a flaw that we place the ID as a raw string. I'd love us to consider `{ name }` and `{ $name }` here at least.
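
The contract described in the README text above (a `LookupError` for an unknown message ID, otherwise a `(string, errors)` tuple) could be wrapped by calling code roughly as follows. This is a minimal sketch, not part of the PR: `translate` and `StubContext` are hypothetical names, and `StubContext` only stands in for `MessageContext` so that the snippet runs without the library installed.

```python
import logging

logger = logging.getLogger(__name__)


def translate(context, message_id, args=None, fallback=None):
    """Format a message, logging recoverable errors instead of raising.

    `context` is any object whose `format(message_id, args)` returns
    `(string, errors)` and raises LookupError for unknown IDs, which is
    the behaviour the README describes for MessageContext.
    """
    try:
        text, errors = context.format(message_id, args)
    except LookupError:
        # Unknown message ID: show the ID itself unless a fallback was given.
        return message_id if fallback is None else fallback
    for error in errors:
        # Formatting errors are recoverable; the text is still usable.
        logger.warning("Formatting problem in %r: %r", message_id, error)
    return text


class StubContext(object):
    """Hypothetical stand-in for MessageContext, used only to run the sketch."""

    def __init__(self, messages):
        self._messages = messages

    def format(self, message_id, args=None):
        # A missing key raises KeyError, which is a subclass of LookupError.
        return self._messages[message_id], []


stub = StubContext({"welcome": "Welcome to this great app!"})
print(translate(stub, "welcome"))
print(translate(stub, "missing-id"))
```

Whether to fall back to the message ID, an empty string, or a source-language string is an application decision; the sketch picks the ID only because it keeps the missing message visible.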


You will notice the extra characters `\u2068` and `\u2069` in the output. These
are Unicode bidi isolation characters that help to ensure that the interpolated
strings are handled correctly in the situation where the text direction of the
substitution might not match the text direction of the localized text. These
characters can be disabled if you are sure that is not possible for your app by
passing `use_isolating=False` to the `MessageContext` constructor.
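
To make the isolation characters concrete, here is a small standalone sketch of the wrapping they imply. It assumes Python 3 and is an illustration only, not the resolver's actual code; `isolate` and `strip_isolation` are hypothetical helper names.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: opens an isolated bidi run
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the run


def isolate(value, use_isolating=True):
    """Wrap an interpolated value in bidi isolation marks, if enabled."""
    if not use_isolating:
        return value
    return FSI + value + PDI


def strip_isolation(text):
    """Remove the isolation marks, e.g. before comparing strings in tests."""
    return text.replace(FSI, "").replace(PDI, "")


greeting = "Hello, " + isolate("Jane") + "!"
print(repr(greeting))
print(strip_isolation(greeting))
```

A helper like `strip_isolation` can be handy in test assertions when you want to keep isolation enabled in production but compare against plain strings.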

Python 2
--------

The above examples assume Python 3. Since Fluent uses unicode everywhere
internally (and doesn't accept bytestrings), if you are using Python 2 you will
need to make adjustments to the above example code. Either add `u` unicode
literal markers to strings or add this at the top of the module or the start of
your repl session:

from __future__ import unicode_literals


Numbers
-------

When rendering translations, Fluent passes any numeric arguments (int or float)
through locale-aware formatting functions:

>>> context.add_messages("show-total-points = You have { $points } points.")
>>> val, errs = context.format("show-total-points", {'points': 1234567})
>>> val
'You have 1,234,567 points.'


You can specify your own formatting options on the arguments passed in by
wrapping your numeric arguments with `fluent.types.fluent_number`:

>>> from fluent.types import fluent_number
>>> points = fluent_number(1234567, useGrouping=False)
>>> context.format("show-total-points", {'points': points})[0]
'You have 1234567 points.'

>>> amount = fluent_number(1234.56, style="currency", currency="USD")
>>> context.add_messages("your-balance = Your balance is { $amount }")
>>> context.format("your-balance", {'amount': amount})[0]
'Your balance is $1,234.56'

The options available are defined in the Fluent spec for
[NUMBER](https://projectfluent.org/fluent/guide/functions.html#number). Some of
these options can also be defined in the FTL files, as described in the Fluent
spec, and the options will be merged.

Date and time
-------------

Python `datetime.datetime` and `datetime.date` objects are also passed through
locale-aware functions:

>>> from datetime import date
>>> context.add_messages("today-is = Today is { $today }")
>>> val, errs = context.format("today-is", {"today": date.today() })
>>> val
'Today is Jun 16, 2018'

You can explicitly call the `DATETIME` builtin to specify options:

>>> context.add_messages('today-is = Today is { DATETIME($today, dateStyle: "short") }')

See the [DATETIME
docs](https://projectfluent.org/fluent/guide/functions.html#datetime). However,
currently the only supported options to `DATETIME` are:

* `timeZone`
* `dateStyle` and `timeStyle` which are [proposed
additions](https://github.com/tc39/proposal-ecma402-datetime-style) to the ECMA i18n spec.

To specify options from Python code, use `fluent.types.fluent_date`:

>>> from fluent.types import fluent_date
>>> today = date.today()
>>> short_today = fluent_date(today, dateStyle='short')
>>> val, errs = context.format("today-is", {"today": short_today })
>>> val
'Today is 6/17/18'

You can also specify a timezone for displaying `datetime` objects in two ways:

* Create timezone-aware `datetime` objects, and pass these to the `format` call,
e.g.:

>>> import pytz
>>> from datetime import datetime
>>> utcnow = datetime.utcnow().replace(tzinfo=pytz.utc)
>>> moscow_timezone = pytz.timezone('Europe/Moscow')
>>> now_in_moscow = utcnow.astimezone(moscow_timezone)

* Or, use timezone-naive `datetime` objects, or ones with a UTC timezone, and
pass the `timeZone` argument to `fluent_date` as a string:

>>> utcnow = datetime.utcnow()
>>> utcnow
datetime.datetime(2018, 6, 17, 12, 15, 5, 677597)

>>> context.add_messages("now-is = Now is { $now }")
>>> val, errs = context.format("now-is",
...    {"now": fluent_date(utcnow,
...                        timeZone="Europe/Moscow",
...                        dateStyle="medium",
...                        timeStyle="medium")})
>>> val
'Now is Jun 17, 2018, 3:15:05 PM'
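
The difference between naive and aware `datetime` objects that the two options above rely on can be shown with the standard library alone. The sketch below assumes Python 3 and uses a fixed UTC+3 offset as a stand-in for `Europe/Moscow` (a real application would use a proper timezone database such as the `pytz` zone shown above); it does not call Fluent at all.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime in UTC, using the timestamp from the README example.
utc_moment = datetime(2018, 6, 17, 12, 15, 5, tzinfo=timezone.utc)

# Fixed-offset stand-in for Europe/Moscow (UTC+3 on this date). Assumption:
# good enough for illustration, but it ignores historical offset changes.
moscow_offset = timezone(timedelta(hours=3), "MSK")

moment_in_moscow = utc_moment.astimezone(moscow_offset)
print(moment_in_moscow.isoformat())  # 2018-06-17T15:15:05+03:00

# A naive datetime carries no tzinfo, so a display timezone has to be
# supplied separately (the README does this with the timeZone option).
naive_moment = utc_moment.replace(tzinfo=None)
print(naive_moment.tzinfo is None)  # True
```

The converted hour (15) matches the `3:15:05 PM` shown in the README output for the same instant.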


Custom functions
----------------

You can add functions to the ones available to FTL authors by passing
a `functions` dictionary to the `MessageContext` constructor:


>>> import platform
>>> def os_name():
...     """Returns linux/mac/windows/other"""
...     return {'Linux': 'linux',
...             'Darwin': 'mac',
...             'Windows': 'windows'}.get(platform.system(), 'other')

>>> context = MessageContext(['en-US'], functions={'OS': os_name})
>>> context.add_messages("""
... welcome = { OS() ->
...    [linux]    Welcome to Linux
...    [mac]      Welcome to Mac
...    [windows]  Welcome to Windows
...   *[other]    Welcome
...   }
... """)
>>> print(context.format('welcome')[0])
Welcome to Linux

These functions can accept positional and keyword arguments (like the `NUMBER`
and `DATETIME` builtins), and in this case must accept the following types of
arguments:

* unicode strings (i.e. `unicode` on Python 2, `str` on Python 3)
* `fluent.types.FluentType` subclasses, namely:
  * `FluentNumber` - `int`, `float` or `Decimal` objects passed in externally,
    or expressed as literals, are wrapped in these. Note that these objects also
    subclass builtin `int`, `float` or `Decimal`, so can be used as numbers in
    the normal way.
  * `FluentDateType` - `date` or `datetime` objects passed in are wrapped in
    these. Again, these classes also subclass `date` or `datetime`, and can be
    used as such.
  * `FluentNone` - in error conditions, such as a message referring to an argument
    that hasn't been passed in, objects of this type are passed in.

Custom functions should not throw errors, but return `FluentNone` instances to
indicate an error or missing data. Otherwise they should return unicode strings,
or instances of a `FluentType` subclass as above.
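
A custom function following the rule above (return an error marker rather than raise) might look like the sketch below. Everything here is illustrative: `ratio` is a hypothetical function, and the `FluentNone` class is a local stand-in so the snippet runs without the library; real code would import the library's own `FluentNone` from `fluent.types`, whose constructor signature is assumed here.

```python
class FluentNone(object):
    """Local stand-in for fluent.types.FluentNone (assumption, sketch only)."""

    def __init__(self, name=None):
        self.name = name


def ratio(numerator, denominator, digits=1):
    """Hypothetical custom function, e.g. passed as functions={'RATIO': ratio}.

    Accepts two positional arguments and one keyword argument. Numeric
    arguments arrive as number subclasses, so float() and int() work on them.
    """
    try:
        value = float(numerator) / float(denominator)
    except (TypeError, ValueError, ZeroDivisionError):
        # Bad or missing input (including a FluentNone argument): signal the
        # problem by returning FluentNone instead of raising.
        return FluentNone("RATIO")
    # Return a plain string; this sketch does no locale-aware formatting.
    return "{:.{}f}".format(value, int(digits))


print(ratio(1, 4, digits=2))
print(type(ratio(1, 0)).__name__)
```

Returning a string keeps the sketch simple; a function that wants locale-aware output would instead return one of the `FluentType` subclasses listed above.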


Known limitations and bugs
--------------------------

* We do not yet support `NUMBER(..., currencyDisplay="name")` - see [this python-babel
pull request](https://github.com/python-babel/babel/pull/585) which needs to
be merged and released.

* Most options to `DATETIME` are not yet supported. See the [MDN docs for
Intl.DateTimeFormat](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DateTimeFormat),
the [ECMA spec for
BasicFormatMatcher](http://www.ecma-international.org/ecma-402/1.0/#BasicFormatMatcher)
and the [Intl.js
polyfill](https://github.com/andyearnshaw/Intl.js/blob/master/src/12.datetimeformat.js).

Help with the above would be welcome!


Discuss
-------

10 changes: 10 additions & 0 deletions fluent/builtins.py
@@ -0,0 +1,10 @@
from .types import fluent_date, fluent_number

NUMBER = fluent_number
DATETIME = fluent_date


BUILTINS = {
    'NUMBER': NUMBER,
    'DATETIME': DATETIME,
}
81 changes: 81 additions & 0 deletions fluent/context.py
@@ -0,0 +1,81 @@
from __future__ import absolute_import, unicode_literals

import babel
import babel.numbers
import babel.plural

from .builtins import BUILTINS
from .resolver import resolve
from .syntax import FluentParser
from .syntax.ast import Message, Term


class MessageContext(object):
    """
    Message contexts are single-language stores of translations. They are
    responsible for parsing translation resources in the Fluent syntax and can
    format translation units (entities) to strings.

    Always use `MessageContext.format` to retrieve translation units from
    a context. Translations can contain references to other entities or
    external arguments, conditional logic in form of select expressions, traits
    which describe their grammatical features, and can use Fluent builtins.
    See the documentation of the Fluent syntax for more information.
    """

    def __init__(self, locales, functions=None, use_isolating=True):
        self.locales = locales
        _functions = BUILTINS.copy()
        if functions:
            _functions.update(functions)
        self._functions = _functions
        self._use_isolating = use_isolating
        self._messages_and_terms = {}
        self._babel_locale = self._get_babel_locale()
        self._plural_form = babel.plural.to_python(self._babel_locale.plural_form)

    def add_messages(self, source):
        parser = FluentParser()
        resource = parser.parse(source)
        # TODO - warn/error about duplicates
        for item in resource.body:
            if isinstance(item, (Message, Term)):
                if item.id.name not in self._messages_and_terms:
                    self._messages_and_terms[item.id.name] = item

    def has_message(self, message_id):
        try:
            self._get_message(message_id)
            return True
        except LookupError:
            return False

    def format(self, message_id, args=None):
        message = self._get_message(message_id)
        if args is None:
            args = {}
        errors = []
        resolved = resolve(self, message, args, errors=errors)
        return resolved, errors

    def _get_message(self, message_id):
        if message_id.startswith('-'):
            raise LookupError(message_id)
        if '.' in message_id:
            name, attr_name = message_id.split('.', 1)
            msg = self._messages_and_terms[name]
            for attribute in msg.attributes:
                if attribute.id.name == attr_name:
                    return attribute.value
            raise LookupError(message_id)
        else:
            return self._messages_and_terms[message_id]

    def _get_babel_locale(self):
        for l in self.locales:
            try:
                return babel.Locale.parse(l.replace('-', '_'))
            except babel.UnknownLocaleError:
                continue
        # TODO - log error
        return babel.Locale.default()
15 changes: 15 additions & 0 deletions fluent/errors.py
@@ -0,0 +1,15 @@
from __future__ import absolute_import, unicode_literals


class FluentFormatError(ValueError):
    def __eq__(self, other):
        return ((other.__class__ == self.__class__) and
                other.args == self.args)


class FluentReferenceError(FluentFormatError):
    pass


class FluentCyclicReferenceError(FluentFormatError):
    pass
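
The `__eq__` override in `fluent/errors.py` is what allows error lists to be compared by value, as in the README's `errs` output. The sketch below copies the first two classes from the diff so that it runs standalone on Python 3, and shows the resulting behaviour, including one side effect that may be worth checking: on Python 3, a class that defines `__eq__` without `__hash__` becomes unhashable.

```python
class FluentFormatError(ValueError):
    # Copied from the diff above: equal when class and args both match.
    def __eq__(self, other):
        return ((other.__class__ == self.__class__) and
                other.args == self.args)


class FluentReferenceError(FluentFormatError):
    pass


first = FluentReferenceError("Unknown external: name")
second = FluentReferenceError("Unknown external: name")

# Two distinct instances compare equal, so lists of errors do too.
print(first == second)      # True
print([first] == [second])  # True

# Same args but a different class: not equal.
print(first == FluentFormatError("Unknown external: name"))  # False

# Side effect on Python 3: defining __eq__ alone sets __hash__ to None.
try:
    hash(first)
    hashable = True
except TypeError:
    hashable = False
print(hashable)  # False
```

If these errors ever need to go into sets or be used as dict keys, adding a matching `__hash__` would be one option; whether that matters for this PR is a judgment call for the maintainers.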