Implement Syntax 0.7 #287

stasm · 2018-09-26T10:13:53Z



This is a rewrite of the runtime parser. It supports Fluent Syntax 0.7, runs against the reference fixtures, has half as many lines of code, and is as fast in SpiderMonkey as the old one (and slightly faster in V8).

Goals

1. Support 100% of Fluent Syntax 0.7. This includes the indentation relaxation, dropping tabs and CR as syntax whitespace, normalizing new lines to LF, and only allowing numbers and identifiers as variant keys.
2. Maintain good performance. The parser is used in performance-critical code paths. Back in the days of Firefox OS it had to be both fast _and_ produce tightly packed results so that translations wouldn't take up too much space on the device. I think the storage requirements can be relaxed these days.
3. Write code which will be easy to maintain in the future. The parser was first written even before Fluent branched off from L20n. It's seen many changes and additions over the last two years. As new features accrued, it became hard to maintain and to keep track of all known bugs. My goal for the rewrite was not only to clean it up but also to define the conformance story for the future and to improve the testing infrastructure.

Design

The parser focuses on minimizing the number of false negatives at the expense of increasing the risk of false positives. In other words, it aims at parsing _valid_ Fluent messages with a success rate of 100%, but it may also parse some invalid messages which the reference parser would reject. The parser doesn't perform any validation and may produce entries which wouldn't make sense in the real world. For best results, users are advised to validate translations with the fluent-syntax parser before runtime.

The main parser loop iterates over the beginnings of messages and terms. This is to efficiently skip over comments (which have no use at runtime), and to recover from errors. When a fatal error is encountered, the parser instantly bails out of the currently-parsed message and moves on to the next one. Errors are discarded and are not visible to the users of `FluentResource`. They do carry a minimal description of what went wrong, though, which may be useful when reading the code and for debugging.

The parser makes extensive use of sticky regexes which can be anchored to any offset of the source string without slicing it. In some places, it's easier to just check the character currently at the cursor, so it does a fair share of that, too.

Conformance

My original plan was to base the parser on the EBNF and only parse well-formed syntax. In this PR, I went for something a bit wider than that: a superset of well-formed syntax.

The main deviation from the EBNF is related to parsing `VariantExpressions` and `CallExpressions`. The EBNF verifies that they are called on `Terms` and `Functions` respectively. The optimistic parser doesn't differentiate between `Messages`, `Terms` and `Functions`. I decided to implement it this way because this code might soon change anyway (see projectfluent/fluent#176).

Another deviation is that the parser treats commas in argument lists as whitespace, similar to how Clojure treats them in sequence lists. I might suggest we upstream this in the spec, too, because it makes the implementation of argument lists _much_ simpler.

I based this PR on top of the `zeroseven` branch. The `fluent-syntax` parser already supports Syntax 0.7 and passes the [reference fixtures](https://github.com/projectfluent/fluent/tree/master/test/fixtures). This made it possible to turn on reference testing in the runtime parser, too. `make fixtures` creates the parsed results for all reference fixtures; for now they must be verified manually before they're committed. `make test` can be used in development to assert that the output of the runtime parser still matches the committed one.
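To illustrate the design described above, here is a minimal sketch of a loop that jumps between entry starts and uses a sticky regex anchored at a cursor. It is not the code from this PR: the names `RE_ENTRY_START`, `RE_TEXT_RUN`, `parseTextValue` and `parseEntries` are hypothetical, the patterns are simplified assumptions, and only single-line text values are handled.

```javascript
// Hypothetical patterns for illustration; the parser in this PR may differ.

// Global + multiline: finds the next line which starts with an identifier
// (optionally prefixed with "-" for terms) followed by "=".
const RE_ENTRY_START = /^(-?[a-zA-Z][\w-]*) *= */gm;

// Sticky: matches only if a text run begins exactly at lastIndex,
// so the source string never needs to be sliced.
const RE_TEXT_RUN = /[^{}\n]+/y;

function parseTextValue(source, cursor) {
  RE_TEXT_RUN.lastIndex = cursor;
  const match = RE_TEXT_RUN.exec(source);
  if (match === null) {
    throw new SyntaxError("Expected a text value");
  }
  // Sometimes checking the character at the cursor is simpler than a regex.
  if (source[RE_TEXT_RUN.lastIndex] === "{") {
    throw new SyntaxError("Placeables are not handled in this sketch");
  }
  return match[0].trimEnd();
}

function parseEntries(source) {
  const entries = [];
  RE_ENTRY_START.lastIndex = 0;
  let start;
  // Each iteration lands on the beginning of a message or term;
  // comments and junk in between are skipped without being parsed.
  while ((start = RE_ENTRY_START.exec(source)) !== null) {
    try {
      const value = parseTextValue(source, RE_ENTRY_START.lastIndex);
      entries.push([start[1], value]);
    } catch (err) {
      if (!(err instanceof SyntaxError)) {
        throw err;
      }
      // Discard the error and bail out of this entry; the loop resumes
      // at the next entry start.
    }
  }
  return entries;
}

const sample = [
  "# A comment",
  "hello = Hello, world!",
  "broken = { $oops",
  "-brand = Firefox",
].join("\n");

console.log(parseEntries(sample));
// → [ [ 'hello', 'Hello, world!' ], [ '-brand', 'Firefox' ] ]
```

In this sketch the comment line is never visited and the `broken` entry is dropped silently, which mirrors the trade-off described above: errors are cheap to recover from but are not reported to the caller.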


stasm changed the title ~~Implement Syntax 0.7 in fluent-syntax~~ Implement Syntax 0.7 Oct 12, 2018

unclenachoduh and others added 3 commits October 12, 2018 15:02

Restrict variant keys to Identifiers and NumberLiterals (#268)

cffc259

Indentation/Whitespace in 0.7 (#284)

52e1854

* Indentation/Whitespace in 0.7
* Apply feedback
* Apply 2nd round of feedback

stasm force-pushed the zeroseven branch from 2071570 to 049b95d Compare October 12, 2018 13:04

stasm added 3 commits October 12, 2018 18:06

Consistently require the skip option in isNext* methods (#291)

fa25466

Remove unused skipBlankInline

06de0fb

Remove expectIndent

a494293

stasm mentioned this pull request Oct 15, 2018

Support Syntax 0.7 projectfluent/python-fluent#76

Merged

stasm added 11 commits October 16, 2018 13:51

Use a cursor in ParserStream

5c1c446

Rewind index to improve error recovery

ca17a53

Merge pull request #292 from stasm/rewind

bcf6799

Rewind index to improve error recovery

Refactor and comment skipToNextEntryStart

c0e5ba8

Simplify the trimming of the last pattern element

e9a61ec

Move FluentParserStream to src/stream.js

b09a623

Normalize CRLF to LF in ParserStream

61187c0

Spans should end before CRLF, not in the middle of it

e5cf0c0

Return EOF from takeChar, if encountered

7b03100

Remove @babel/register from tools/

3947017

Commit EOL changes for crlf.ftl

1f21981

stasm force-pushed the zeroseven branch from 9578437 to 1f21981 Compare October 22, 2018 12:51

stasm merged commit be66cd0 into master Oct 23, 2018

This was referenced Oct 23, 2018

fluent-syntax: Allow Placeables inside other Placeables #66

Closed

Allow leading whitespace in variant keys #250

Closed

Support CRLF as EOL #131

Closed

Support nested placeables #266

Closed

stasm deleted the zeroseven branch October 23, 2018 06:49

spookylukey mentioned this pull request Oct 30, 2018

Implement MessageContext.format projectfluent/python-fluent#67

Merged
