Parse Comments as standalone if they're attached to Junk #235
@zbraniecki, @Pike - This implements a change in Syntax 0.6 to how comments are parsed in case of syntax errors in messages and terms they're attached to. Previously, the comment would be parsed as part of the junk; the new behavior is to salvage the comment as standalone and limit the junk to the message. This has implications for
## A group comment
foo = Foo
#Junk comment
foo = Foo
Such questions make changing
I made changes to
Right now all of the uses of
# A comment
junk
We will need to revisit the design of
This should work fine.
One question, in the reference parser, we support three line endings,
Seems that the js parser only handles unix line endings?
Another purely code-style comment is about the outer loop and the re-entry into getEntryOrJunk. I'm using a lastComment pattern in compare-locales, but I have no idea if that'd be simpler here, or if it would be in 0.7. Also something that'd be perfectly fine for a separate issue to try out.
fluent-syntax/src/parser.js
// Messages or Terms if they are followed immediately by them. However
// they should parse as standalone when they're followed by Junk.
// Consequently, we only attach them once we know that the Message or the
// Term parsed successfully.
Not sure if a logic based on lastComment would be easier or not. I.e., if you hit a Comment without blankLines, stash it. If you hit a Message or Term, use lastComment if given, otherwise push lastComment and entry.
Once we add top-level whitespace with projectfluent/fluent#116, does one of the two become more straightforward? Would we fold that top-level entry into getEntryOrJunk?
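For reference, a minimal sketch of the lastComment idea described above. It works on a hypothetical list of already-parsed entries rather than on the real parser stream. The function name attachComments, the blankLinesAfter flag and the entry shapes are assumptions made for illustration, and "otherwise" is read here as "the next entry is not a Message or Term". This is not the code in fluent-syntax/src/parser.js.

```javascript
// Toy model: `entries` is a hypothetical pre-parsed list such as
// [{type: "Comment", content: "..."}, {type: "Junk"}, {type: "Message", id: "foo"}].
// Only regular comments are modelled; group and resource comments are ignored.
function attachComments(entries) {
  const body = [];
  let lastComment = null;

  for (const entry of entries) {
    const attachable = entry.type === "Message" || entry.type === "Term";

    if (lastComment !== null) {
      if (attachable) {
        // The entry parsed successfully, so the stashed comment attaches to it.
        body.push({ ...entry, comment: lastComment });
        lastComment = null;
        continue;
      }
      // Followed by Junk (or anything else): the comment stays standalone.
      body.push(lastComment);
      lastComment = null;
    }

    if (entry.type === "Comment" && !entry.blankLinesAfter) {
      // Stash the comment until we know what follows it.
      lastComment = entry;
      continue;
    }

    body.push(entry);
  }

  // A comment at the very end of the resource is standalone too.
  if (lastComment !== null) {
    body.push(lastComment);
  }
  return body;
}
```

With a comment followed by Junk and a second comment followed by a Message, the first comment ends up as its own entry in the body and the second one ends up on the Message.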
I think I like the lastComment approach better. I'll implement it in a separate commit so that we can draw inspiration from the entries.push(entry, next) approach in the future, if needed.
On the question of which API tools should use: right now, Pontoon uses a single API for four use-cases (AFAICT):
One is to find out details about the reference entry. Given that the message/term came from the parser, it's probably OK to assume that there are no errors.
Also, it parses reference and l10n to create the short display version. One would hope that there are no parser errors left here?
The other is to sanitize the serialization. It creates a serialized message, parses that, and serializes again. That code puzzles me a bit. https://github.com/mozilla/pontoon/blob/26811442d45a0a7be6ee7cc8703a7f218e866860/pontoon/base/static/js/fluent_interface.js#L810
Last but not least, to sanitize the fluent text area. I'd think we can do better here in making errors visible. For this case in particular, I'd expect that for anything that's not the actual message, we should show an error and refuse to save. Right now, trailing
I wonder if we could make that API even take the expected ID explicitly.
Anywho, I think the changes you made to parseEntry go in the right direction, so I think we can tweak this in a follow-up.
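To make the "show an error and refuse to save" idea concrete, here is a rough sketch of a check that takes the expected ID explicitly. It assumes an already-parsed, Resource-like object with a body array whose entries carry a type and, for messages and terms, an id.name. The function name checkSingleMessage and the error strings are hypothetical and not part of any existing API.

```javascript
// Returns a list of human-readable problems; an empty list means "OK to save".
// `resource` is assumed to look like {body: [{type: "Message", id: {name: "foo"}}, ...]}.
function checkSingleMessage(resource, expectedId) {
  const errors = [];
  const entries = [];

  for (const entry of resource.body) {
    if (entry.type === "Message" || entry.type === "Term") {
      entries.push(entry);
    } else if (entry.type === "Junk") {
      errors.push("Syntax error in the translation");
    } else {
      // Standalone comments and anything else are not the actual message.
      errors.push("Unexpected " + entry.type + " next to the message");
    }
  }

  if (entries.length !== 1) {
    errors.push("Expected exactly one message, found " + entries.length);
  } else if (entries[0].id.name !== expectedId) {
    errors.push("Expected ID " + expectedId + ", found " + entries[0].id.name);
  }

  return errors;
}
```

An editor could call this on the parsed text area content and block saving whenever the returned list is non-empty.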
Syntax 0.6 changes the parsing logic of Comments. Comments attached to what ends up being Junk are not part of Junk anymore. Instead, they end up as standalone Comments in the final Resource. This PR also changes the behavior of FluentParser.parseEntry() to ignore all valid comments and only start parsing when a Message or a Term is encountered. Comments with syntax errors are not skipped, however.
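As an illustration only, the comment-skipping behavior of parseEntry can be approximated with a line-based sketch. The helper name skipValidComments and the regular expression are assumptions for this example. It treats a line as a valid comment when it is one to three # characters followed by either nothing or a space and text, assumes unix line endings, and also skips blank lines. The real parser works on a stream and is more involved.

```javascript
// A valid comment line: "#", "##" or "###", optionally followed by a space and content.
// "#Junk comment" (no space after "#") does not match and is therefore not skipped.
const VALID_COMMENT_LINE = /^#{1,3}( .*)?$/;

// Drops leading valid comment lines and blank lines, returning the text
// where parsing of a Message or a Term (or Junk) would start.
function skipValidComments(source) {
  const lines = source.split("\n");
  let index = 0;
  while (
    index < lines.length &&
    (VALID_COMMENT_LINE.test(lines[index]) || lines[index].trim() === "")
  ) {
    index++;
  }
  return lines.slice(index).join("\n");
}
```

For the input "# A comment" followed by "foo = Foo", the sketch returns only the message line, whereas an input starting with "#Junk comment" is returned unchanged because the malformed comment is not skipped.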
Thanks for the review, @Pike!
There's #228 which fixes this. I haven't yet had time to reply to @zbraniecki in it.
I actually like the
Correct, that was my assumption as well.
Agreed here as well.
Hmm, interesting indeed. Perhaps this serves as a prettifier?
I agree that we should improve the source editing experience. Ideally I'd see us using a proper web IDE solution for this, which would likely want to use the full tooling parser.
The current implementation of
In one iteration of this PR I changed
Cool. Let's see how the upgrade to Syntax 0.6 in Pontoon goes and then we can research the