Reference parser stores Message Comment as Standalone Comment if Message has errors. #143

zbraniecki · 2018-06-04T13:14:59Z

Extracting from projectfluent/fluent.js#206:

This seems like breakage of the Principle of Least Astonishment - the new behavior changes the semantic role of a bit of information encoded in FTL file depending on an error which doesn't hinder parsers ability to interpret the information. This means that if someone writes a comment meant to be attached to a message, and there's an error in the message, they may end up with a standalone comment.
Now, this is not very consequential right now, but there's nothing guaranteeing that we won't have some behaviors expected for standalone vs. message comments.
Of course once an error happens, we just try to salvage, but in this case we do know that the comment was meant to be attached to the message, and yet we choose to encode it as a standalone.

I believe we should change the behavior of the reference parser.

stasm · 2018-06-05T07:58:31Z

This is related to #132 which is also asking how much of a message we should try to salvage in case of syntax errors.

Of course once an error happens, we just try to salvage, but in this case we do know that the comment was meant to be attached to the message, and yet we choose to encode it as a standalone.

As soon as we enter the error recovery territory, it's hard to make assumptions about what the localizer meant. I question whether we can confidently claim that we know that the comment was meant to be attached to the message. It's a good guess, but might also be a mistake by the localizer.

The rule of thumb for the error recovery that I'd like to suggest, is to salvage as much as possible without risking that other messages or terms are affected. In other words, if a syntax error in one message could end up splitting the message in two and changing the value of the message before it, that's not a good error recovery strategy.

Comments are harmless to the UI and they don't impact translations in any way. I think we should try to salvage as much as we can here, which is what the reference parser currently does.

Pike · 2018-06-05T10:11:14Z

What stas said (which is basically what I also said in 206).

Also, as you type a new message in an ftl file, you'd get a weird user experience, if you'd:

start writing a comment - everything looks good
start a new line for your message - everything looks good
start the message by typing in the ID - the comment that was OK just turned into a syntax error
you need to get all the way to having the first parts of a value, and then it looks OK again
you start a placable by typing {}, and the comment is a syntax error again

zbraniecki · 2018-06-05T14:58:24Z

I'm convinced by @Pike's argument - that's the UX if like to see. I'm wondering if there's another way to achieve it and if there are other cases where we may want to avoid showing something as an error in the process of constructing it - should typing a variable show it as an error after you type only $? Should the expression show up as an error until you close the brace?
Maybe we need some "in-progress" parsing mode which doesn't flag as errors FTL that can be well formed if more characters will be added?

As for @stasm argument - I think the crux is on whether we can reliably say that we know that this comment is attached.
From the parser standpoint we know the moment the comment is followed by a non empty line.
Which means the first letter of the identifier is enough to turn the comment into a message comment.

I'd also like to see my concern addressed - the behavior sets a precedence - the parser knowingly introduces the node into the tree in a different role than the author intended. Having such behavior while constructing FTL seems fine, but having this behavior in a runtime/tooling parser seems like a breakage of the POLA violation. Is there a precedence for that? Are there other places where we should or may introduce similar behavior in the future? Is it a one-off case?

stasm · 2018-06-05T15:06:39Z

I'd also like to see my concern addressed - the behavior sets a precedence - the parser knowingly introduces the node into the tree in a different role than the author intended.

Like I said above, I don't think this is true. We don't know what the author intended—we're in an error scenario. Maybe the message wasn't supposed to start right under the comment? If there's one error in it, how can we be sure there aren't others?

Do you agree that even if they make it into the AST, comments are rather harmless? Do you expect this behavior to cause problems?

zbraniecki · 2018-06-05T15:15:42Z

Do you agree that even if they make it into the AST, comments are rather harmless? Do you expect this behavior to cause problems?

I don't know. It locks us in a behavior that assumes this, which I think should hold true in the future, but may bite us back.

If you don't share this concern, I think we can close this issue.

stasm · 2018-07-02T15:43:43Z

When I last talked to @zbraniecki about this, he described his attitude as not convinced but not blocking. Let's close this issue.

There are also technical reasons for doing so. It would be hard to implement the change requested here in terms of the PEG (which the refparser uses). Implementing the behavior dictated by the spec is, OTOH, easy. projectfluent/fluent.js#235 implemented it in the tooling parser in the fluent-syntax package.

stasm closed this as completed Jul 2, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reference parser stores Message Comment as Standalone Comment if Message has errors. #143

Reference parser stores Message Comment as Standalone Comment if Message has errors. #143

zbraniecki commented Jun 4, 2018

stasm commented Jun 5, 2018

Pike commented Jun 5, 2018

zbraniecki commented Jun 5, 2018

stasm commented Jun 5, 2018

zbraniecki commented Jun 5, 2018

stasm commented Jul 2, 2018