Fallbacks for variables in syntactically invalid declarations #703
Comments
It's a syntax error. We say this about syntax errors:
You'll also find buried in the formatting spec this instruction:
That is, in this case, we emit the logo. While a human might intuit that the rest of the example message is intact and it's only missing an equals sign, there's no way for the parser to know for sure. It's also not possible for the parser to know if other Very Bad Things might happen if [...]
I do find it confusing that the above quoted paragraph (the "If the message..." one) is not more prominent.
Since that text is under the "Pattern Selection" header, it could be read as only applying to a selectors message. (It could also be read the other way, since selecting a pattern from a set containing 1 pattern is also pattern selection, but I think there's no obvious reading of it.)
I agree. A syntax error can occur in a simple message:
... so we should clarify that the quoted paragraph applies to all syntax problems.
Probably the right prominent place for this is the formatting intro, where we need to replace this MAY with MUST (message-format-wg/spec/formatting.md, lines 10 to 11 in 38fdd69).
... or we could just get rid of "MAY be" (or convert to SHOULD). Some syntax errors are recoverable (at least to the point of not requiring the logo be emitted for the whole message): if the error is restricted to the inside of an expression, the expression would fall back without escaping to obliterate the message. With lazy evaluation, the message as a whole might even work for some inputs. It's probably better to blow up so that the message gets fixed in pre-prod, but we don't have to require it.
Should we allow that? Anyway, should we attempt to clarify pre-preview? @catamorphism do you agree that this is the right behavior?
Fixes #703
This fix moves the syntax/data model error resolution text verbatim from the bottom of the pattern selection section to the formatting intro (except that it replaces the words 'pattern selection' with 'formatting', making the text more general). In addition, this change uses US rather than UK spelling in the first sentence of the intro (an editorial nit) and removes the existing similar instruction about fallbacks. This is a normative change because the previous text had "MAY" for the fallback. I also rephrased the "To start..." paragraph to be less chatty by using an imperative (this is an editorial change).
I'm not sure what "this" is :) I can read the current spec in several different ways (let's just focus on syntax errors, for simplicity):
(5) is what allows an implementation to make the set of error-free messages as large as possible.
Sorry I wasn't clear. The proposal is that syntax/data model errors fail the message outright. I agree that expression-internal syntax errors are sometimes recoverable and that we can allow these to be evaluated lazily (only parsed when needed) or that we can allow expression internal error to fallback appropriately or do something implementation defined. This sort of fallback is described in the expression resolution section, btw.
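To make the distinction in the comment above concrete, here is a minimal sketch of the two behaviors: a syntax error fails the whole message, while a problem inside one expression only makes that expression fall back. Everything in it is an assumption for illustration: the toy grammar (literal text plus `{$name}` placeholders), the names `toy_parse`, `format_message`, `FormatResult` and `MessageSyntaxError`, and the use of U+FFFD as the whole-message fallback string. It is not the spec's grammar and not any implementation's API.

```python
from dataclasses import dataclass, field

# Assumed fallback string for a message that fails outright (illustrative only).
REPLACEMENT = "\uFFFD"


class MessageSyntaxError(Exception):
    """Hypothetical error type for a message that does not parse."""


@dataclass
class FormatResult:
    text: str
    errors: list = field(default_factory=list)


def toy_parse(source: str) -> list:
    """Split a pattern into ("text", ...) and ("expr", ...) parts.

    Toy grammar only: raises MessageSyntaxError on an unbalanced brace.
    """
    parts = []
    i = 0
    while i < len(source):
        if source[i] == "{":
            end = source.find("}", i)
            if end == -1:
                raise MessageSyntaxError(f"unclosed '{{' at offset {i}")
            parts.append(("expr", source[i + 1:end].strip()))
            i = end + 1
        elif source[i] == "}":
            raise MessageSyntaxError(f"unexpected '}}' at offset {i}")
        else:
            j = i
            while j < len(source) and source[j] not in "{}":
                j += 1
            parts.append(("text", source[i:j]))
            i = j
    return parts


def format_message(source: str, args: dict) -> FormatResult:
    try:
        parts = toy_parse(source)
    except MessageSyntaxError as err:
        # Syntax error: the whole message fails, with no partial output.
        return FormatResult(REPLACEMENT, [err])

    out, errors = [], []
    for kind, value in parts:
        if kind == "text":
            out.append(value)
        elif value.startswith("$") and value[1:] in args:
            out.append(str(args[value[1:]]))
        else:
            # Anything other than a resolvable $variable: only this
            # placeholder falls back; the rest of the message survives.
            errors.append(LookupError(f"unresolved {value}"))
            out.append("{" + value + "}")
    return FormatResult("".join(out), errors)
```

In this sketch both paths report an error; they differ only in how much of the output is replaced.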
That isn't quite accurate. It allows an implementation to make the set of error-free formatted messages as large as possible, at the cost of potentially hiding errors in a way that can appear to be "transient". I think it is worth careful consideration, since most parsers will run over the whole message, not skipping patterns until needed. We have a separate error (Invalid Expression) for problems inside of an expression.
I agree that hiding errors isn't necessarily a good thing! Option (1) is the most consistent with all programming languages I'm aware of (all syntax errors fail the program). And I personally support it; parsing the syntax is complicated enough without the requirement to recover from syntax errors. If there are good error diagnostics, I don't think providing partial output adds much. (That being said, ICU4C makes it hard to give good error diagnostics beyond line and column numbers, and adding better diagnostics is unlikely to happen for the tech preview.)
I think this has been addressed by subsequent spec changes to the effect that syntax errors are non-recoverable.
Consider the following syntactically incorrect message:
This is incorrect because the "=" between $bar and {|foo|} is missing.
The spec requires the implementation to signal a syntax error, but also provide a partial result.
Should the result of formatting be foo, since the right-hand side of the declaration is syntactically correct, although the declaration itself isn't? Or should it be {$bar}, because no valid declaration for $bar exists?
There is a similar test case in https://github.com/unicode-org/message-format-wg/blob/main/test/syntax-errors.json#L50, but this file doesn't specify formatting output for syntactically invalid messages.
The basic issue is that this text from formatting.md:
doesn't say whether to bind a variable to a resolved value if an error is encountered while processing the declaration.
I'll note that a previous version of the messageformat package included a test file that included the following test:
which implies that $bar should be bound to a value in the environment even though processing its declaration should cause the formatter to emit a syntax error. However, I don't see a basis for this (or the other interpretation) in the spec.
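The two readings described in this issue can be modeled in a few lines. This is only a sketch of the question, not of any implementation: `Declaration`, `resolve_variable` and the `bind_on_error` switch are hypothetical names, and the malformed declaration is represented abstractly by a flag rather than by parsing real message syntax.

```python
from dataclasses import dataclass


@dataclass
class Declaration:
    name: str          # variable being declared, without the "$"
    value: str         # literal on the right-hand side
    well_formed: bool  # False when the declaration itself has a syntax error


def resolve_variable(decl: Declaration, ref: str, bind_on_error: bool) -> str:
    """Resolve a reference to $ref given a single (possibly malformed) declaration."""
    env = {}
    # Reading A (bind_on_error=True): bind the variable even though the
    # declaration was malformed, because its right-hand side was usable.
    # Reading B (bind_on_error=False): a malformed declaration binds nothing.
    if decl.well_formed or bind_on_error:
        env[decl.name] = decl.value
    # An unbound variable falls back to its source form, {$name}.
    return env.get(ref, "{$" + ref + "}")


# The declaration from this issue: $bar with right-hand side |foo|, missing "=".
broken = Declaration(name="bar", value="foo", well_formed=False)
reading_a = resolve_variable(broken, "bar", bind_on_error=True)   # "foo"
reading_b = resolve_variable(broken, "bar", bind_on_error=False)  # "{$bar}"
```

Either way the formatter would still signal the syntax error; the sketch only isolates what the partial result would contain under each reading.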