Escaping: escaping when a message is stored in a general purpose container #236
Comments
What's the problem here?
Is this asking what the escape syntax of MF is? I presume that the storage format's escaping is removed by that format's reader, e.g. a Java program storing the pattern as a
Comments migrated from the slides

Slides comment, Mihai Nita (@mihnita), 11:51 AM Apr 21
Are the escapes in literals identical to the ones inside placeholders? If yes, why? Saying "yes" can drag in unnecessary implications.

Slides comment, Eemeli Aro (@eemeli), 12:37 PM Apr 21
Could you clarify what you mean by "unnecessary implications"?

Slides comment, Mihai Nita (@mihnita), 12:30 PM Apr 23
Makes escaping more complex. If the escape rules depend on the current state it means they don't "leak" and create noise outside that state.

Take for example the MF1. The pattern inside the placeholder is passed to That requirement to escape the

===

In HTML there is no good reason to escape So most browsers will even ignore those rules and do the right and intuitive thing:
Even escaping & is often unnecessary.

===

It is easier to read / write a message (as a human, not a machine) if the escape rules are limited by scope (in text, in a literal, values of the options, maybe selector keys, etc.). Take a literal:
There is no need to escape
Or maybe in the future a custom function:
There is no good reason to escape

===

"Global" escaping rules produce unnecessary "noise escapes"

Slides comment, Eemeli Aro (@eemeli), 2:25 AM Apr 24
Thank you for the clarification. I agree that e.g. our need to escape Conversely, do you see harm in allowing

Slides comment, Mihai Nita (@mihnita), 4:09 PM Apr 24
Accepting both

Slides comment, Mihai Nita (@mihnita), 4:32 PM Apr 24
... Removed first part of the comment, talking about escaping [ ] ...
That's also the reason you don't want to mix escaping conventions, and try (as much as possible) to leave that to the storage. Imagine a properties file where some strings require Depending on what API is used on that string after loading.
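The "noise escapes" argument above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scope names, the sets of special characters, and the backslash as escape character are hypothetical and are not taken from any MessageFormat proposal.

```python
# Hypothetical scopes and the characters assumed to be special inside each one.
SPECIAL_BY_SCOPE = {
    "text": "\\{}",     # braces would open/close placeholders in pattern text
    "literal": "\\\"",  # only the closing quote matters inside a quoted literal
}

# A "global" rule set: every special character is escaped in every scope.
GLOBAL_SPECIAL = "\\{}\""


def escape(value: str, special: str) -> str:
    """Backslash-escape each character of `value` that appears in `special`."""
    return "".join("\\" + ch if ch in special else ch for ch in value)


sample = 'say "hi" {now}'
scoped_text = escape(sample, SPECIAL_BY_SCOPE["text"])
scoped_literal = escape(sample, SPECIAL_BY_SCOPE["literal"])
global_rules = escape(sample, GLOBAL_SPECIAL)
```

With scoped rules each context adds two backslashes to the sample; with the global rule set the same string carries four, regardless of where it appears.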
Comments migrated from the slides

Slides comment, Mihai Nita (@mihnita), 11:47 AM Apr 21
For the "in memory syntax" they should be already resolved.

Slides comment, Eemeli Aro (@eemeli), 5:50 AM Apr 22
You may be right; dropped them from here. We'll still need at least \u and \U within {"quoted literals"}.

Slides comment, Mihai Nita (@mihnita), 11:01 AM Apr 23
Even there, I don't think so. These are usually resolved by the "storage layer". When you access the DOM in JS the This is usually solved by compiler (C/C++, Java code) or loading (HTML, Java properties).

First item in my "MessageFormat syntax: requirements / thoughts [MIH]": And there is a reason why that's first.
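As a minimal sketch of the "storage layer" point, the following uses JSON as the storage format and Python as the host language. Both choices, the `greeting` key, and the sample pattern are assumptions made for illustration; the thread itself names Java properties, HTML, and C/C++/Java source.

```python
import json

# Hypothetical pattern as it sits in a JSON resource file. The \u00e9 escape
# belongs to JSON, not to the message syntax.
stored = '{"greeting": "Caf\\u00e9 is open, {name}"}'

# The JSON reader resolves the storage-level escape while loading.
pattern = json.loads(stored)["greeting"]

# The same text as a source-code literal: here the compiler resolves the escape.
literal = "Caf\u00e9 is open, {name}"
```

In both cases the in-memory string handed to a message parser already contains the actual character, so the parser would not need a `\u` rule of its own for this input.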
In this bug I've only captured the bullets as documented by Stas after the slides. For my take you can check my "MessageFormat syntax: requirements / thoughts" doc, which I shared before the syntax was proposed. Here is a copy-paste for convenience (sorry, it is a bit long):

Don’t mix syntax concerns with serialization storage concerns

What do I mean by this? Design the syntax that is passed in the in-memory string to the message format parser. Similar to C/C++/Java handling of When the content is loaded and in memory, these are already gone, replaced by the proper characters. It means that there should not be any escaping in the plain text elements of The fact that a Unicode character is I should use whatever the convention is for the storage I use (.properties, .xml, strings in code that I extract with

Mixing concerns means we end up with a mess where translators are supposed to care about escaping even after the string was extracted from the storage format. And we end up with double encoding (so we need

Related: new lines converted to space, collapsing spaces to one space, left/right trimming spaces, indents, these are all storage serialization concerns.
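The double-encoding concern can be sketched as follows. The message-level escaping rule used here (backslash before `\`, `{` and `}`) and the helper `escape_for_syntax` are hypothetical, and JSON stands in for whatever storage format is actually used.

```python
import json


def escape_for_syntax(text: str) -> str:
    """Hypothetical message-syntax escaping: backslash-escape \\, { and }."""
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


# What the translator actually means to say.
plain = r"Use {braces} and C:\temp"

# Layer 1: escaping required by the (assumed) message syntax.
in_memory = escape_for_syntax(plain)

# Layer 2: escaping required by the storage format on top of layer 1.
on_disk = json.dumps(in_memory)

# Reading the file back removes only the storage layer.
round_trip = json.loads(on_disk)
```

Under these assumptions the single backslash in `plain` becomes two in memory and four on disk, and a person editing the stored file has to keep both conventions in mind at once.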
The current escape rules in the proposed spec are:

message-format-wg/spec/message.ebnf, lines 54 to 57 in fe595d5

Is there something that should be added to or removed from these rules, or could this issue be closed?