Consider escaping by doubling the special characters #346
Comments
I think either this isn't a problem, or I'm missing something. Our base assumption is that by the time we're parsing a string with the MF2 syntax grammar, it's already been processed for character escapes by its container. So for instance something like new Intl.MessageFormat("{\u12AB}") would rely on the JS string escape handling to present new Intl.MessageFormat("{\\u12AB}") would try to parse new Intl.MessageFormat("{\\\\u12AB}") would get us to parse In other containers (e.g. .properties files) we would need the same amount of escaping, but in a dedicated MF2 resource syntax we could (and should!) define character sequences like
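To make the container layering in this comment concrete, here is a minimal JavaScript sketch of what the JS string-literal escapes leave behind before any MF2 parser would run. It only inspects plain strings and does not call Intl.MessageFormat; the variable names are illustrative assumptions, not anything from the spec.

```javascript
// Each literal is what a developer types in JS source. The runtime string
// is what the container (JS) would hand on to an MF2 parser.
const oneLevel = "{\u12AB}";       // JS resolves the escape to the single character U+12AB
const twoLevels = "{\\u12AB}";     // JS yields one backslash followed by "u12AB"
const fourSlashes = "{\\\\u12AB}"; // JS yields two backslashes followed by "u12AB"

console.log(oneLevel.length);    // 3
console.log(twoLevels.length);   // 8
console.log(fourSlashes.length); // 9
```

Any further backslash handling by MF2 would apply to these runtime strings, on top of whatever the container has already consumed.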
Maybe character escapes are not the best example. However, there are plenty of micro-syntaxes that use backslash as an escape. File level escapes such as It would be entirely reasonable to impose our escape regime just as currently written. However developers, translators, tools, and such need to be able to easily write strings (in a myriad of programming languages and runtime environments) with a minimum of arcane rules. We don't provide any other escaping mechanism or reserve any other characters besides An alternative would be to only reserve escape-escape (
... in which the output looks something like
Am I being too paranoid?
A couple of incomplete thoughts:
Actually, we are only required to escape the closing character
There are so many templating micro-syntaxes, and resources serve these in so many different ways. My example of Markdown was just because it was handy. I'm trying to be conservative about what we do in order to avoid developers and translators becoming confused. And I would be okay with keeping the status quo if others feel I'm being too paranoid.
I don't even think we require escaping We require escaping only for "things that exit the current state" Current EBNF:
They are currently not required to be escaped, not by MF2. String msg = "{The \u1234 and \n don't make it to MF2. But to protect curly bracket (\\{) or backslash (\\\\) we need escaping, I agree}"; What MF2 sees is In other words: what Eemeli said :-)
I think I remember some discussion about escaping both
We also need to escape
The reason for escapes is to allow a syntax meaningful character to appear inside the syntax. The
The The other sigils Note that
I raised this issue because, when we use backslash as the escape character, we also have to provide a way for it to appear in literals and text. And we know that programming languages and templating languages and all sorts of things use backslashes as escapes (it's why we're doing it). My experience is that nesting formats are super common and quickly become super confusing. Our syntax will be embedded into different implementers' resource formats. It will also be used to store strings used by a myriad of developers in many different runtime environments. Each of these will have their own syntactical goo which developers will need to embed into strings. We will no doubt cause some people pain when, like us, their templating language has chosen to make I am willing to accept that we use
Closing per 2023-06-19 telecon discussion
Text is defined such that backslashes in the text must be escaped, e.g. \u12AB is required to be \\u12AB or \n is required to be \\n. I'm okay with requiring { and } literals to be escaped.
I am nervous that we're recreating the apostrophe disaster in MF1 by requiring every backslash to be escaped. Developers use e.g. \n or other escapes in strings all the time. Making double escapes out of them doesn't seem to be required except to allow for the literal \{ or \} to appear in a string.
I think we might be better off by making the placeholder markers self-escaping, e.g. if you want to have a literal { use {{. { and } are rare enough in real text that this might be much less of an impact than requiring every backslash to be doubled.
Originally posted by @aphillips in #344 (comment)
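The difference between the two escaping regimes discussed in this issue can be sketched as follows. Both helper functions are hypothetical illustrations written for this comparison; neither is part of any MF2 implementation, and the sample text is made up.

```javascript
// Hypothetical helpers for illustration only.

// Scheme A (the rule as this issue describes it): backslash is the escape
// character, so "\", "{" and "}" each get a backslash in front.
function escapeWithBackslash(text) {
  return text.replace(/[\\{}]/g, (ch) => "\\" + ch);
}

// Scheme B (the proposal): "{" and "}" escape themselves by doubling,
// and backslashes pass through untouched.
function escapeByDoubling(text) {
  return text.replace(/[{}]/g, (ch) => ch + ch);
}

// Runtime value is: path C:\tmp and set {a}
// (the backslash is doubled here only because this is a JS string literal)
const literal = "path C:\\tmp and set {a}";

console.log(escapeWithBackslash(literal)); // path C:\\tmp and set \{a\}
console.log(escapeByDoubling(literal));    // path C:\tmp and set {{a}}
```

Under scheme B the number of backslashes in the text is unchanged, which is the property the proposal is after; only the comparatively rare braces are affected.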