Allow names to start with a digit #350
Comments
I think position-based arguments are an inherently flawed model, as they provide no value in modern programming languages while they impair error recovery. The argument in this comment is about the ability to migrate from MF1 to MF2 and I think this could be handled by migrating positional arguments to named using I claim this is more readable in case of unresolved arguments than |
Is the enhancement request here to:
The title suggests №3, the body of the issue sounds like №2 to me, and the weight of @zbraniecki's point seems to be directed at №1. |
I believe my argument is directed at (1) and (2). |
@alerque The enhancement request here is focused on (2) and (3) and is strictly limited to the syntax. I might separately suggest that migrating implementations could choose to support positional arguments (perhaps outside the standard) as a sop to migration. I have no allergy to a user choosing to have an integer named key. In general I favor putting the least restriction on developers that we can and trusting them to decide how to use our tools. I would not have phrased (2) that way. I would have just said:
In part because your (3) implies that there has to be some non-digit characters somewhere in the name. I am focused on namespacing rules here and the fewer special ones we have, the better, in my mind. @zbraniecki Implementations of Yes, the implementer could create a convention like
This is overstating it, I think? I am not arguing, please note, that positional arguments are a Good Thing or that they are equal to named arguments. Use of positional arguments in MF has been a matter of taste for a while--some developers have "bad taste" 😃. I'm just reluctant to be absolutist ("no value"?) |
That seems like a very convoluted solution. You're saying implementations may optionally choose to replace
As I stated in my previous comment:
You did not respond to that claim.
I disagree.
It's not a matter of taste. It has an actual impact on the localization system capabilities in the area of error recovery. In this case, use of Let me try again. Compare a user story of a non-tech-savvy customer seeing two versions of a partially resolved message:
Map
let pattern = "{Accept to pay {$amount} for your order.}";
let args = {
"amount": new Intl.MessageFormat.arguments.Currency(42, {currency: "USD"}),
};
let mf2 = new Intl.MessageFormat(["en-US"]);
button.textContent = mf2.formatToString(pattern, args);
Partial output:
Array
let pattern = "{Accept to pay {$0} for your order.}";
let args = [
new Intl.MessageFormat.arguments.Currency(42, {currency: "USD"}),
];
let mf2 = new Intl.MessageFormat(["en-US"]);
button.textContent = mf2.formatToString(pattern, args);
Partial output:
I claim that this is a significant difference in UX and we should put effort into strengthening the best practice. My solution is to make the MF1 -> MF2 converter replace positional arguments with
which I believe to be significantly less confusing and less likely to be misread as As to the cost for developers that have to migrate their arrays to maps - they don't. They can write a wrapper (or we can provide them a wrapper) which takes a list of arguments and turns it into a MF2ArgumentMap with keys The additional value of that is that it provides them with a greppable way to find uses of it in their source code and to set up a project to remove the transitional wrapper by manually converting its uses to actual maps with meaningful argument names. |
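A minimal sketch of the kind of wrapper described above, written in JavaScript. The function name `argListToMap` and the `arg` key prefix are assumptions made for illustration; the key convention actually proposed in this comment is not shown in the text, and nothing here is part of any MF2 API.

```javascript
// Hypothetical transitional helper: turns a positional argument list
// into a map with generated key names ("arg0", "arg1", ...).
// The "arg" prefix is an assumption, not a convention taken from the spec.
function argListToMap(argList, prefix = "arg") {
  const argMap = {};
  argList.forEach((value, index) => {
    // The key is the prefix followed by the zero-based position.
    argMap[`${prefix}${index}`] = value;
  });
  return argMap;
}

// Legacy call site that only has positional values.
const legacyArgs = [42, "USD"];
const namedArgs = argListToMap(legacyArgs);
console.log(namedArgs); // { arg0: 42, arg1: 'USD' }
```

Because every legacy call site goes through one named helper, searching the code base for `argListToMap` would list the places that still need meaningful argument names.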
@zbraniecki noted:
Actually, when we did this for ARB, we implemented it by making positional arguments into map keys, e.g. So I'm not saying that implementations would "optionally choose". I'm saying "implementations would choose how to convert an array into keys", which could include either
I didn't because I didn't think it was relevant to this request and because to me it is an eye-of-the-beholder problem. Does it matter if the failed message is I see your user case above and I agree with you that Admittedly I'm pointing out that one use for these is migrating MF1 callers as a kind of "proof of utility"...
Well, no, it is a matter of preference in MF1 (and flavors of MF1, such as ARB), where one can choose between using a Map and an array--and some developers prefer one or the other. That's why I said it was a matter of taste. I do know that developers are not keen on being required to go into working code and change this:
into this:
Just so they can use the new formatter. They'll just... call the old one, eh? If the only cost for the new formatter is the need to fix the source pattern (perhaps using a tool) hidden behind
But, again, it is a matter of developer preference. I fully agree that using the source string as a gettext key is bad--but some developers (not me!) think it is actually a feature (this includes, I believe, the developers of gettext). Even if it never occurred to the developers of gettext, they never prohibited it and someone discovered they could make it work. I don't think we have to enforce every best practice at the level of the syntax and I'm willing to let implementers or users with different priorities than my own make their own choices. As far as I can tell, there is no technical reason to disallow numeric keys. Adding positional argument support is not something I think we want to do (and am not proposing it), but I could see existing implementations wanting to provide a migration path. Recommending the |
I think that's the source of our disagreement. I see resilience as an important part of a dynamic system's design which targets cross-roads of technologists and non-technologists to collaboratively produce human readable output. [0]
And I am only saying that I'd prefer to force such key names to use
They wouldn't be, which is an area where I think we do agree. I suggest that such a case would require just a helper wrapper, going from your:
final Object[] args = { numMessages, priceOrder, foo };
return res.format("somePatternString", args); // ARB combines resource lookup and formatting
to:
final Object[] argList = { numMessages, priceOrder, foo };
final ImmutableMap<String,Object> argMap = CompatibilityHelperArrayToMap(argList);
return res.format("somePatternString", argMap);
and that can be further wrapped for convenience:
final Object[] argList = { numMessages, priceOrder, foo };
return res.formatWithArgList("somePatternString", argList);
with the helper being hidden inside the customer's API, so developers need not worry about it. In other words, I am pushing back on your claim that this cannot be solved in convenience APIs and that we must extend the spec to allow for bad practice in order to avoid blocking adoption on highly disruptive code changes.
I think it's a very vague claim. What is a technical reason in case of a localization system? Is disallowing nested selectors in MF2 due to technical reasons?
I am aligned with you. We should do our due diligence to ensure minimal disruption and fewest possible papercuts for migrators. I believe this is what motivates you to file this issue! I suggest we ensure that the [0] I recognize you said "the most important" - I assume you do not see it as high value, while you see MF1->MF2 migration DX as high value. I see both as equally high value, and resilience as higher in the long term, as I think of MF2 as a system whose majority of users and use cases over its lifetime will never have used MF1. |
You address my concern in your footnote. I didn't say error state was not important. I just don't stack rank it as highly as you appear to. FWIW, I also don't rank MF1->2 migration as high as I think you think I do :-).
These are not the same thing. I am talking about the namespace for keys, not about positional arguments at all. It happens that positional arguments might use integer keys (or not), but this isn't about that. This is about allowing keys to start with and even be composed of just ASCII digits (noting that non-ASCII digits are "just fine" with us!!!) Here's a different example using German orthography subtags:
I am not claiming this. In fact, my example is exactly a "convenience API" that hides the migration entirely. What I'm arguing for is not an extension in order to allow a bad practice. It is to enable the freest possible use of the syntax--which can include some uses that you (or I) might feel are bad.
I thought this was a very non-vague claim. What is the technical argument for disallowing digits in the production in question? In what functional way is a localization system harmed by their existence? I agree that positional integer keys are less good than named ones. The discussion of nested vs. matrix selectors is a very interesting one, but it should be its own issue (if you care to reopen it). It doesn't depend on or, AFAICT, inform the discussion of key values.
My motivation is informed by migration experience, but mostly has to do with: I want the fewest arbitrary/preferential decisions in our syntax that people have to learn (and machines have to check).
I actually think this could be out of scope?
As noted, I'm fine with (possibly non-normative) recommendations for how to migrate. Note that MF1 migration or compat appears nowhere in our goals and deliverables. |
I've been playing fly on the wall for these discussions. I would tend to agree that the salient question should be focused on what would break if you allowed fully numerical names. That's the level that is appropriate for what is effectively a MUST NOT. Simple bad (or non-optimal) practice, if it can be identified, could be the subject of prescriptions or proscriptions expressed with SHOULD or RECOMMENDED. |
In case we're talking of variant keys like Or in case the issue is around the selector, it would be helpful if it used our current syntax, with which I presume that line ought to read something like:
The
Huh, you're right. We've talked so much about MF1 compatibility that this actually surprises me. I realise that we may mean different things by "MF1 compatibility" as well. For the record, I would consider MF2 to be compatible with MF1 if it's possible to use an MF2 implementation and some set of runtime functions to provide the same external API as an MF1 implementation provides. |
My reading of "MF1 compatibility" is being able to solve the same problem that MF1 does. It does not mean a drop-in replacement with no code changes required. Maybe a discussion of what "MF1 compatibility" really means would be good. On positional parameters, I agree with Zibi. For a translator there is a lot more context in "You added {$fileName} to {$folderName}" than in "You added {0} to {1}". That is a big plus already. It is also better for leveraging. "Do you play {$sportName}?" and "Do you play {$musicalInstrument}?" will be translated differently in some languages (in Romanian you "... joci tenis" and "... cânți la pian"). And a multi-sentence paragraph is usually split into sentences (segmentation) for better leveraging.
With named parameters the second sentence is leveraged 100%. In general for API design I tend to go with "make it easy to do the right thing, make it hard (but not impossible) to do the wrong thing" In this case, if someone really wants positional parameters, they can do something like this:
And now they can do:
instead of:
So even if I think that positional arguments are bad, one can still do it, if they have a good use case that I couldn't imagine. This is a bit like what Addison said, with "Yes, the implementer could create a convention like arg0". Another of my API rules of thumb is "if a big majority of the users are required to write the same duck-tape code to use a certain feature, then that duck-tape belongs in the library". So it depends on whether we consider positional arguments a feature or a misfeature. In all the guidelines I've seen or written, I recommend named arguments over positional ones for localization. |
One extra note: MessageFormat is at times inconvenienced by this need to support both numeric / named parameters. My guess is that MF1 supports positional parameters because that is in the JDK MessageFormat (which is still "the old MF1", yes). Even the doc has some warnings about it:
So I think it is a misfeature (same a |
It looks like most of us agree that positional arguments are not a good practice. And, in fairness to @aphillips, this issue is not about reintroducing them. Instead, it's another instance of the discussion about being lenient on input. In fact, after this discussion, I'm warming up to the idea of dropping the
This point from @aphillips convinces me. Yes, |
Can you share your position on my response to that, which is:
|
I agree with you. We should encourage developers to use descriptive parameter names. At the same time, I'm concerned that for many developers the choice they will face is:
Migrations are hard, require coordination, approvals, are difficult to test at scale and to roll back. The less friction we cause in the first step, the more likely it is that the next steps will happen at all. |
Also, nit-picking a bit: did you know that Argentine pesos are often abbreviated as |
I think this is a false dichotomy. As discussed in this thread and (I believe) agreed upon by Addison, Mihai, Eemeli and me at least, we do not face this dichotomy. We can safely allow for a migration path that translates a list of arguments to a map with If you disagree that this is possible, or believe that this will not alleviate the friction for migration, I'd appreciate it if you stated such a position explicitly. |
Team A will choose your migration tool, which translates |
I think this is a strawman argument Stas. Solvable by a single sentence in spec recommendation on prescribed way of handling MF1 argument lists in MF2. |
But wouldn't the same sentence solve the error scenario problem that you worry about? I work a lot on features that require changes to both code (binaries) and data (configuration), and I've experienced much of the friction that I'm alluding to first-hand. Granted, there may be fewer pieces involved in the migration path you're proposing, but I would nevertheless expect some friction. I think we're getting close to making this a discussion about who is ultimately responsible for implementing bad practices. In one of the previous comments @zbraniecki mentioned nested selectors: why forbid them rather than recommend against them and let developers use them when they need them? The same goes for nested function expressions (#353). My position here is that nesting has impact on the data model, runtime, static analysis, tooling, interchange, migrations, CAT GUI... on top of being a sharp tool. Here, however, the proposal is to make a surgical change to the BNF, which slightly relaxes the grammar of variable names, function names, and option names. Nothing else changes, and we can still recommend against using numerical names. |
My preference would be to not allow for a leading digit, for two primary reasons:
We can sidestep the mismatch between tooling by including something like this in the spec:
An implementation applying such a transform could then provide a helper as suggested by @zbraniecki above in #350 (comment) that could be used by legacy code. That should allow us to remain fully backward-compatible while providing an eventual pathway for the names to be replaced with more descriptive ones, so that translators get more context when working with them. |
Rationale 1. to me reads like: "references are fundamentally a map and not an array", and while you can fake an array with a map, we want to put a bit of friction there, because implementing an array on top of a map ought to be "in your face". To my view, that rationale wins over the "freedom of expression" argument better than Rationale 2. As long as it's possible to do things like $p0 or $_1, you aren't fully guaranteeing better translator experience. Given Rationale 1, providing migration advice in this instance is not just the proper approach, but essential. I would consider such a recommendation a "convention" and call it out as such, as in "SHOULD follow the convention of...". |
I do not see how a sentence in a spec recommendation can solve the scenario where we produce |
@zbraniecki The spec can use a
Any sort of fallback string which includes the dollar sign can be argued to be potentially confusing. See my earlier comment about the Argentine peso. |
@eemeli I still disagree with your logic about disallowing first-digits. Based on the conversation here, are you in agreement that we should allow starting digits but provide guidance on migration? Or strictly opposed to starting digits? My comments on your criteria:
Variable names are just variable names. My argument here is that many names might be useful to an application and there is no reason to impose this particular restriction on what an application developer might want to do. There are many systems that generate variable names from data or where names might naturally start with numbers--not just "array-like accessors". Here are some examples:
No argument about this--we have more in the thread about what to recommend for migration. But this is not a good argument against digits at the start or as full names in the ABNF. For me, we simply have to accept that users (end users, not implementers) will make decisions about how to use our interfaces, APIs, and tools in ways that we might consider to be bad practice or that we would not want users to perpetuate. Nothing we do will guarantee that variable and placeholder names are good for translation any more than (say) programming languages prevent single letter variable names other than for loop control. How are translators helped by @stasm Yes, that's what |
I don't like starting digits, because they make it too easy to keep using indexed variables like Not allowing starting digits for variable names is nearly universal across programming languages, and not following that practice sends a strong signal that indexed names are fine, no matter what SHOULD statements we include. No matter what rule we use for variable names, it'll undoubtedly still be possible to pick bad ones; that we can't fix here. But we can -- and should! -- add a little bit of friction for implementation and library/tooling developers who wish to support legacy indexed names. Doing so helps make it clear that MF2 is not itself providing universal support for them, but that that's done separately. Also, regarding my previous spec text recommendation, there appears to be some common practice of using an underscore |
Rejected in 2023-04-10 call |
Is your feature request related to a problem? Please describe.
Currently names can start with a potpourri of characters:
... but ASCII digits are not permitted. Note that the above ranges include digits in a variety of writing systems, including the wide compatibility digits starting at 0xFF21. Just not ASCII digits.
This could be a compatibility problem. Existing MF1 messages can use numbered names, i.e. this is valid:
But this is not valid MF2:
Describe the solution you'd like
Add DIGIT to the name-start list (or actually convert ALPHA to alnum)
Describe why your solution should shape the standard
This is a restriction in the standard and thus cannot be part of userland.
Additional context or examples
Use of numbered replacements is super-common in existing messaging schemes, including printf type syntaxes and the existing message format. While we require users to add our sigils and decorations when converting, the omission of digits at name start requires developers to go beyond that and actually name all of their replacement variables.
Implementations that allow auto-numbered arg lists (similar to MF1) would be seriously inconvenienced by having to change to names-in-a-map. (I had one of these implementations)
I cannot think of a reason why numbers are not allowed. They don't appear to present a parsing hazard of any kind and the name production is always marked with a sigil ($ or : etc.) anyway.
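The parsing point can be illustrated with a rough sketch. It assumes a deliberately simplified, ASCII-only approximation of the name production (the real name-start ranges are far broader), handles only the `$` and `:` sigils, and uses a hypothetical helper `isValidReference`. None of this is taken from the actual ABNF.

```javascript
// Simplified, ASCII-only approximations of the grammar. Assumption: the real
// name-start production covers many more Unicode ranges than shown here.
const NAME_CURRENT = /^[A-Za-z_][A-Za-z0-9_.-]*$/;     // no leading ASCII digit
const NAME_PROPOSED = /^[A-Za-z0-9_][A-Za-z0-9_.-]*$/; // DIGIT added to name-start

// Hypothetical helper: checks a sigil-prefixed reference such as "$0" or ":number".
// The sigil comes first, so a leading digit in the name cannot be confused
// with a bare number literal.
function isValidReference(ref, namePattern) {
  const sigil = ref[0];
  if (sigil !== "$" && sigil !== ":") return false;
  return namePattern.test(ref.slice(1));
}

console.log(isValidReference("$0", NAME_CURRENT));      // false
console.log(isValidReference("$0", NAME_PROPOSED));     // true
console.log(isValidReference("$amount", NAME_CURRENT)); // true
```

Under these assumptions the only difference between the two patterns is the first character class, which is the size of the grammar change being requested.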