-
-
Notifications
You must be signed in to change notification settings - Fork 36
Discuss variable priorities and shadowing #310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The PR discussion thread, for convenience: @stasm stasm 8 days ago @eemeli eemeli 8 days ago For instance, I could well imagine translation systems with the following attributes: Translators are able to define additional local variables for a message. @mihnita mihnita 23 seconds ago In general I don't think it is a good idea to allow translators to override a developer decision. And "A developer may pass the same basket of arguments to multiple messages, not all of which use each of them." does not look like something related to shadowing. There is a reason many linters warn about shadowing. "If shadowing were not allowed, then a combination of the above could result in a runtime error which would be difficult to discover ahead of time." Not necessarily. But TLDR: I don't see valid use-cases for it. In fact the proposal we had used different sigils for local variables (macros at the time) and arguments. |
Adding to it a bit: in one of the discussions Eemeli introduced the idea that we can have a "map of variables" passed at parse time, in addition to the one we pass at formatting time. For example:
So when the messages uses I remember that in a previous discussion we discussed having several "bags of arguments", and the simple answer was to merge the bags before passing them to message format. But anyway, if we allow for more bags, we need to decide the priority. |
To summarize my arguments against this functionality:
This is probably already reasonably well captured in previous comments, so I will not add any details.
For example we previously discussed the risk or circular dependencies in definitions:
The mitigation was to not do hoisting, voted for by all people in that thread: But with this change we reintroduce the risk of circular dependencies:
It is not hoisting, the declaration of a in line 3 can see the declaration of b in line 2, which can see the declaration of a in line 1. In other words, if I do So the spec would have to explain what happens in case 2, how to avoid circular definitions (if we shadow 100%). That is adding complexity to the spec / implementations, for no good benefit I'm fine to add complexity as a price for something useful (balancing price vs benefit). And this PR does not seem to explain the behavior in the above cases.
If memory serves me, I think this was mentioned in the meeting 2 weeks ago. If not, I apologize, this would be a straw-man, please ignore. So, would people expect it, because it is common? a. Many linters warn about shadowing. So it is not necessarily a good thing. b. Ant properties are immutable See https://ant.apache.org/manual/Tasks/property.html: And I've seen huge ant files implementing complex functionality despite this "limitation" So it is not unheard of, and it is not that limiting. c. In Rust you To make it mutable you need to say d. Even in JS In JS
But with this PR we allow re-declaration, so we are even more "lax" than JS |
For my part, I see no reason why variables need to be mutable (aka mut). It
is not a particular imposition on the writer to use a slightly altered
name. The exposition and implementation are both simpler.
…On Thu, Jun 15, 2023 at 3:01 PM Mihai Nita ***@***.***> wrote:
To summarize my arguments against this functionality:
------------------------------
1. There is no useful functionality added by allowing variable
overrides
This is probably already reasonably well captured in previous comments, so
I will not add any details.
------------------------------
1. Adds complexity to the spec and implementations
For example we previously discussed the risk or circular dependencies in
definitions:
let b = {$a ...}
let c = {$b ...}
let a = {$c ...}
The mitigation was to not do hoisting, voted for by all people in that
thread:
(zbraniecki, macchiati, eemeli, stasm, mihnita)
#297 <#297>
But with this change we reintroduce the risk of circular dependencies:
let a = {$foo ...}
let b = {$a ...}
let a = {$b ...}
It is not hoisting, the declaration of a in line 3 can see the declaration
of b in line 2, which can see the declaration of a in line 1.
Unless the definition of b actually "sees" the definition of a in line 1.
So that definition is no completely shadowed by line 3, as a "left over" is
captured in the definition of b.
In other words, if I do let a = {42} let b = {$a} let a = {13} let c =
{$a} {The values are {b} and {c} !}
What is the result?
It it is "The values are 13 and 13 !", this is hoisting-like, and allows
circular dependencies.
And it if is "The values are 42 and 13 !", so the first definition is not
"fully shadowed".
(behaves like macros in C :-)
So the spec would have to explain what happens in case 2, how to avoid
circular definitions (if we shadow 100%).
And implementations will have to follow that spec.
That is adding complexity to the spec / implementations, for no good
benefit
("no benefit" is a personal judgement, see item 1, we can debate it)
I'm fine to add complexity as a price for something useful (balancing
price vs benefit).
And this PR does not seem to explain the behavior in the above cases.
------------------------------
1. People are used to it
If memory serves me, I think this was mentioned in the meeting 2 weeks ago.
If not, I apologize, this would be a straw-man, please ignore.
So, would people expect it, because it is common?
Yes and no.
a. Many linters warn about shadowing.
So it is not necessarily a good thing.
b. Ant properties are immutable
See https://ant.apache.org/manual/Tasks/property.html:
<property name="foo.dist" value="dist"/>
And I've seen huge ant files implementing complex functionality despite
this "limitation"
So it is not unheard of, and it is not that limiting.
c. In Rust you let a = 42, and it is immutable by default.
To make it mutable you need to say let mut a = 42
And in code the review the first question will probably *"why does this
have to be mutable?"*
So the same idea from item a.: this is not a good thing, needs to serve a
purpose.
Even the keyword is let, like we have.
So Rust does not try to accommodate some expectation of mutability.
d. Even in JS let is not like what we do here
In JS let can be updated but not re-declared
let a = 43 // OK
a = 13 // OK
let = 101 // Error
But with this PR we allow re-declaration, so we are even more "lax" than JS
(because for us declaration is initialization, which is a good thing)
—
Reply to this email directly, view it on GitHub
<#310 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACJLEMAP6Y5DFDFNW2LFTN3XLOA4RANCNFSM6AAAAAARPNDAZI>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Presuming that this is in reference to PR #381, that proposal currently includes this:
Taking this one step at a time:
Put together, the message resolves and formats the same as if it had been:
@mihnita, with reference to the above, would you be willing to acknowledge that the behaviour in these cases is explained, and does not include any circular references? |
@eemeli replied overnight, while I had the below to sleep on... but I think it's still worth sharing. There are several reasons why a variable might want to be mutable. In fact, it might be useful not to think of them as "variables" so much as them being the "argument list" or "keys". I'll use the word variable below, but keep in mind that we are not a programming language. If anything we are closer to being a declarative language. One reason to allow mutation is: the variable name may be referenced by the message pattern already. When the pattern is a simple message, altering the name of the variable in successive
@stasm also called out that it makes for a bit of weirdness, in that externally defined variables might be overridden, but locally defined ones produce an error. If we make variables immutable, I would probably favor making it an error to override an externally supplied value (which might produce hard-to-debug errors if the argument list is dynamically generated or used across multiple messages--the latter of these is more common than one might think) Also, we don't allow function nesting, so any transformation that requires multiple function calls results in multiple "waste values" since, as noted above, can't just "fix up" the original value. A reasonable use for reassignment is decoration or normalization of the original value. Stuff like:
... and I might very much want to do such processing in my message because (a) laboriously writing transforms into my expressions is messy and (b) if I need to nest transforms I have to write a custom function to do it (because no nesting). Yes, I can assign a new variable to each transform, but why do I have to? I don't think anyone is proposing "forward definitions" or circular dependencies. If the message refers to an as-yet undefined value, that's an error. One of @mihnita's questions was the answer to this riddle:
I think it makes sense if the answer is:
... because as noted at the start, we are not a programming language and these are not variables: they are keyed values. So The above arguments work best on primitives (string, number, etc.) and less well with values that are objects (person, currency value, inventory item, "Object Foo"), where each I'll be interested to see how we land on this issue, as there is merit to making keyed values into constants. But I think there is reasonable utility to allowing reassignment of a key because I value utility and ease for message authors more than the internal troubles for MF implementers in this case. |
And many localization tools can leverage changes in placeholders already. But that's not the main point. We can just use another names.
Then we can always add another intermediate name before. For the first example:
can be refactored without changing the messages proper (affected by leveraging):
This one:
let $foo_tmp1 = {$foo :text-transform transform=uppercase}
|
Eemeli quotes the proposal to the spec:
I think that the meaning of "binds" is not defined. Zibi used "binding" all the time, and I asked many times for a clarification of what that means. TLDR: I don't think there is one universal definition of "bind" that I don't understand, and everyone else does. There are difference between "static binding", "dynamic binding", "early binding", "late binding" So we don't know what "bind" means here. And we can completely bypass the need to define and understand it by just saying that the local "things" we declare & define (in one single step) are immutable. And I don't think that if we document what binding means in our context helps. |
@mihnita, I earlier asked the question:
As your answer to this appears to boil down to:
I take from this that other than the use of the word "binds", you do acknowledge that the behaviour in the cases you specify is explained and that the spec does not introduce any circular references. I've now updated the PR to read:
Do you have any remaining concerns regarding this PR? To be honest, I think it's simpler to effectively say "you can reassign the value of a variable" rather than what we currently say, which is either "you cannot reassign the value of a variable" or "you can reassign the value of a variable, but only once, and only if it's originally an external variable", depending on how you interpret the current text. If PR #381 is not accepted, I would ask that someone else propose spec text to resolve this ambiguity. |
The chief value of immutability is avoiding mistakes: where people reassign a variable without meaning to, or they use a variable without realizing that it has been changed. It also prevents them from inadvertently messing up an input variable, which is easy to do because they are not declared.
If people have to do actions like this in more than a very small % of cases, then we have bigger problems in terms of usability. |
Just as an interesting piece of history: It tells us how (same level) shadowing ended up in Rust, and the fact that this was not some very useful feature, well though out. More of an accident, considered for removal, and in the end "nobody else really cared" And several people in that thread seem to agree that shadowing is error prone. |
I filed #413 to specifically discuss the question of mutability / immutability. |
Two more reasons for mutability, specifically for allowing local values to override external ones:
Here, we've made it explicit for both the translator and developer formatting this message what the types of { count: '1234', date: 1688630222849 }
If we did not allow for local values to override external ones, the developer would end up writing the message as something like
and if we used a different sigil for local variables (say,
With the first approach, we end up with Now, did you notice that my examples above use If instead we allow local variables to override external ones, only |
I see (1) and (2) as good arguments for allowing local variables to override external variables, but not for mutability in general or for local variables overriding other local variables. External variables are already "special" in the sense that they don't need to be declared with a (I can also imagine ways to achieve the same goals as (1) and (2) without allowing any overriding. For example, a type system -- but that would be a major change and I assume it's been discussed in the past, so I can understand the reasoning behind (1) and/or (2) as workarounds for untypedness.) |
A potential problem with allowing external variables to be overridden is that it allows the creation of unintentional bugs or transient runtime failures. Because the external arguments are undeclared, it is easy for a message to declare a variable that turns out to conflict with a given argument array, producing unintended results. If we make all variables immutable and external and local vars share a namespace, passing an argument that shares a name with a local declaration can cause a message to fail.
If we make all variables immutable but external and local vars do not share a namespace, this problem goes away. However, a local variable cannot be used to augment or annotate an external variable.
If variables are mutable and namespaces are shared, it's easy to write a message that never fails but does produce unintended or unexpected results (from the caller's point of view).
If variables are mutable but namespaces are not shared, its easy for developers or translators to reference the wrong one:
So none of these solutions is "perfect". They are all balancing some tradeoff. Elsewhere on this thread I have tended to favor mutability to allow multiple transforms and the modification of formatting directives (including on a language-specific basis) without modifying the Note that the argument list is not always explicitly composed by the caller. I've worked with integrating message formatting into a number of templating and data languages in which the argument map is precomputed (and often contains hundreds of fields: think of an item detail page on Amazon). It's okay to reference one of these values, but there is always the danger that you unintentionally smash the value (silently, which is bad) or conflict with an immutable value (runtime failures that are harder to diagnose). @macchiati noted:
See my previous discussion about frequency. The percentage of total strings for nearly all of our features is pretty small. Multiple transforms are not that uncommon for us because we don't have subroutines or function nesting. I think a better question is: which feature is more important: multiple transforms or preventing accidental masking? |
Nice outline of the options, and food for thought. After reading it, I
propose:
You can have at most one "let X" for any X.
- Specifically, you can override an input variable with a formatted
version of that input variable, once.
- Otherwise you can't override any variables.
OK
let $count = {$count :number sig=3}
let $date = {$date :datetime skeleton=yMMMd}
match {$count} {$date}
when one {You have {$count} book on {$date}.}
when * {You have {$count} books on {$date}.}
FAIL (input variable set twice)
let $count = {$count :number sig=3}
let $count = {$count :number sig=4}
...
FAIL (local variable, being reset)
let $count = {|42|}
let $count = {$count :number sig=4}
...
…On Sat, Jul 22, 2023 at 1:59 PM Addison Phillips ***@***.***> wrote:
@catamorphism <https://github.com/catamorphism>
A potential problem with allowing external variables to be overridden is
that it allows the creation of unintentional bugs or transient runtime
failures. Because the external arguments are undeclared, it is easy for a
message to declare a variable that turns out to conflict with a given
argument array, producing unintended results.
If we make *all* variables immutable and external and local vars share a
namespace, passing an argument that shares a name with a local declaration
can cause a message to fail.
{ "arg1": "37"}
...
let $arg1 = {|42| :number maxFractionDigits=2} // error
If we make *all* variables immutable but external and local vars do not
share a namespace, this problem goes away. However, a local variable cannot
be used to augment or annotate an external variable.
{ "arg1": "42" }
...
let #arg1 = {$arg1 :number maxFractionDigits=2}
{Now I have to change {$arg} to {#arg}...}
If variables are mutable and namespaces are shared, it's easy to write a
message that never fails but does produce unintended or unexpected results
(from the caller's point of view).
{"arg1": "10000"}
...
let $arg1 = {42}
{This always says {$arg1} == 42}
If variables are mutable but namespaces are not shared, its easy for
developers or translators to reference the wrong one:
{"arg1": "10000"}
...
let #arg1 = {42 :number maxFractionDigits=2}
{This always says {$arg1} == 10000 because it should say {#arg1}}
So none of these solutions is "perfect". They are all balancing some
tradeoff.
Elsewhere on this thread I have tended to favor mutability to allow
multiple transforms and the modification of formatting directives
(including on a language-specific basis) without modifying the pattern
parts of a message. The number of text segments affected can become quite
large when there are multiple match expressions, so I favor the utility a
bit.
Note that the argument list is not always explicitly composed by the
caller. I've worked with integrating message formatting into a number of
templating and data languages in which the argument map is precomputed (and
often contains hundreds of fields: think of an item detail page on Amazon).
It's okay to reference one of these values, but there is always the danger
that you unintentionally smash the value (silently, which is bad) or
conflict with an immutable value (runtime failures that are harder to
diagnose).
@macchiati <https://github.com/macchiati> noted:
let $foo = {$foo :text-transform transform=uppercase}
let $foo = {$foo :trim}
let $foo = {$foo :sanitize target=html}
If people have to do actions like this in more than a very small % of
cases, then we have bigger problems in terms of usability.
See my previous discussion about frequency. The percentage of total
strings for nearly all of our features is pretty small. Multiple transforms
are not that uncommon for us because we don't have subroutines or function
nesting. I think a better question is: which feature is more important:
multiple transforms or preventing accidental masking?
—
Reply to this email directly, view it on GitHub
<#310 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACJLEMFB6AAQL2B4BMPNEXTXRQ5KJANCNFSM6AAAAAARPNDAZI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@macchiati Do you mean that it would only ever be possible for local variables to shadow input variables? Or is this fact accidental in your examples? Can I also have |
You could definitely do that, as long as there is no
let $month = {something}
before that statement.
What I've focussed on is the chief example that people cite for wanting to
reassign the meaning of a variable.
…On Sun, Jul 23, 2023 at 2:08 PM Stanisław Małolepszy < ***@***.***> wrote:
@macchiati <https://github.com/macchiati> Do you mean that it would *only
ever* be possible for local variables to shadow input variables? Or is
this fact accidental in your examples? Can I also have let $month =
{$date :datetime skeleton=MMMM} there?
—
Reply to this email directly, view it on GitHub
<#310 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACJLEMHBMO3AEAJZMM762Q3XRWHFNANCNFSM6AAAAAARPNDAZI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@macchiati I think that's too hard to explain. I would prefer to find a solution that lets us syntactically check violations rather than relying on (invisible) functionality that has to be carefully explained to developers and translators. Perhaps introduce new keywords:
|
I haven't had a chance to read through all the comments yet, but note that we need to be careful to distinguish between mutabilty and name shadowing. Mutability makes it impractical to implement lazy evaluation with memoization (see #413 (comment)). Name shadowing does not. |
I think it is actually very easy to explain. It is an error to have the line
if you have
above it in the message. |
I'd be ok with allowing local variables to shadow external ones, but not other local variables, i.e. what we effectively now have in the spec and what @macchiati proposes. That would address my main concerns, though my preference would be to allow local variable shadowing as well. I also note that no-one has yet answered the question I ask in my previous comment regarding separate name spaces. |
I actually did notice — really! — but only because I was suspecting a set-up :) But I see your point. Not having to change all variable references in the message body would be a welcome property of the solution that we'll end up designing. It sounds like we have 3 requirements (not everyone agrees on all three, though):
I think we can address all three with namespacing of local variable and by introducing new syntax for re-annotation, as @aphillips suggests:
|
A few more thoughts: In my previous comment, I originally used the The third requirement on my list says:
I realize that this is still phrased in terms of a solution, rather then the use-case. Let me try again:
In the teleconference yesterday I said namespacing and shadowing were orthogonal and I suggested to make a decision about namespecing first. However, I now see that they're likely tightly-coupled, unfortunately. That's beause namespacing without a discussion about shadowing directly contradicts the first requirement above: Be able to re-annotate input variables under the same name without having to rename them in the message body. We need to pair with something like the |
@aphillips I'm pretty sure this has been resolved by our current approach using |
Yes. The Seville consensus was to use input/local and make values immutable, eliminating this issue. These solutions have reached the spec, so closing. Reopen if you (the reader) disagree. |
The PR #305 trying to document the hoisting decision introduced the wording that has effect on the shadowing resolution.
That was not part of a discussion, and the decision is debatable.
The text was updated successfully, but these errors were encountered: