Forbid unquoted non-numeric literals as expression operands #518
Copying relevant comments from #516 (review):
|
I'm trying to approach this issue pragmatically, from the perspective of the user:
|
If that is true, then
As mentioned in the linked comment, I think it is important for literal quoting requirements to be consistent throughout the grammar. This is going to be a relatively unfamiliar format for pretty much everyone who uses it, and gratuitous inconsistency seems more detrimental than beneficial even if it does save a few keystrokes.
Counterpoint: given support for |
I think it's desirable to minimize the number of special cases users have to think about. I vote for forbidding all unquoted literals as expression operands. How common is it for an operand to be a numeric literal? Is the benefit of being able to write something like |
Actually, I expect that number literals like
String literals will be rare (I hope) because they represent string concatenation in a party dress.
This question is backwards. Why would we force users to write key-value pairs for options with quoted values, given that most values will be predefined word tokens?? If we required |
That same reasoning applies to operands, so I think we're in violent agreement here. |
I meant it as an example of a potential use-case in which passing a string literal to a function could be useful. I took it from Liquid's Just to be clear: I'm not proposing that I agree that numeric operands are useful and should be encouraged. |
Cool. But do you still think that we should create a dichotomy in practice between options and operands regarding unquoted literals? For example:
|
My priorities are twofold:
I'm proposing that we separate the concepts of numeric and non-numeric literals, and that we have different rules for them. Unquoted numeric literals can go anywhere. Unquoted non-numeric literals can go in variant keys and option values. |
Thanks @stasm. My priorities skew towards developer and translator effort. I think if we talk about literals abstractly (as I've been doing) this can obscure how the literals will be used, particularly in operands.
Isn't your argument backwards? Aren't you glad you can type One of the interesting things about our syntax is that we are untyped and in many cases there will be strings used to convey typed data. You've identified numeric literals (i.e. numbers) as a case. Date/time values are also going to be common:
Another common case will be enumerated values, either built-in or application specific. These are string tokens (often in English) but not natural language. I agree that this depends on the design chosen for spannables (such as in #537 etc.) I just think, for the reasons in my previous comment, that, unless we choose a syntax option that requires it, having only one type of unquoted literal and allowing it anywhere literals are used will make our syntax easier to use than anything that varies. |
It was late for me and I didn't explain this clearly, sorry. I meant that I am glad, however, that
All of these make perfect sense to me. Note that they all start with a digit, and I would qualify them as "numeric literals". I don't want us to be too strict about what is a numeric literal: The common feature of all of them is that they all start with a digit. We can then extend the definition of a "numeric literal" to datetime values. I realize that perhaps calling them "numeric" is misleading. Can you perhaps suggest a better name? A "digit literal"?
Do you have a concrete example of when this would be useful? What sort of enumerated values would benefit from this syntax? I struggle to even come up with something to use as an example. I'm concerned that not quoting such values makes them not look like values. I hypothesize that I'm not the only developer to whom they look like keywords, commands, function calls, references, etc. OTOH, I don't have any issue with anything starting with a digit to be an unquoted argument. Because of the digit, the "valueness" is made clear. |
My main interest here would be for our syntax to not be weird. By that, I mean that I would prefer for e.g. the rules determining a user's question "Does this literal value need to be quoted?" to be as simple as possible. With our current syntax, those rules are not simple. For example, I'm continuously catching myself when considering that even though I also note that it looks like no-one caught the mistake in @aphillips's datetime examples above, where the My preferred solution here would be to only allow the following as unquoted operands or options:
In case you didn't spot it, those rules rather intentionally match JSON. I would separate out the rule for unquoted keys to be a wholly separate one. For that, I would accept any non-empty sequence of Unicode word characters as unquoted. In the syntax, we're treating variant keys as rather different from operators and operands, and we're also using them rather differently when formatting a message. Keys are not in expressions but as their own thing, and they're effectively custom syntax that's enabled by custom functions, rather than values. Each selector (like Put together, the answer I'd like to give to the user of the first paragraph is "In expressions, JSON primitives don't need quotes. In variant keys, words don't need quotes." Simple, clear, not weird. |
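The two-part answer above ("In expressions, JSON primitives don't need quotes. In variant keys, words don't need quotes.") can be sketched as a small check. This is only an illustration under stated assumptions: the exact list of allowed unquoted values is not reproduced in this thread, so the sketch assumes "JSON primitives" means JSON numbers plus `true`, `false` and `null`, and it approximates "Unicode word characters" with Python's `\w`. The name `needs_quotes` and the position labels are hypothetical.

```python
import re

# Assumption: "JSON primitives" = JSON numbers plus true / false / null.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
JSON_KEYWORDS = {"true", "false", "null"}

# Assumption: "Unicode word characters" is approximated by Python's \w,
# which is Unicode-aware for str patterns (letters, digits, underscore).
WORD = re.compile(r"\w+")


def needs_quotes(text: str, position: str) -> bool:
    """Return True if `text` would have to be quoted in the given position.

    `position` is a hypothetical label: "operand", "option", or "key".
    """
    if position == "key":
        # Variant keys: any non-empty run of word characters may stay unquoted.
        return WORD.fullmatch(text) is None
    if position in ("operand", "option"):
        # Expressions: only JSON-style primitives may stay unquoted.
        is_primitive = text in JSON_KEYWORDS or JSON_NUMBER.fullmatch(text) is not None
        return not is_primitive
    raise ValueError(f"unknown position: {position!r}")
```

Under these assumptions a bare word such as `now` needs quotes as an operand but not as a variant key, which is the split the comment describes.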
I'd be very sad if I wasn't able to type
Can you summarize how |
A more general observation about our unquoted syntax. It occurs to me that we can choose one of three approaches:
Perhaps 1 or 3 would be better than the current 2? |
@eemeli noted:
I agree, although I would go further. The I think the ISO8601-like syntaxes put huge pressure on unquoted if we want to include them. I would favor narrowing unquoted to be somewhat restrictive just to make it easier for folks to understand when @stasm the "maximally liberal" approach has the problem that our syntax depends on sigils. So my tendency is to make things only slightly wider than @eemeli's proposal. The number regex or |
Fewer errors in the messages you write. The simpler the rules, the easier it is to not break them. Us breaking our own current rules in the examples we write is a rather strong indicator that they're suboptimal.
It doesn't allow for at least period
@aphillips Is the rule you're proposing a universal one covering operands, options and keys, or different for some of that? Also, as not all code and source messages are written in English, I feel strongly that if |
The rule I suggested above is for We haven't clarified what we're doing here yet, so we sometimes make bad examples. The idea we're all poking around at here is that unquoted literals should be "word-like" or "identifier-like" and--whoa!--we have an identifier concept. That concept includes namespaces, which do not apply to unquoted, so I think my proposal would be: make

    unquoted = name
             / number
    number   = ["-"] (0 / ([1-9] *DIGIT)) [ "." 1*DIGIT] [ (%i"e") ("+"/"-") 1*DIGIT]

This precludes unquoted time values because it lacks

    unquoted = name
             / number
             / datetime
    ; profile of RFC3339 and SEDATE's extension thereof
    ; see https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6
    datetime = 4DIGIT "-" 2DIGIT "-" 2DIGIT [ %s"T" 2DIGIT ":" 2DIGIT [ ":" 2DIGIT [ "." 1*DIGIT ]]
               [%s"Z" / ( ("+"/"-") 2DIGIT ":" 2DIGIT ) ]
               [ ALPHA *(ALPHA / "_") "/" ALPHA *(ALPHA / "_") ]]

This "infects" us with types in some ways, but in a way that users might understand? |
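To see what the `number` and `datetime` rules above would and would not accept, here is a rough translation into Python regular expressions. It is a sketch of one reading of the ABNF, not a reference implementation: the `name` production is not defined in this comment and is left out, and the helper name `classify_unquoted` is hypothetical. The exponent sign is treated as mandatory because the rule is written that way.

```python
import re

# number = ["-"] (0 / ([1-9] *DIGIT)) ["." 1*DIGIT] [("e"/"E") ("+"/"-") 1*DIGIT]
# Note: the sign after the exponent marker is required, as in the rule as written.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-][0-9]+)?")

# datetime = date [ "T" hh ":" mm [ ":" ss [ "." fraction ] ]
#                  [ "Z" / offset ] [ Area "/" Location ] ]
# %s"T" and %s"Z" are case-sensitive in ABNF, so only upper case is accepted.
DATETIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
    r"(?:T[0-9]{2}:[0-9]{2}(?::[0-9]{2}(?:\.[0-9]+)?)?"
    r"(?:Z|[+-][0-9]{2}:[0-9]{2})?"
    r"(?:[A-Za-z][A-Za-z_]*/[A-Za-z][A-Za-z_]*)?"
    r")?"
)


def classify_unquoted(text: str):
    """Return "number", "datetime", or None for text matching neither rule."""
    if NUMBER.fullmatch(text):
        return "number"
    if DATETIME.fullmatch(text):
        return "datetime"
    return None
```

Read this way, a bare time such as `10:30` matches neither rule, and `1e5` is rejected because the exponent has no sign.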
In the teleconference of 2023-12-04 it was agreed that |
I propose that we restrict the literal syntax and disallow unquoted non-numeric literals as expression operands.
Today, literals (quoted and unquoted) can appear in 3 positions:
The first two positions are common and well justified. OTOH, I posit that the third position is very rare, except numeric values. In fact, I'm not aware of strong use-cases for unquoted non-numeric literals as expression operands. Furthermore, I'm concerned about the risk of confusing such unquoted literals for keywords.
Forbidding unquoted non-numeric literals as expression operands will provide two benefits:
1. Since our syntax uses keywords positioned at the front of a declaration, unquoted literals as expression operands may be confused for novel keywords. Consider: `|now|` and `|error|` are clearly recognizable as literal values.
2. It would open new opportunities for introducing new placeholder or expression syntax: `{foo}`. See Open/close design: A familiar HTML-like syntax as an alternative #516 for an example use-case.

As a reminder, `{5 :number}` would still be valid under this proposal, as well as `:func opt=unquoted` and `when unquoted`.

ABNF-wise, I think we'd need the following two changes:

1. Split `unquoted` into `unquoted-string` and `unquoted-numeric`.
2. Change `literal-expression` to `"{" [s] (quoted / unquoted-numeric) [s annotation] [s] "}"`.
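The effect of the proposed `literal-expression` change on operands can be illustrated with a short check. This is a minimal sketch under assumptions, not the proposed grammar itself: `unquoted-numeric` is not defined in the issue, so the sketch follows the thread's "starts with a digit" reading (with an optional leading minus), and quoted literals are simplified to `|...|` with no escape handling. The name `operand_allowed` is hypothetical.

```python
import re

# Simplified quoted literal: |...| with no escape sequences (assumption).
QUOTED = re.compile(r"\|[^|]*\|")

# Assumption: "unquoted-numeric" means an unquoted token that starts with a
# digit, optionally preceded by "-". The issue does not define it precisely.
UNQUOTED_NUMERIC = re.compile(r"-?[0-9][^\s|{}]*")


def operand_allowed(literal: str) -> bool:
    """Return True if `literal` could be an expression operand under the proposal."""
    return bool(QUOTED.fullmatch(literal) or UNQUOTED_NUMERIC.fullmatch(literal))
```

With these assumptions, `5` and `|now|` are accepted as operands while a bare `now` is rejected, so it would have to be written as `|now|`.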