[BUG] comma-operator is irrelevant when parsing template-argument-list #103
Comments
Ah there's another problem in your parsing document.
if (auto e = expression(false)) { // disallow unparenthesized relational comparisons in template args
    term.arg = std::move(e);
}
else if (auto i = id_expression()) {
    term.arg = std::move(i);
}
The |
Any example of code that will suffer from that implementation? |
|
Since it took me a while to spot the ambiguity here (at least partially because I was reading temp_or_object as temporary instead of template), just wanted to confirm that these are the possible interpretations and what parens would be required to disambiguate them:
|
Ah, now I get it. |
I am thinking about simplifying this. Maybe it would be good to introduce a rule that operators need to be separated with whitespace and if you want to write a template you need to put
a < b; // correct form of calling < operator
a< b; // error
a<b; // error
temp<int> i1; // ok
temp<
int
> i2; // ok
temp <int> i3; // error
I thought about it as I was confused reading your example and my second thought was that we shall support syntax that can be easily interpreted by humans and by compilers. So, maybe we can limit the syntax to only acceptable forms and make it clear what we are looking at. |
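One possible reading of the whitespace rule proposed above can be sketched as a tiny classifier. This is only an illustration, not cppfront code: the names `LessKind`, `is_space`, and `classify_less` are hypothetical, and the rule itself (no whitespace before `<` means a template-argument-list opener, whitespace on both sides means the operator, anything else is rejected) is an assumption inferred from the listed examples.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical classification of a '<' token, based only on adjacent whitespace.
enum class LessKind { Operator, TemplateOpen, Error };

constexpr bool is_space(char ch) {
    return ch == ' ' || ch == '\t' || ch == '\n';
}

// Classify the '<' found at index pos in src (assumed rule, see lead-in).
constexpr LessKind classify_less(std::string_view src, std::size_t pos) {
    const bool space_before = pos > 0 && is_space(src[pos - 1]);
    const bool space_after  = pos + 1 < src.size() && is_space(src[pos + 1]);
    if (!space_before) return LessKind::TemplateOpen;  // temp<int>, temp<\nint\n>
    return space_after ? LessKind::Operator            // a < b
                       : LessKind::Error;              // temp <int>
}

// Checked at compile time (requires C++17 for constexpr std::string_view).
static_assert(classify_less("a < b;", 2) == LessKind::Operator, "spaced comparison");
static_assert(classify_less("temp<int> i1;", 4) == LessKind::TemplateOpen, "tight template");
static_assert(classify_less("temp<\nint\n> i2;", 4) == LessKind::TemplateOpen, "multi-line template");
static_assert(classify_less("temp <int> i3;", 5) == LessKind::Error, "space only before");
```

Under this reading, `a< b;` and `a<b;` are classified as template openers, so they would be rejected later for lacking a closing `>` rather than at the `<` itself.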
That may be a valid engineering solution. I don't care one way or another. But making C++ white space sensitive would be received as a provocation. You'd set off a never-ending amount of bickering. |
Well, it is all about having fewer things to learn, and in most of the coding guidelines we have such rules already:
For sure that is something that will increase the readability of the code and will make writing tools easier as there will be fewer ambiguity cases. That will save us from the code like:
return a*pa**ppa**; // 8
which would be forbidden in favor of
return a * pa* * ppa**; // 8
and I would love to avoid that kind of ambiguity and puzzle-like code. |
Previously C++ had some whitespace sensitivity requirements, namely that Personally I prefer having whitespace around binary operators, and not having whitespace between a unary operator and its operand. I also prefer not having whitespace between |
Please don't give prefix, infix and postfix operators whitespace-sensitive semantics. Look at the SPECS project from the 90s, which is a context-free reskin of C++. Start there and move it forward to include modern features. Will have to reimagine the declarator syntax, cast-expression, etc. Don't expect it to resemble C++. Can't just wave your hands over C++ and declare it to be context-free, which is what cppfront does currently. I don't think a context-free reskin of C++ is very useful. It presents a big training barrier with almost no benefit. OTOH, nailing down the semantics of the parameter passing directives would be super useful. Designing memory safety types ( |
I think I'm leaning that way as well.
I disagree on the "almost no benefit" here based on how difficult parsing is here. Now, I'm not generally a tool writer, though in a past life I was primary author on some language parsers, so I feel the pain of tool writers having to create context-sensitive parsers. If this lowers the burdens for tool writers and also makes it easier for people to understand, then it's a plus. I have learned multiple programming languages on my own over the last 40 years, I've been writing C++ professionally for 25+ years, have been following standards processes and language development, attending C++ conferences for more than a decade, and it still took me a relatively long time to see the ambiguity in Having said that, I do appreciate the discussion and alternate points of view, so keep it coming. |
It doesn't really matter for compiler writers. C++ is hard to translate for different reasons. I'm not saying no value, but compared to parameter directives or memory safety...? |
I mean that's already the case for cppfront. Declarations use a colon and the type goes after the identifier, the recommended way to cast is to use
Per his comments on the design of the language, that's not Herb's experience (and he should know a thing or two about it 🙂 ), but more importantly not all tool writers are writing compilers. Some are writing refactoring tools. Some are writing intellisense-like plugins. Some are writing simple syntax highlighters. Some are writing static analyzers. None of them have to translate C++, but still are impeded by its syntax. |
I do agree though that being context-free won't be the main driver for adoption: better tools are IMO just a side benefit. Care has to be taken to not end up more verbose than C++, for example (except in cases where the benefit is very clearly obvious to the user). I think SPECS failed in that respect. Swift and Kotlin had that easy: Java and Objective-C were super cumbersome to start with, and just the ability to call existing libraries with a modern syntax was a relief. (The latter, btw, I think is crucial for adoption too: using C++ libraries within cppfront should be seamless, should feel and look like cppfront, and should feel like an improvement). |
I agree with the general sentiment that required whitespace is a no-go. Many people have different code styles, and even if it may raise eyebrows, requiring that everyone stick to one general style goes a bit outside the realm of a formal language standard (to me, at least). |
I am looking at this from the perspective of the goal of making the language 10x safer. If one of the steps is to require extra space (that is already required by coding guidelines and good practices) I think that this is worth the price. |
I think making progress in https://github.com/hsutter/cppfront#2017-reflection-generation-and-metaclasses is also important for deciding this. Because if the result of that development is that we end up with |
I like the idea that metaclasses could be used to prototype new features/syntax. I've no idea if it could handle the cppfront syntax or not though. Seeing as Circle has already cracked running arbitrary code at compile time, a programmable compiler seems like the obvious next step. I'd be interested to know just how much effort that would take and how much it interests Sean. |
Metaclasses have nothing to do with the grammar. Whitespaces have nothing to do with safety. The grammar is broken. No reason to involve these other issues. |
The choice of Every other modern popular language uses |
I was not clear about the connection between additional spaces and safety. Spaces do not guarantee safety. My point was that e.g. AUTOSAR C++14 coding guidelines or MISRA were created for safety-related systems and they introduce rules that reduce ambiguity and increase readability for developers. The intention is to make it easy to reason about the code. While I was looking for some examples of coding guidelines about that topic, I found that they (at least the code examples I have found) are pretty consistent in using spaces before and after binary operators. |
The grammar is context-free but not LR(1). For a detailed discussion see issue 50. Basically the grammar says that if it can be parsed as a template expression, it is a template expression. If you don't want that, you have to add parentheses. I have written a parsing expression grammar for C++2 that you can use with a parser generator, thus it is certainly context-free. The only sad thing is that the grammar is nearly, but not quite, LR(1). If you are willing to use a slightly different syntax for templates (e.g., |
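The "if it can be parsed as a template expression, it is one" behavior is the ordered-choice semantics of a PEG: alternatives are tried in order and the first match wins. A minimal sketch of that idea follows. It is not the grammar from issue 50 and not cppfront's parser; `Parse`, `match_template_id`, and `classify` are hypothetical names, and the toy lexer assumes single-letter identifiers with every non-space character being one token.

```cpp
#include <cstddef>
#include <string_view>

enum class Parse { TemplateId, Expression };

constexpr bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// Toy lexer step: advance past spaces; each non-space character is one token.
constexpr std::size_t skip_spaces(std::string_view s, std::size_t i) {
    while (i < s.size() && s[i] == ' ') ++i;
    return i;
}

// Alternative 1: name '<' name (',' name)* '>'
constexpr bool match_template_id(std::string_view s) {
    std::size_t i = skip_spaces(s, 0);
    if (i >= s.size() || !is_letter(s[i])) return false;
    i = skip_spaces(s, i + 1);
    if (i >= s.size() || s[i] != '<') return false;
    i = skip_spaces(s, i + 1);
    while (true) {
        if (i >= s.size() || !is_letter(s[i])) return false;
        i = skip_spaces(s, i + 1);
        if (i < s.size() && s[i] == ',') { i = skip_spaces(s, i + 1); continue; }
        break;
    }
    return i < s.size() && s[i] == '>';
}

// Ordered choice: try the template-id alternative first, otherwise fall back.
constexpr Parse classify(std::string_view s) {
    return match_template_id(s) ? Parse::TemplateId : Parse::Expression;
}

// Checked at compile time (requires C++17 for constexpr std::string_view).
static_assert(classify("t<a, b>(c)") == Parse::TemplateId, "first alternative wins");
static_assert(classify("a < b") == Parse::Expression, "no closing '>', falls back");
static_assert(classify("(t < a), (b > c)") == Parse::Expression, "parentheses opt out");
```

The last assertion mirrors the rule described above: a reader who wants the comparison reading has to parenthesize, because the unparenthesized form is claimed by the first alternative.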
Can't believe that's the line this project is taking. |
I would have preferred something else, too. While the behavior is technically speaking well defined (due to the semantics of the PEG grammar), I consider it surprising behavior. And it requires quite some teaching to explain to programmers why they have to add parentheses if they do not want that behavior. Of course the problem is the alternatives. The traditional C++ approach is context dependent, e.g., the parser recognizes that an identifier is a template type. Which is a nightmare and prevents order-independent parsing. Or we change the template syntax to something unambiguous (like, e.g., |
I just made the jump in Circle to https://twitter.com/seanbax/status/1592153661438054406 It's a bad choice to keep C++'s original syntax while moving forward. |
Catching up: As Sean knows I'm open to changing the template syntax (when Sean saw this in 2021, I was using @seanbaxter originally asked:
In Cpp2 this is unambiguous, and instantiating I understand the objection that for
Right, I've given that a lot of thought. Super briefly: There's already a parallel today between
Yup. In Cpp2 the deliberate choice is that there is no ambuguity (<-- that was a typo, I was trying to write "ambiguity," but it was such a great typo that now I'm intentionally leaving it there).
Bingo. Again, thanks for the input, and for understanding that I'm going to agree to disagree on this one and continue the experiment on this grammatical path for now -- but, as always, staying open to more data and experience. |
I've also tweaked the design note's wording to make it clear that the comma operator issue is a visual ambiguity issue. Today's comma operator is still a factor IMO because it arises as a visual ambiguity today in @neumannt's |
While I don't have a proposal for this, I have this thought. VHDL is a language that also has "compile-time code" and "runtime code", and IMO does it well because it was a design goal (rather than a fortunate accident like in C++). To wit:
for i in 0 to 7 loop -- this is executed at compile time
    my_bus(i + 1) <= my_bus(i); -- this is emitted 8 times
end loop;
Caveat for us: Because it describes electronic logic, the runtime part of VHDL is declarative, while the compile-time part is procedural (Pascal-like). Both parts of the language look different. I think that distinction eased the design a lot. C++ is (apparently) going in the opposite direction: from having an OO runtime language and a mix of functional/declarative compile-time language (function and class declarations, and templates), towards making both languages the same. I think it is the right direction for C++. But it's also harder. This way of looking at it might help us support templates seamlessly in a unified compile-time / runtime syntax. The trick would be to first bring what C++ templates can do to the runtime language:
Off the top of my head, I don't know if there are more features that would need to be "ported"? |
https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing#a-first-match-wins-there-are-no-comma-expressions-and-a-relational-comparison-in-a-template-argument-must-be-parenthesized
In C++, b,c in a template-argument-list cannot be a comma-expression because the grammar sets the template-argument production to conditional-expression, which is higher precedence than expression, which is where the comma is parsed. The notes concerning comma-expression are irrelevant.
More generally, I don't see how context insensitivity is achieved. What is changed, other than requiring parens around comparisons in the template-argument-list?
f(temp_or_object<a, b>(c));
Is this disallowed Cpp2 syntax? It's definitely ambiguous.
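For readers who want to see both points in today's C++ (not Cpp2), here is a small sketch. The namespaces `as_template` and `as_object`, the helper `count_args`, and the return values of `f` are made up for illustration. The same token sequence `f(temp_or_object<a, b>(c))` compiles under both readings, depending only on what name lookup finds for `temp_or_object`, and an unparenthesized comma in a template-argument-list always separates arguments.

```cpp
// Reading 1: temp_or_object names a class template, so '<' opens a
// template-argument-list and f receives one argument.
namespace as_template {
    struct a {};
    struct b {};
    template <typename T, typename U>
    struct temp_or_object {
        constexpr explicit temp_or_object(int) {}
    };
    constexpr int f(temp_or_object<a, b>) { return 1; }  // one parameter
    constexpr int c = 4;
    constexpr int call() { return f(temp_or_object<a, b>(c)); }
}

// Reading 2: temp_or_object names a variable, so '<' and '>' are relational
// operators and f receives two arguments: (temp_or_object < a) and (b > (c)).
namespace as_object {
    constexpr int temp_or_object = 1, a = 2, b = 3, c = 4;
    constexpr int f(bool, bool) { return 2; }            // two parameters
    constexpr int call() { return f(temp_or_object<a, b>(c)); }
}

static_assert(as_template::call() == 1, "parsed as a template-id");
static_assert(as_object::call() == 2, "parsed as two comparisons");

// Comma in a template-argument-list: a separator unless parenthesized.
template <int... Ns>
struct count_args { static constexpr int value = sizeof...(Ns); };

static_assert(count_args<1, 2>::value == 2, "two template arguments");
static_assert(count_args<(1, 2)>::value == 1, "one argument: a comma-expression");
```

Some compilers may warn about the unused left operand in `(1, 2)`; that is expected for this demonstration.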