[SUGGESTION] Quote string literals with backticks #289
Comments
Here is an example of a multi-line string literal:
So this code:
is equivalent to this Cpp1 code:
and this code:
is equivalent to this Cpp1 code:
So the parsing is that two consecutive backticks inside a backtick string are a literal backtick instead of stopping one string and starting another?
Yes, that's it.
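As a rough illustration of the rule confirmed above, here is a minimal C++ sketch that decodes one backtick-quoted literal: two consecutive backticks become one literal backtick, and a single backtick closes the literal. The function name decode_backtick_literal is hypothetical and is not part of cppfront; the sketch assumes the whole input is exactly one literal and ignores escape sequences and captures.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not a cppfront API): decodes a single backtick-quoted
// literal. Returns std::nullopt if the input is not exactly one well-formed literal.
std::optional<std::string> decode_backtick_literal(std::string_view src) {
    if (src.size() < 2 || src.front() != '`') {
        return std::nullopt;
    }
    std::string out;
    std::size_t i = 1;  // skip the opening backtick
    while (i < src.size()) {
        if (src[i] != '`') {
            out += src[i];
            ++i;
        } else if (i + 1 < src.size() && src[i + 1] == '`') {
            out += '`';  // doubled backtick: one literal backtick
            i += 2;
        } else if (i + 1 == src.size()) {
            return out;  // single backtick at the very end closes the literal
        } else {
            return std::nullopt;  // text after the closing backtick
        }
    }
    return std::nullopt;  // ran out of input before a closing backtick
}
```

With this sketch, `User``-``name` decodes to the single value User`-`name, whereas `User` `-` `name` would be three separate literals, each decoded on its own.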
Thanks, I do appreciate the interest and the thoughtful ideas here. Sorry to decline, but I'm not going to pursue this direction for now. Three things:
Sorry to say no (or at least "not yet") to this, but again thank you!
This suggestion is a rework from this issue. The syntax of my suggestion is not important, and it can be anything that fits into C++2. I know I must keep things simple and obvious in my suggestion. To do this, I should minimize concepts and keep the syntax familiar to programmers as much as I can. We can have the following string literals:
In the future, C++2 may introduce more string literals as well. Does it resemble this video about C++1 initialization? Maybe a little. But one kind of string literal is enough, because it can be the fundamental building block for producing other string literals. In other words, C++2 can internally join multiple raw string literals, escape sequences, captures and other language expressions to produce the string value.
I want to suggest a radical change to string literals by going back to the beginning: how string literals are written. Three common symbols, double quote ", single quote ' and backtick `, are suitable for quoting string literals. If we look at how sentences are written in English, it is obvious that double quote " and single quote ' are used more often than backtick `. An analysis is also available here, which is interesting because it shows that double quote " is used more frequently than single quote '. Therefore backtick ` is an appropriate symbol to be the only escape character in string literals, because it is not a common punctuation mark in English or in most other languages. It was mainly designed for typewriters, as described here, and maybe that is why markup languages such as Markdown use backtick ` to create inline code inside normal text. It should also be noted that JavaScript uses backtick ` for template literals, and D and Go use it for raw string literals. So backtick ` should be the only character that has special behaviour in string literals.

String literals will be quoted inside backticks `, and they don't interpret escape sequences and captures unless we put them inside a nested backtick `. Captures may need extra parentheses for expressions, or when escape sequences are beside them. For example:

To write a backtick ` inside a string literal, we can write double backticks ``. String literals placed side-by-side are concatenated, but there must be whitespace between them; otherwise they are treated as a single string literal that contains double backticks ``. For example:

In a nutshell, `User``-``name` is not equal to `User` `-` `name`.

The goal of my suggestion is to keep it simple to teach and familiar to programmers. That is why I keep the symbol \ for escape sequences such as \n, although I could remove or change it in my suggestion.

String Expression
As you can see, the syntax is similar to current C++2. Programmers put nested backtick expressions inside string literals, although it can be viewed a little bit different that I'll explain in the next paragraph.
Consider the string literal `Name: `(user)$\n`Age: `age$``. Let's call it a string expression. It is a sequence of the following elements, in this order, and it has to both start and end with a string literal:
`Name: `
(user)$
\n
`Age: `
age$
``
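The decomposition above can be sketched in C++ as a small splitter, shown only as an illustration under stated assumptions. The names Element, ElementKind and split_string_expression are hypothetical, and the grammar is my reading of the proposal: a string expression alternates backtick-quoted literals with runs of captures (name$ or (expr)$) and backslash escapes, and it must start and end with a literal. Inside a literal, a doubled backtick is kept as part of that literal.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

enum class ElementKind { Literal, Capture, Escape };

struct Element {
    ElementKind kind;
    std::string text;  // source text of the element, delimiters included
};

// Hypothetical sketch: splits one complete string expression into its elements.
// Throws std::invalid_argument if the input does not fit the assumed grammar.
std::vector<Element> split_string_expression(std::string_view src) {
    std::vector<Element> out;
    const std::size_t n = src.size();
    std::size_t i = 0;
    while (true) {
        // A string literal: opening backtick, body, closing backtick.
        if (i >= n || src[i] != '`') throw std::invalid_argument("expected a string literal");
        std::size_t j = i + 1;
        while (true) {
            if (j >= n) throw std::invalid_argument("unterminated string literal");
            if (src[j] != '`') { ++j; continue; }
            if (j + 1 < n && src[j + 1] == '`') { j += 2; continue; }  // doubled backtick
            break;  // single backtick closes the literal
        }
        out.push_back({ElementKind::Literal, std::string(src.substr(i, j - i + 1))});
        i = j + 1;
        if (i == n) return out;  // the expression ended with a literal

        // Captures and escapes, up to the next opening backtick.
        while (i < n && src[i] != '`') {
            const std::size_t start = i;
            const unsigned char c = static_cast<unsigned char>(src[i]);
            if (c == '\\') {
                if (i + 1 >= n) throw std::invalid_argument("dangling backslash");
                i += 2;  // backslash plus one character, e.g. \n
                if (i < n && src[i] == '{') {  // braced form, e.g. \x{6e}
                    const std::size_t close = src.find('}', i);
                    if (close == std::string_view::npos) throw std::invalid_argument("unterminated escape");
                    i = close + 1;
                }
                out.push_back({ElementKind::Escape, std::string(src.substr(start, i - start))});
            } else if (c == '(') {
                int depth = 0;
                do {  // find the matching closing parenthesis
                    if (src[i] == '(') ++depth;
                    else if (src[i] == ')') --depth;
                    ++i;
                } while (i < n && depth > 0);
                if (depth != 0 || i >= n || src[i] != '$') throw std::invalid_argument("malformed capture");
                ++i;  // consume '$'
                out.push_back({ElementKind::Capture, std::string(src.substr(start, i - start))});
            } else if (std::isalpha(c) || c == '_') {
                while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
                if (i >= n || src[i] != '$') throw std::invalid_argument("malformed capture");
                ++i;  // consume '$'
                out.push_back({ElementKind::Capture, std::string(src.substr(start, i - start))});
            } else {
                throw std::invalid_argument("unexpected character in string expression");
            }
        }
    }
}
```

Calling split_string_expression on the example yields the six elements listed above, in the same order. The trailing `` is needed because the sketch, like the proposal, requires the expression to end with a string literal.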
String expressions can have one of the encoding prefixes L, u8, u or U, and they can have suffixes:

But that's not enough without character literals.
A string literal is a sequence of character literals, which is why I also have to consider character literals. As before, character literals can have escape sequences, but the notation is c`...`. For example:
'n' becomes c`n`
'\n' becomes c`\n`
'\x{6e}' becomes c`\x{6e}`
'' doesn't have any meaning in C++2, because character literals cannot be empty, and c`` is the backtick ` itself.

Character literals placed side-by-side are not concatenated. Multi-character literals must have the prefix b, which means 'ABCD' becomes b`ABCD`: because multi-character literals have a different underlying type, they should be visually different. For example:

We could use other notations for character literals, but my recommended notation c`...` has two benefits:
c`` is simply the backtick ` itself (similar to double backticks inside string literals).
` is enough for both string literals and character literals, and if C++2 uses underscore _ (or backtick `) instead of single quote ' as the digit separator, e.g. 1'500'444 becomes 1_500_444 (similar to Python) (or 1`500`444), then it is possible to reserve double quotes " and single quotes ' for future use, either as new operators or new literals.

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?
No.
Will your feature suggestion automate or eliminate X% of current C++ guidance literature?
Yes. It will do so in the following ways:
\ or ()$ or etc.) for each feature in string literals, because in addition to escape sequences and captures, other new expressions can be added later. The point is that all of them are available with just a single backtick `, instead of introducing new escape characters inside string literals such as \ or ()$. A single backtick ` may end the string literal, may be a backtick itself (with double backticks ``), and may introduce an escape sequence (`\...`), a capture (`...$`) or a combination of them. In addition, more expressions can be allowed besides escape sequences and captures.

Will your feature suggestion remove unnecessary syntax or concepts?
Yes, my suggestion is a little verbose. Backtick ` will be used for quoting both string literals and character literals. Also, if we use underscore _ or backtick ` as the digit separator (e.g. 1_500_444 or 1`500`444), then C++2 can use both double quotes " and single quotes ' either as new operators or new literals. For example:

Also, the escape sequences \' and \" for quotes are not needed anymore, and the escape sequence \` is not needed for the backtick.