Description
Currently, the $R prefix is used to write interpolated raw string literals in C++2, e.g. $R"(text)".
Template String Literals
Before I explain my suggestion, I should mention that C++1 has no empty character literal, so '' is a syntax error in C++1. Instead of leaving it a syntax error, I suggest making '' the start and end of interpolated raw string literals:
// := $R"(This is an (interpolated)$ (raw)$ string literal.)";
x1 := ''This is an (interpolated)$ (raw)$ string literal.'';
// := $R"(Escape sequences such as \n does not work.)";
x2 := ''Escape sequences such as \n does not work.'';
// := $R"(Also a single quote ' doesn't close it.)";
x3 := ''Also a single quote ' doesn't close it.'';
To write several consecutive single quotes ' inside an interpolated raw string literal, we can start and end it with more ' characters:
x1 := ''This ' doesn't close it.'';
x2 := '''This '' doesn't close it.''';
x3 := ''''This ''' doesn't close it.'''';
x4 := '''''This '''' doesn't close it.''''';
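The delimiter rule in these examples can be sketched as a small scanner in today's C++. This is only an illustration of how I read the rule, not part of cppfront: the names ScannedLiteral and scan_template_literal are hypothetical, the sketch assumes the opening delimiter is the longest run of two or more ' characters and the literal ends at the first later run of the same length, and it does not handle (expr)$ interpolation at all.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Result of scanning one proposed ''...'' literal (hypothetical helper type).
struct ScannedLiteral {
    std::string content;  // text between the delimiters, taken verbatim
    std::size_t length;   // number of source characters consumed
};

// Assumed rule: the opening delimiter is the maximal run of n >= 2 single
// quotes; the literal ends at the first following run of n single quotes.
std::optional<ScannedLiteral> scan_template_literal(std::string_view src) {
    std::size_t n = 0;
    while (n < src.size() && src[n] == '\'') ++n;
    if (n < 2) return std::nullopt;  // does not start with '' or more

    std::size_t run = 0;  // length of the current run of quotes
    for (std::size_t i = n; i < src.size(); ++i) {
        run = (src[i] == '\'') ? run + 1 : 0;
        if (run == n) {
            std::size_t end = i + 1;  // one past the closing delimiter
            return ScannedLiteral{std::string(src.substr(n, end - 2 * n)), end};
        }
    }
    return std::nullopt;  // no closing delimiter found
}

// Small usage example.
void check_scan_examples() {
    auto one = scan_template_literal("''This ' doesn't close it.''");
    assert(one && one->content == "This ' doesn't close it.");
    assert(!scan_template_literal("''''"));  // opening run only, no content
}
```

Because the opening run is taken as long as possible, a sequence made only of ' characters never closes, which is one way to read the "cannot be empty" rule.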
NOTE 1: If they are placed side by side, they will be concatenated with normal string literals (because both are interpolated). That is the other reason why I suggested '' instead of """ (as in languages such as C#): so that they are easily distinguishable when placed side by side. To have ' as a character at the beginning or end of the content of an interpolated raw string literal, we concatenate it with a normal string literal, instead of allowing optional separators such as new-lines (as in C#) or white-space (as in Markdown) between the content and the quotes. More examples:
// := $R"(This is an )" + $R"(interpolated raw)" + $R"( string literal)";
x1 := ''This is an '' + ''interpolated raw'' + '' string literal'';
// It's a compiler error in current C++2.
// I don't know if it's acceptable for $R"(...)" to be concatenated together.
// := $R"(This is an )"$R"(interpolated raw)"$R"( string literal)";
x2 := ''This is an ''''interpolated raw'''' string literal'';
// ''This is an '' + ''interpolated raw'' + '' string literal''
// NOTE 1
// := $R"('This text is quoted inside ' characters.')";
x3 := "'"''This text is quoted inside ' characters.''"'";
// "'" + ''This text is quoted inside ' characters.'' + "'"
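For comparison, today's C++ (C++1) already joins adjacent string literals at compile time, including a mix of ordinary and raw literals, which is the behaviour the side-by-side examples above build on. A minimal sketch in standard C++, without interpolation; the function names are only illustrative:

```cpp
#include <cassert>
#include <string>

// Adjacent literals are joined by the compiler into one literal.
// The ordinary literal "'" supplies the leading and trailing quote.
std::string quoted_text() {
    return "'" R"(This text is quoted inside ' characters.)" "'";
}

// Three raw literals placed side by side form a single string.
std::string joined_text() {
    return R"(This is an )" R"(interpolated raw)" R"( string literal)";
}

// Small usage example.
void check_concatenation_examples() {
    assert(quoted_text() == "'This text is quoted inside ' characters.'");
    assert(joined_text() == "This is an interpolated raw string literal");
}
```

Note the white-space between the literals: it keeps a following R from being read as a literal suffix on the preceding string.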
Finally, I have to mention that interpolated raw string literals are not really raw. Maybe we should define a new term for them, something like Interpolated Non-escape-sequenced String Literals. We can simply call them Template String Literals.
Just like character literals, interpolated raw string literals cannot be empty, because multiple ' characters (such as '' or ''') always form the beginning of a string literal:
// It starts with '''', so it must have content and end with ''''.
x0 := ''''; // Compiler ERROR!
x1 := ""; // OK.
My suggestion in a nutshell
- These two syntaxes will be for interpolated string literals (non-raw or raw) in C++2:
// Interpolated Non-raw String Literal
x1 := "text";
// Interpolated Raw String Literal
x2 := ''text'';
// or '''text'''
// or more ''''...
- In addition, this syntax will be for non-interpolated string literals (raw only) in C++2:
// Non-Interpolated Raw String Literal
x3 := R"(text)";
// or R"x(text)x"
// or more R"xx...
Why do I suggest this change?
R"(text)" is a powerful raw string literal, but most of the time we just want to disable escape sequences and be able to simply write single quotes ' and double quotes " inside a string literal. We can simply call such literals Template String Literals. Using '' is more readable and more convenient, with less typing, than $R"( to start an interpolated raw string literal.
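To make "disable escape sequences" concrete, here is how the existing C++1 raw literal R"(...)" behaves; the proposed ''...'' form is meant to give the same treatment of backslashes and quotes with less typing. A minimal sketch in standard C++ with illustrative function names:

```cpp
#include <cassert>
#include <string>

// In a raw literal, \n stays as two characters: a backslash and an n.
std::string raw_text() { return R"(a\nb)"; }

// In an ordinary literal, \n is a single new-line character.
std::string escaped_text() { return "a\nb"; }

// Quotes need no escaping inside a raw literal.
std::string raw_quotes() { return R"(say "hi" and 'bye')"; }

// Small usage example.
void check_raw_examples() {
    assert(raw_text().size() == 4);      // a, backslash, n, b
    assert(escaped_text().size() == 3);  // a, new-line, b
    assert(raw_quotes() == "say \"hi\" and 'bye'");
}
```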
Also, '' is a syntax error in C++1 because C++ has no empty character literal, so we can use this otherwise unused syntax. Programmers won't ask why '' doesn't work (because someone may think it should be a null character); they simply learn that '' is the start and end of an interpolated raw (non-escape-sequenced) string literal.
We can categorize string literals so that '' is visually similar to ": both ''text'' (without escape sequences) and "text" (with escape sequences) are interpolated string literals because they have no prefix, but R"(...)" is a non-interpolated, non-escape-sequenced (real raw) string literal because it is prefixed with R and has parentheses for more complex texts.
Is there any experience, data or working implementation available?
My suggestion is similar to raw string literals in the C# programming language, but C# 11 uses at least three double quotes to start and end raw string literals, e.g. """A raw string literal in C# 11""". Unlike C#, my suggestion does not ignore the first and last new-lines of the content, because C++2 can concatenate side-by-side interpolated string literals (see NOTE 1 and the example code), whereas in C# we have to put the content on a separate line if we want it to start with ". Apart from that, everything is the same.
My suggestion is also similar to inline code in the Markdown language, which uses at least one backtick ` instead of two single quotes ''. Apart from that, everything is the same.
I should also mention that Python has triple single quotes ''' and triple double quotes """ for multi-line string literals.
Experience from C#, Python and Markdown can be reviewed.