Description
Full name of submitter (unless configured in github; will be published with the issue): Hubert Tong
Reference (section label): lex.pptoken, lex.phases, cpp.cond
Link to reflector thread (if any): N/A
Issue description:
The specification for `__has_include` is unclear on how any potential tokenization as a header-name occurs relative to has-include-expression syntax matching and macro expansion.
Consider: https://godbolt.org/z/ThK75Evb1
#define EMPTY
#define IGNORE(X)
#define stdio nosuch
#if __has_include(EMPTY <stdio.h>)
#error Hello! Header name formed!
#endif
Given that, before macro expansion, it is potentially unknown whether `<stdio.h>` is within a has-include-expression (e.g., if `EMPTY` expands to `<iostream>) IGNORE(`), it is unclear whether https://wg21.link/lex.pptoken#4.3.2 specifies that `<stdio.h>` in the above forms a header-name. If we take phase 3 of [lex.phases]/1 as separate from phase 4, then a potential interpretation is that `<stdio.h>` in the above is tokenized as a header-name regardless of whether it is part of a has-include-expression after macro expansion because it appears after `__has_include(` and before a paired `)`. It is also a valid interpretation that `<stdio.h>` is not tokenized as a header-name because it is not immediately after `__has_include(`.
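For contrast, the two forms whose treatment does not appear to be in question can be sketched as follows. This is an illustrative sketch only: `HDR`, `DIRECT_FORM`, `MACRO_FORM`, `direct_form`, and `macro_form` are hypothetical names, and it assumes a hosted implementation where `<stdio.h>` is found and no header named `nosuch.h` exists. The `#define stdio nosuch` probe is the same one used in the example above.

```cpp
// Probe macro: if <stdio.h> is NOT lexed as a single header-name, the
// identifier stdio is macro-replaced and the lookup targets <nosuch.h>.
#define stdio nosuch

// Form 1: the header-name immediately follows "__has_include(".
// <stdio.h> is lexed as one header-name token, so stdio is never expanded.
#if __has_include(<stdio.h>)
#define DIRECT_FORM 1
#else
#define DIRECT_FORM 0
#endif

// Form 2: the operand comes from a macro. In a #define, < stdio . h > are
// ordinary preprocessing tokens; stdio is replaced when HDR is expanded,
// so the lookup is for <nosuch.h>.
#define HDR <stdio.h>
#if __has_include(HDR)
#define MACRO_FORM 1
#else
#define MACRO_FORM 0
#endif

// Clean up so that later #include directives are unaffected by the probe.
#undef HDR
#undef stdio

constexpr int direct_form = DIRECT_FORM; // header-name formed in phase 3
constexpr int macro_form = MACRO_FORM;   // header-name-tokens after expansion
```

The case in this issue, `__has_include(EMPTY <stdio.h>)`, sits between these two: the `<stdio.h>` characters appear literally in the directive, but not immediately after `__has_include(`.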
There is implementation divergence: Clang and both MSVC preprocessor implementations form a header-name; GCC and EDG do not. It actually seems that Clang and MSVC do not maintain proper separation between phase 3 and phase 4.
Suggested resolution:
Replace https://wg21.link/lex.pptoken#4.3.2 to say:
- in phase 3 of translation, immediately after a preprocessing token sequence of `__has_include` followed immediately by `(`.
See also https://wg21.link/CWG2190 (which may be NAD).