Description
There is an annoying issue called "include after import": https://clang.llvm.org/docs/StandardCPlusPlusModules.html#including-headers-after-import-is-problematic. But even after we fix that, this style would still be problematic, since it may increase compilation time (or the size of BMIs, if they exist). This is fundamental to clang, and it looks hard to fix completely and perfectly on the compiler side.
And I feel this is straightforward for people to understand:
module;
import std;
#include <vector>
#include <string>
export module M;
...
may have larger BMI and compile slower than:
module;
import std;
export module M;
...
While the example looks silly, it is actually pretty common since the standard library can be used in other headers.
In a private meeting, an MSVC developer mentioned that MSVC has (or plans to have?) an extension to skip the standard headers if the std module is imported.
I feel this sounds good and can be pretty helpful to end users.
For the implementation, I don't have a complete design yet. The immediate ideas I have in mind are:
- Implement this completely on the library side.
- Implement this completely on the compiler side.
- Implement this in both the compiler and the library.
Implementing this in the library may require the library to provide an additional header that defines all the controlling macros, so that the user can (manually) import std like this:
import std;
#include <controlling_macros_for_std_headers>
#include "..."
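A minimal single-file sketch of how such a controlling header could work, assuming it simply predefines the include guards of the headers to be skipped. Everything named MYLIB_* and the mylib namespace are invented for illustration; they are not real libc++ macros, and the simulated header body is inlined so the example compiles on its own.

```cpp
#include <cassert>

// Hypothetical stand-in for <controlling_macros_for_std_headers>:
// it predefines the include guard of each header that should be skipped.
// MYLIB_VECTOR_H is an invented guard name, not a real libc++ macro.
#define MYLIB_VECTOR_H

// Simulated contents of a library header "mylib_vector.h", inlined here
// so that the example is a single translation unit.
#ifndef MYLIB_VECTOR_H
#define MYLIB_VECTOR_H
#define MYLIB_VECTOR_BODY_PARSED 1
namespace mylib { template <class T> struct vector {}; }
#endif

// Reports whether the simulated header body was processed.
constexpr bool vector_body_parsed() {
#ifdef MYLIB_VECTOR_BODY_PARSED
  return true;
#else
  return false;
#endif
}

// The guard was predefined, so the whole header body is skipped.
static_assert(!vector_body_parsed(), "header body should be skipped");
```

Note that this skips everything inside the guard, including any macros the header would normally define.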
Implementing this in the compiler may require hardcoding the filenames of all the standard headers and skipping such headers instead of entering them.
Implementing this in both the compiler and the library means the library provides such a header and the compiler inserts it automatically.
This idea is still in an early phase. Any comments are welcome.
Activity
llvmbot commented on Feb 5, 2024
@llvm/issue-subscribers-clang-modules
Author: Chuanqi Xu (ChuanqiXu9)
ChuanqiXu9 commented on Feb 5, 2024
I think option 1 may be best, since it can be extended to other libraries. I documented it in #80687.
mordante commented on Feb 18, 2024
I typed a long reply last week, but it seems I forgot to press comment :-/
I think there might be something possible on the library side. However, the tricky part would be the macros. For example, feature-test macros, errno, and assert all require proper macro support. So I think #include <controlling_macros_for_std_headers> is not feasible, since it might change the observable behavior.
Maybe it would be possible to add a special pragma to a header to tell the compiler to stop processing. Something along the lines of
In that case the compiler can stop processing when the module std has been imported.
Stopping at this point means it's still in the #ifndef _LIBCPP_FOO_H, so the pre-processor needs to assume there is a matching #endif. This could be tricky when the file is experimental and has an extra #ifdef, like __chrono/tzdb.h.
If multiple nested #if makes things hard for the compiler, we could exclude these files.
Would this be feasible to implement in the compiler?
(I've not consulted the other libc++ developers on their thoughts.)
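To make the nested-conditional concern concrete, here is a rough sketch of what "finding the matching #endif" involves. It is a naive line scan written for illustration only, not how Clang's preprocessor works: it ignores comments, string literals, and line continuations. The helper names and the EXPERIMENTAL macro are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the directive name of a line ("if", "ifdef", "endif", ...),
// or an empty string if the line is not a preprocessor directive.
std::string directive_name(const std::string& line) {
  std::size_t i = line.find_first_not_of(" \t");
  if (i == std::string::npos || line[i] != '#') return "";
  i = line.find_first_not_of(" \t", i + 1);
  if (i == std::string::npos) return "";
  std::size_t end = i;
  while (end < line.size() && line[end] >= 'a' && line[end] <= 'z') ++end;
  return line.substr(i, end - i);
}

// Given the index of an opening #if/#ifdef/#ifndef line, returns the index
// of its matching #endif, or lines.size() if there is none.
std::size_t matching_endif(const std::vector<std::string>& lines,
                           std::size_t open) {
  int depth = 0;
  for (std::size_t i = open; i < lines.size(); ++i) {
    const std::string d = directive_name(lines[i]);
    if (d == "if" || d == "ifdef" || d == "ifndef") {
      ++depth;  // entering a nested conditional
    } else if (d == "endif" && --depth == 0) {
      return i;  // closed the conditional opened at `open`
    }
  }
  return lines.size();
}

// A header shaped like the case described above: an include guard with an
// extra nested #ifdef inside it.
std::vector<std::string> sample_header() {
  return {
      "#ifndef _LIBCPP_FOO_H",  // index 0: include guard opens
      "#define _LIBCPP_FOO_H",  // index 1
      "#ifdef EXPERIMENTAL",    // index 2: nested conditional opens
      "struct tzdb {};",        // index 3
      "#endif",                 // index 4: closes the nested conditional
      "#endif",                 // index 5: closes the include guard
  };
}
```

Even in this simplified form, the scan has to look at every remaining line of the file, which is why stopping early inside the guard is not free.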
ChuanqiXu9 commented on Feb 19, 2024
I don't understand this. In my mind the controlling_macros_for_std_headers won't contain control macros for <version>, <assert> and <errno>. Will it still produce observable behavior in this way?
Yeah, as you said, it doesn't look easy, because we need to find the last #endif. We may have to preprocess the whole file to find the #endif. Then I think we need a drastic change to the preprocessor to teach it to skip things... I feel this may not be feasible...
mordante commented on Feb 19, 2024
For example
Inside the module users can now use FTM available in <string> and <vector>. So the transformation should retain this observable behaviour. This seems to be very hard to do correctly. Clang 15 does not know which feature-test macros libc++16 provides. So it would need additional logic. Even that logic is error-prone since libc++ might do things differently in a future version, which feels like an unwanted coupling between Clang and libc++.
This still requires Clang to parse the entire file, but the pre-processor will "remove" the implementation when the proper module is imported. If this works it might be made more generic by using something like # !__has_imported(std), which could be used for other modules too, like #!__has_imported(fmt). This would allow other libraries to use the same optimizations as libc++. (Maybe other compiler vendors would like to do something similar.)
This is trivial to do in libc++ and I expect this shouldn't be too hard to implement in Clang. What do you think of this approach?
ChuanqiXu9 commented on Feb 20, 2024
(IIUC, FTM means macro definitions, right?)
Got your point. But this is not related to Clang in my mind. If we choose option 1, libc++ will provide the controlling_macros_for_std_headers header I mentioned, and the users need to add #include <controlling_macros_for_std_headers> explicitly. The compiler is not involved, so we don't need to worry about version conflicts here.
(BTW, maybe we can include headers like in controlling_macros_for_std_headers too, it is not decided.)
On the one hand, it should be easy for the compiler to leak the macro definition <module-name>_module_imported, and then the users can consume it with something like #if defined(<module-name>_module_imported).
On the other hand, however, it requires the BMI to be present before preprocessing. This is an old fixed bug in clang.
That is:
And the command
will complain with something like "failed to find module a". This breaks the design at some level. I feel the preprocessor shouldn't be affected by named modules.
And this is the reason why I prefer this to be a library solution. A pure library solution is more flexible and generalizes to other libraries.
Maybe it is also an idea for the libc++ library to provide a header <std_module_imported> that defines the macro std_module_imported?
I guess you may feel it is not convenient for the users to include an additional header, but I think it is reasonable for users to understand. And users only need this if they want to mix includes and imports.
using namespace std before include causes ambiguous reference error #96423