Support '-fmodule-file-home-is-cwd' for C++ modules. #135147
@llvm/pr-subscribers-clang-modules Author: Michael Park (mpark) Changes
Full diff: https://github.com/llvm/llvm-project/pull/135147.diff 2 Files Affected:
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index a48c05061626a..ae771f8f014e4 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -1493,42 +1493,44 @@ void ASTWriter::WriteControlBlock(Preprocessor &PP, StringRef isysroot) {
unsigned AbbrevCode = Stream.EmitAbbrev(std::move(Abbrev));
RecordData::value_type Record[] = {MODULE_NAME};
Stream.EmitRecordWithBlob(AbbrevCode, Record, WritingModule->Name);
- }
- if (WritingModule && WritingModule->Directory) {
- SmallString<128> BaseDir;
- if (PP.getHeaderSearchInfo().getHeaderSearchOpts().ModuleFileHomeIsCwd) {
- // Use the current working directory as the base path for all inputs.
- auto CWD = FileMgr.getOptionalDirectoryRef(".");
- BaseDir.assign(CWD->getName());
- } else {
- BaseDir.assign(WritingModule->Directory->getName());
- }
- cleanPathForOutput(FileMgr, BaseDir);
-
- // If the home of the module is the current working directory, then we
- // want to pick up the cwd of the build process loading the module, not
- // our cwd, when we load this module.
- if (!PP.getHeaderSearchInfo().getHeaderSearchOpts().ModuleFileHomeIsCwd &&
- (!PP.getHeaderSearchInfo()
- .getHeaderSearchOpts()
- .ModuleMapFileHomeIsCwd ||
- WritingModule->Directory->getName() != ".")) {
- // Module directory.
- auto Abbrev = std::make_shared<BitCodeAbbrev>();
- Abbrev->Add(BitCodeAbbrevOp(MODULE_DIRECTORY));
- Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Directory
- unsigned AbbrevCode = Stream.EmitAbbrev(std::move(Abbrev));
+ auto BaseDir = [&]() -> std::optional<SmallString<128>> {
+ if (PP.getHeaderSearchInfo().getHeaderSearchOpts().ModuleFileHomeIsCwd) {
+ // Use the current working directory as the base path for all inputs.
+ auto CWD = FileMgr.getOptionalDirectoryRef(".");
+ return CWD->getName();
+ }
+ if (WritingModule->Directory) {
+ return WritingModule->Directory->getName();
+ }
+ return std::nullopt;
+ }();
+ if (BaseDir) {
+ cleanPathForOutput(FileMgr, *BaseDir);
+ // If the home of the module is the current working directory, then we
+ // want to pick up the cwd of the build process loading the module, not
+ // our cwd, when we load this module.
+ if (!PP.getHeaderSearchInfo().getHeaderSearchOpts().ModuleFileHomeIsCwd &&
+ (!PP.getHeaderSearchInfo()
+ .getHeaderSearchOpts()
+ .ModuleMapFileHomeIsCwd ||
+ WritingModule->Directory->getName() != ".")) {
+ // Module directory.
+ auto Abbrev = std::make_shared<BitCodeAbbrev>();
+ Abbrev->Add(BitCodeAbbrevOp(MODULE_DIRECTORY));
+ Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Directory
+ unsigned AbbrevCode = Stream.EmitAbbrev(std::move(Abbrev));
+
+ RecordData::value_type Record[] = {MODULE_DIRECTORY};
+ Stream.EmitRecordWithBlob(AbbrevCode, Record, *BaseDir);
+ }
- RecordData::value_type Record[] = {MODULE_DIRECTORY};
- Stream.EmitRecordWithBlob(AbbrevCode, Record, BaseDir);
+ // Write out all other paths relative to the base directory if possible.
+ BaseDirectory.assign(BaseDir->begin(), BaseDir->end());
+ } else if (!isysroot.empty()) {
+ // Write out paths relative to the sysroot if possible.
+ BaseDirectory = std::string(isysroot);
}
-
- // Write out all other paths relative to the base directory if possible.
- BaseDirectory.assign(BaseDir.begin(), BaseDir.end());
- } else if (!isysroot.empty()) {
- // Write out paths relative to the sysroot if possible.
- BaseDirectory = std::string(isysroot);
}
// Module map file
diff --git a/clang/test/Modules/relocatable-modules.cpp b/clang/test/Modules/relocatable-modules.cpp
new file mode 100644
index 0000000000000..27e7330835b34
--- /dev/null
+++ b/clang/test/Modules/relocatable-modules.cpp
@@ -0,0 +1,43 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+
+// CHECK-NOT: MODULE_DIRECTORY
+
+// RUN: cd %t
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface %t/a.cppm -o %t/a-abs.pcm
+
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/a-abs.pcm | FileCheck %s
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/a-abs.pcm \
+// RUN: | FileCheck %s --check-prefix=INPUT-ABS -DPREFIX=%t
+
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface %t/a.cppm -o %t/a-rel.pcm \
+// RUN: -fmodule-file-home-is-cwd
+
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/a-rel.pcm | FileCheck %s
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/a-rel.pcm \
+// RUN: | FileCheck %s --check-prefix=INPUT-REL
+
+// INPUT-ABS: <INPUT_FILE {{.*}}/> blob data = '[[PREFIX]]{{/|\\}}a.cppm'
+// INPUT-REL: <INPUT_FILE {{.*}}/> blob data = 'a.cppm'
+
+//--- a.cppm
+export module a;
+
+// RUN: cd %S
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header Inputs/cxx-header.h \
+// RUN: -o %t/cxx-header-abs.pcm
+
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/cxx-header-abs.pcm | FileCheck %s
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/cxx-header-abs.pcm \
+// RUN: | FileCheck %s --check-prefix=HU-INPUT-ABS -DPREFIX=%S
+
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header Inputs/cxx-header.h \
+// RUN: -fmodule-file-home-is-cwd -o %t/cxx-header-rel.pcm
+
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/cxx-header-rel.pcm | FileCheck %s
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/cxx-header-rel.pcm \
+// RUN: | FileCheck %s --check-prefix=HU-INPUT-REL
+
+// HU-INPUT-ABS: <INPUT_FILE {{.*}}/> blob data = '[[PREFIX]]{{/|\\}}Inputs{{/|\\}}cxx-header.h'
+// HU-INPUT-REL: <INPUT_FILE {{.*}}/> blob data = 'Inputs{{/|\\}}cxx-header.h'
|
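The order in which the patched WriteControlBlock appears to pick a base directory can be modeled outside Clang roughly as follows. This is a simplified sketch, not the code from the PR: `WriterInputs` and `chooseBaseDirectory` are hypothetical names, plain `std::string` stands in for `SmallString<128>`, and the sketch leaves out `cleanPathForOutput` and the MODULE_DIRECTORY record. It only covers the case where a module is being written.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the state the diff consults. These are not
// Clang types; each field only models the named input.
struct WriterInputs {
  bool moduleFileHomeIsCwd = false;            // models -fmodule-file-home-is-cwd
  std::optional<std::string> moduleDirectory;  // models WritingModule->Directory
  std::string cwd;                             // models the resolved "." directory
  std::string isysroot;                        // models the isysroot argument
};

// Returns the directory that other paths would be written relative to,
// or an empty string when no base directory applies.
std::string chooseBaseDirectory(const WriterInputs &in) {
  // Immediately invoked lambda, mirroring the shape used in the diff.
  const auto baseDir = [&]() -> std::optional<std::string> {
    if (in.moduleFileHomeIsCwd)
      return in.cwd;               // the flag wins, even without a module directory
    if (in.moduleDirectory)
      return *in.moduleDirectory;  // otherwise fall back to the module's directory
    return std::nullopt;
  }();
  if (baseDir)
    return *baseDir;
  if (!in.isysroot.empty())
    return in.isysroot;            // last resort: paths relative to the sysroot
  return {};
}
```

The point of the reordering is the first branch: with the flag set, a base directory is chosen even when there is no module directory, which is the case the old `WritingModule && WritingModule->Directory` condition skipped.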
I haven't taken a close look, but I don't think we should do this behind a flag. I think we should make the BMIs of C++20 modules (at least C++20 named modules) relocatable.
Downstream, IIUC, we achieve this by always enabling "-fmodules-embed-all-files". See the discussion here: #72383
Correct me if I didn't understand your problem.
Hm.. I'm not too sure about C++ named modules but for header units, the absolute path of an imported PCM via As far as I can tell, |
Ah, interesting. It looks like indeed, this doesn't apply to C++ named modules.
|
BTW, I think
With |
If possible, I'd suggest doing something similar for header units. I also think it is better to avoid relying on the many flags that exist for Clang header modules. I feel the user interface (including the flags) of Clang header modules is confusing to users. C++20 modules may be a chance to provide a more uniform and clearer interface. |
828378b
to
06b533a
Compare
Yeah, we do actually use |
If you did |
We don't do anything special downstream. As far as I know it already works today. |
If |
Well, it "works" but my understanding is that it's not officially supported today in Clang. My understanding is that header units technically are anonymous, and therefore don't have a module name. Is that correct? What we do downstream roughly is to just give the header unit a module name corresponding to its path.
then to use it, provide:
This makes it such that an |
Separately, even with named modules, with or without |
06b533a
to
74e3f0a
Compare
Do you use |
Hi Richard! Hm, no we do not. I haven't seen this before. I can try it though 🤔 EDIT: Just tried it out... doesn't seem to do anything for me. Still getting absolute paths in the PCM. |
'-fmodule-file-home-is-cwd' was added in llvm@646e502 to support relocatable PCMs for Clang modules. This PR extends the functionality for standard C++ modules.
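As an illustration of what "relocatable" means for the recorded input paths, here is a minimal sketch of writing a path relative to a base directory when the path lies under it. It uses `std::filesystem` rather than Clang's own path handling, and `relativizeIfUnder` is a hypothetical helper, so treat it as an analogy for the behavior the test checks (`a.cppm` instead of an absolute path), not as the implementation.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns `input` relative to `base` when `input` lies
// under `base`; otherwise returns `input` unchanged.
std::string relativizeIfUnder(const fs::path &input, const fs::path &base) {
  const fs::path rel =
      input.lexically_normal().lexically_relative(base.lexically_normal());
  // lexically_relative yields an empty path when no relative form exists,
  // and a path starting with ".." when `input` is outside `base`.
  if (rel.empty() || *rel.begin() == fs::path(".."))
    return input.generic_string();
  return rel.generic_string();
}
```

With a base of `/build/src`, an input of `/build/src/Inputs/cxx-header.h` would be recorded as `Inputs/cxx-header.h`, while a path outside the base, such as a system header, would stay absolute.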
74e3f0a
to
37a5c57
Compare
By the way @zygoloid, it looks like you reviewed https://reviews.llvm.org/D51568, which had a similar goal back in 2018 but didn't get committed. |
Actually, isn't this a problem even with |
Going back to the idea of not writing the paths of imported PCMs at all; currently the condition to not write the paths is if the imported PCM is a named module. Perhaps this condition can be extended to omit the paths if the PCM was found through |
|
IIRC, this relates to build systems. While not writing the imported BMI paths in the BMI, the cost is, the build system must provide a full set of I think this is not related to |
Synced with @ChuanqiXu9 offline about this. Summarizing the discussion so far:
|
Yeah, I have some concerns about the interface. But I am in the camp that best practices should come from real-world use rather than pure imagination. Given that Meta is running these experiments in practice, I think it is fine to let it go.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/10403 Here is the relevant piece of the build log for the reference
|
Summary: This diff specifies the `-fmodule-file-home-is-cwd` flag, which allows producing relocatable PCMs without the internal patch from T32246672. In the current state and our use of Clang, this flag ends up just being ignored. However, in the rollout of Clang with the internal patch reverted and [llvm#135147](llvm/llvm-project#135147) backported, it will end up having the same effect of storing relative paths in PCMs as before. Reviewed By: dmpolukhin Differential Revision: D72999617 fbshipit-source-id: 222857daebed514951c72882e46b0c6478938544
`-fmodule-file-home-is-cwd` was added in 646e502 to support relocatable PCMs for ObjC and Clang modules. This PR extends the functionality to allow C++ modules to be relocatable.

C++ named modules today do not directly write the imported PCM's path (presumably because we know we have to be able to discover the location of the PCM through one of the `-fmodule-file=<name>=<pcm>` flags). However, the `INPUT_FILE` fields of the PCM still contain absolute paths of the input files. @ChuanqiXu9 mentioned that there's been a discussion of `-fmodules-embed-all-files` in #72383 that would embed the input files into the PCM itself. My understanding is that this effectively ignores the absolute paths of the input files in the PCM. Without `-fmodules-embed-all-files` though, the input files remain absolute paths today, which can cause unnecessary cache misses in distributed builds.

My actual use case is for C++ header units. It has the same problem of absolute path input files, but has the additional problem that header units do directly write the imported PCM's absolute path. Since header units technically are anonymous and cannot rely on `-fmodule-file=<name>=<pcm>` to discover the location of the PCM, it doesn't seem like it's feasible for header units to simply omit this like named modules do... but I'm not certain about this.

The proposed fix here is to extend `-fmodule-file-home-is-cwd` which already exists, such that