[C++20] [Modules] All BMIs are irreproducible #62269
@llvm/issue-subscribers-clang-modules |
(I removed [libc++] in the title since this looks not related to libc++)
It looks like the key issue is that you can't reuse the same BMI on different machines because the paths to the source files differ, is that right? If yes, could you try to use Note that the |
FWIW at Google we use Clang Header Modules (not C++20 modules) with a hermetic/distributed build - so there's probably some combo of flags that avoids this problem, but I don't know what it is off hand (& it might not be fully integrated with C++20 modules, or might be some weird cc1 flags, etc) |
@ChuanqiXu9 Yes. This could also affect a distribution that packages a prebuilt variant of a module against their stable-release clang. A vendor might think it's ok to distribute the BMIs because they use the exact same AST-compatible clang version, but in the end things break because they might have compiled the package in a different location than the final installation location. I tried adding
@dwblaikie I've tried finding such a combo, but haven't been successful yet. I think with PCHs one can use some magic from https://clang.llvm.org/docs/Modules.html#command-line-parameters and import PCHs directly. I believe this issue is specific to the standard variant of modules though. The posted script will fail with something like this: fatal error: malformed or corrupted AST file: 'could not find file
'/home/<path_to_sandbox>/sandbox/mymodule.cppm'
referenced by AST file 'mymodule.pcm'' which is suboptimal but at least noticeable. Working around the issue with |
I think this is worth mentioning in the documentation. But I don't foresee anybody shipping BMIs. The current BMI is extremely sensitive to changes in compiler flags, for example the language version used, whether pthreads is used, or the target architecture. I think it is great to fix it for Bazel, but we should make it clear to vendors not to ship BMIs. That's why I hope build systems will fill that role once we have a std module in libc++. |
@mordante I'm a little bit confused about the term |
I think it is unnecessarily strict for named modules to verify the existence, modification time, and content of input files and headers. Clang does this for Clang modules because of implicit builds and header semantics, but it is not useful for named modules. It should be the build system's responsibility to maintain the dependency between source files and the BMI. I'll try to remove the check for the existence, time and content of input files when we compile C++20 named modules. |
@aaronmondal It would also be worth benchmarking which is faster: copying the BMI to a different machine or building the BMI directly. Multiple people have complained to me that BMIs are large, so passing them over the network may be a problem. Building the BMI directly may also not be too slow, since it doesn't involve the middle end and back end. |
Another issue (that I probably should've mentioned in the OP 😅) is that if you build a BMI locally and then install it via the regular CMake mechanism, the BMIs will still reference each other's locations in the build directory. As long as the build directory is still around you won't notice it, but if you remove the build directory I'd expect the BMIs to stop working. For instance, if you build
@ChuanqiXu9 Absolutely! I'm OOO for like 2 days. I'll start benchmarking (probably using eomii/rules_ll#98) after that. Testing things like network transfers vs local build time can be a bit tricky though as I'll probably get kicked out of remote execution free tiers soon 🤣 (~200 GB cache transfers and ~400min single-core CPU usage for a single from-scratch LLVM build lol 😅 https://app.buildbuddy.io/invocation/be9102f9-6977-47b0-8ab9-25ddb38c6265#timing). |
I'm super confused by this - the intent, I believe, is that C++20 modules never resolve implicitly - that they must be passed to the compiler explicitly (I guess the main intent is that they don't get /built/ implicitly, but I expected they also wouldn't be depended on implicitly - though I realize there is this module storage path that I think we debated a few months back too... :/ ). So I wouldn't expect to be able to use a module without specifying it and all its dependencies on the command line. Guess I've got something wrong here. |
When we use a C++20 named module, we definitely need to specify the corresponding BMI on the command line. But we don't need to specify all its dependencies, since the dependency information should be recorded in that BMI (either the full content or a link to the other BMIs). |
That surprises me/seems suspect/potentially problematic. I'd expect the need to pass in each BMI file the compiler's going to read, for whatever reason (either direct or indirect dependency). I believe that's what we're doing for explicit Clang Header Modules at the moment. |
I feel it is good for C++20 named modules, since encapsulation is a main goal of C++20 named modules. For example,
It is weird if the user of module 'a' needs to write |
@dwblaikie Consider the following three targets:
The same is not true for the linking step. We still need @ChuanqiXu9 please correct me if I said anything incorrect above. |
Not a correction but a supplement. More than one user has complained to me about the size of BMIs. So I am wondering if it would be better for a BMI not to contain all the information from the other BMIs. It may be good to make the BMI contain a relative path to the BMIs it depends on. But that is relatively far off for now, and it doesn't affect what should appear on the command line. I strongly think only the direct dependencies should appear on the command line. |
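The two command-line policies debated above (pass only the direct imports versus pass every BMI that gets read) can be sketched from a build system's point of view. This is a minimal sketch, not how any particular build system works: the `deps.txt` edge list, the helper names `direct_deps`, `all_deps` and `module_flags`, and the `<name>.pcm` naming are assumptions made up for illustration. Only the `-fmodule-file=<name>=<path>` spelling is taken from the reproduction script in this thread.

```shell
#!/bin/sh
# Each line of the (hypothetical) edge list reads "<module> <imported module>".

# Print the modules that $1 imports directly, according to edge list $2.
direct_deps() {
    awk -v mod="$1" '$1 == mod { print $2 }' "$2"
}

# Print every module reachable from $1 (breadth-first), excluding $1 itself.
all_deps() {
    awk -v start="$1" '
        { edges[$1] = edges[$1] " " $2 }
        END {
            n = 1; queue[1] = start; seen[start] = 1
            for (i = 1; i <= n; i++) {
                count = split(edges[queue[i]], found, " ")
                for (j = 1; j <= count; j++) {
                    if (!(found[j] in seen)) {
                        seen[found[j]] = 1
                        queue[++n] = found[j]
                        print found[j]
                    }
                }
            }
        }' "$2"
}

# Turn module names read from stdin into -fmodule-file flags on one line.
# Assumption for illustration: the BMI of module X is stored as X.pcm.
module_flags() {
    while read -r name; do
        printf ' -fmodule-file=%s=%s.pcm' "$name" "$name"
    done
    echo
}
```

With the edges `a b` and `b c`, `direct_deps a deps.txt | module_flags` produces a flag for `b` only, while `all_deps a deps.txt | module_flags` also produces one for `c`. The build system has to compute the second list either way in order to schedule the builds; the open question in the thread is only whether it also has to appear on the compiler command line.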
Yayy I just acquired access to enough cloud infra that I can start testing the impact of these things at pretty much arbitrary scale soon 🥳. There will be some tradeoff between compile time, file access and filesize/network transfers. I'll let you know if I have any interesting results.
I agree. Anything not contained in a BMI would have to be tracked somehow though. I think somewhat related to potentially split BMIs are shared libraries and |
Yeah, btw, currently we think it is not good to distribute BMIs. So only the tools (build systems, analyzers) should know about BMIs. So I don't want to make it super complex. |
I'm fairly sure, today, BMIs do not contain all the data of their dependents - like, you need access to all of the BMIs. And I /hope/ that today the BMIs do not allow/contain implicit references to their dependent BMIs. I don't think any of this adds complexity - it's the build system that's going to handle this, not a human. but I guess I should test things... Yep, for now you do have to specify all pcms, I think?
So I'm not sure why Oh, I think for explicit Clang Header Modules when we're using them at Google we stuff the entire source file inside the module for diagnostics so we don't have to ship around a separate file to the builds that consume the module. Not sure that's the way everyone should do it in the long term, probably not. But we could support some prefix mapping change (like |
@dwblaikie Your errors occur because you directly imported For instance, the following code from rules_ll/examples/modules_draft_example implements the example from the C++ standard. I've added
load("@rules_ll//ll:defs.bzl", "ll_binary", "ll_library")
# Copy-paste from https://eel.is/c++draft/module.
# The draft does not contain info on file extensions for module implementation
# and interface units. This example shows how files as in the draft need to be
# laid out in rules_ll.
# A notable aspect of this example is that a module partition does not require
# an interface and does not need to specify one to be usable in other parts of
# the module.
# Technically, there is no "module partition interface unit" and/or "module
# partition implementation unit". Officially, both of them are "module
# partitions" that just differ in their contribution to the external interface
# of the primary module.
# In rules_ll, the `interfaces` attribute declares whether a source file is
# precompiled or not. rules_ll itself has no notion of module partitions.
# A potentially unintuitive consequence of this is that module partitions
# generally should go in the `interfaces` attribute. We may also require
# additional targets for "internal" partitions, despite the C++ syntax
# suggesting module partitions are all on the "same level" in the
# module-internal hierarchy.
# Two files in the same `interfaces` attribute cannot see each other's
# precompiled BMIs. We need an additional target for "internal" partitions to
# let rules_ll know that the BMI for these internal partitions need to be built
# before other partitions are built.
# It should be clear from this example, that module layout needs to be
# well-thought-out. Otherwise, dependency graphs, regardless of build system
# can become hard to comprehend.
# For more intuitive examples, see the other module_* examples.
ll_library(
    name = "Internals",
    compile_flags = ["-std=c++20", "-v"],
    exposed_interfaces = {
        # Module partition implementation unit. This is imported in TU2, so we
        # need to build TU3 before we can build TU2.
        "TU3.cppm": "A:Internals",
    },
)
ll_library(
    name = "A",
    srcs = [
        # Primary module implementation unit. Requires the BMI for module
        # partition A:Internals (TU3), and the BMI for the primary module
        # interface unit of A (TU1).
        "TU4.cpp",
    ],
    compile_flags = ["-std=c++20", "-v"],
    exposed_interfaces = {
        # Primary module interface unit. Requires the BMI for the module
        # partition A:Foo (TU2).
        "TU1.cppm": "A",
    },
    interfaces = {
        # Module partition interface unit. Requires the BMI for module partition
        # A:Internals (TU3).
        "TU2.cppm": "A:Foo",
    },
    deps = [":Internals"],
)
# Main file not in the example. Useful to see that things work.
ll_binary(
    name = "modules_draft_example",
    srcs = ["main.cpp"],
    compile_flags = ["-std=c++20", "-v"],
    link_flags = ["--verbose"],
    visibility = ["@//:__pkg__"],
    deps = [":A"],
)
The Searching for Output
|
Ah, right, thanks for the catch. So... yeah - does look like currently mod2 has some reference to the location of mod1. (deleting mod1.pcm causes the build to fail)
As mentioned, this seems unfortunate to me. The build system's going to need to know the dependencies anyway, so I don't see the benefit to having these indirect inputs be implicit & I'm worried making the dependencies less clear/more implicit will result in confusion and bugs in build system integration, etc. (I can't quite figure out where that path is stored - just running |
The build systems will maintain a dependency graph, so I think they understand the dependencies well. Given that we do strict checks when reading BMIs, I feel it won't hurt to make the dependencies implicit. For moving BMIs, I did the experiment:
It is good for the following:
And it doesn't work after we move b.pcm to directory
But it works after we wrote:
@aaronmondal so this may be a workaround currently. |
It's problematic in the case of this bug, for instance - that the pcm files can't be moved around because they have paths hardcoded in them. If build systems maintain a dep graph and understand them well, I don't see the value in having the dependencies being implicit. |
Thinking about this some more I think this is actually a difficult question. No compiler is fully explicit when it comes to dependencies. We only specify
Thanks, I'll play around with this. I'm starting to think that BMIs should contain all imported BMIs. Otherwise we'd be in a situation where we'd have to add every partition via an explicit flag. Yes it eats up a lot of disk space, but at least we'd have short command lines and strong encapsulation while still being explicit. Users requiring incremental builds already have gigantic artifact caches, a few hundred gigs here and there might not make too much of a difference. For distributed builds it might even be cheaper to have such fat BMIs because it would reduce cache checks to a single request per BMI instead of potentially hundreds of BMIs. A single fat BMI could potentially be compressed and/or stripped more effectively than separate BMIs. Everyone not using incremental builds, such as package managers for distributions, can just delete all intermediary BMIs after the build and only keep the fat BMI for the primary module interface unit. This could also make it straightforward to distribute BMIs in the long term - I just need to send you a single BMI and its corresponding archive instead of an entire directory with a specific layout that your build system needs to parse and convert to its own internal representation. Imagine having to explicitly specify every header used during a build. Systems would go OOM before compilation even starts 😅 |
I'm thinking something like
// A-partition.cppm
export module A:partition;
// A.cppm
export module A;
import A:partition;
// main.cpp
import A;
|
I understand your concern a little bit more. But given this will introduce a breaking change between the compiler and the build systems, I'll try to kick off another discussion about this later. After all, this is a different issue from the original one in the issue report. @aaronmondal Although it is still unclear whether it is good to make a BMI contain all the information of its imported modules (we need more discussion), there are two notes:
|
@ChuanqiXu9 I've just tested your patch. It fixes the issues regarding general portability and things now work without IRREPRODUCIBLE PATHS:
/some/absolute/path/to/sandbox/mymodule.cppm
/some/absolute/path/to/sandbox/mymodule.cppm
/some/absolute/path/to/sandbox/include/myheader.h
/some/absolute/path/to/sandbox/mymodule.cppm
/some/absolute/path/to/sandbox/include/myheader.h |
Hmm, sandboxed compilation doesn't work either.
Details
ERROR: /home/aaron/aaronmondal/rules_ll/examples/modules_example/BUILD.bazel:8:11: LlCompileObject modules_example/module_d/module_d/d_interface.interface.o failed: (Exit 1):
clang++ failed: error executing command (from target //modules_example:module_d)
(cd /home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11844/execroot/_main && \
exec env - \
LINK=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-link \
LLD=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll/ld.lld \
LLVM_SYMBOLIZER_PATH=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-symbolizer \
PATH=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll:bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~
llvm-project/lld \
bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/clang++ -fcolor-diagnostics -Wdate-time -no-c
anonical-prefixes '-fdebug-compilation-dir=.' '-fcoverage-compilation-dir=.' -fno-omit-frame-pointer -Xarch_host -glldb -Xarch_host -gdwarf-5 -fprofile-instr-generate -fcover
age-mapping -c -fPIC -Xarch_host -MJbazel-out/k8-fastbuild/bin/modules_example/module_d/module_d/d_interface.interface.cdf -nostdinc '--gcc-toolchain=NONE' '-resource-dir=baz
el-out/k8-fastbuild/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/staging' -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_REMOVE_TRANSITIVE
_INCLUDES -D_LIBCPP_NO_ABI_TAG '-std=c++20' bazel-out/k8-fastbuild/bin/modules_example/module_d/d_interface.pcm -o bazel-out/k8-fastbuild/bin/modules_example/module_d/module_
d/d_interface.interface.o)
# Configuration: 89ebe2beaed34419e8a3a142f17133d21d4bac949fea5ac4e8c46f4edc69163f
# Execution platform: @rules_ll~override//rbe/default/config:platform
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
fatal error: cannot open file '/home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11839/execroot/_main/modules_example/d_interface.c
ppm': No such file or directory
1 error generated.
ERROR: /home/aaron/aaronmondal/rules_ll/examples/modules_example/BUILD.bazel:8:11: LlCompileObject modules_example/module_d/d_implementation.o failed: (Exit 1): clang++ failed: error executing command (from target //modules_example:module_d)
(cd /home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11850/execroot/_main && \
exec env - \
LINK=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-link \
LLD=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll/ld.lld \
LLVM_SYMBOLIZER_PATH=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-symbolizer \
PATH=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll:bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~
llvm-project/lld \
bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/clang++ -fcolor-diagnostics -Wdate-time -no-c
anonical-prefixes '-fdebug-compilation-dir=.' '-fcoverage-compilation-dir=.' -fno-omit-frame-pointer -Xarch_host -glldb -Xarch_host -gdwarf-5 -fprofile-instr-generate -fcover
age-mapping -c -fPIC -Xarch_host -MJbazel-out/k8-fastbuild/bin/modules_example/module_d/d_implementation.cdf -nostdinc '--gcc-toolchain=NONE' '-resource-dir=bazel-out/k8-fast
build/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/staging' -idirafterbazel-out/k8-fastbuild/bin/external/llvm-project-overlay~17-i
nit-bcr.3~llvm_project_overlay~llvm-project/clang/staging/include -isystembazel-out/k8-fastbuild/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-pro
ject/libcxx/include -isystemexternal/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/libcxx/include -isystemexternal/llvm-project-overlay~17-init-bcr.3~l
lvm_project_overlay~llvm-project/libcxxabi/include -isystemexternal/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/libunwind/include -I/nix/store/ffrix9
v1lqbqpzivj0ycjnj85qa1n0l7-openssl-3.0.8-dev/include -isystem/nix/store/bhhb91angi23wsr42apz1in1n9b9ilg3-glibc-2.37-8-dev/include -isystem/nix/store/ijzs7ds7l2my641a73wywwhdj
by1sgd5-libxcrypt-4.4.33/include -isystem/nix/store/6ivk5qxp0i371n2xvpm6ndibqskzax3v-libdrm-2.4.115-dev/include -isystem/nix/store/6ivk5qxp0i371n2xvpm6ndibqskzax3v-libdrm-2.4
.115-dev/include/libdrm -isystem/nix/store/hbvpkpp821c8jzrh3dybg35wbj5w121l-elfutils-0.189-dev/include -isystem/nix/store/nlgn42yfizrgx9mj52dlxsq7j8cjxl4i-numactl-2.0.16/incl
ude -isystem/nix/store/j798nkcqrz73xvamdn24rh4x0p4aqml1-libglvnd-1.6.0-dev/include -isystem/nix/store/xxkx7bmjpcc5zxx8r3xvxb345y7vv6wl-libX11-1.8.4-dev/include -isystem/nix/s
tore/jbc3f1agds5a6d0yas6i30hwr36al8jd-xorgproto-2021.5/include -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES -D_LIBCPP_NO_ABI_TAG '-std=c++20' '-fmodule-
file=d=bazel-out/k8-fastbuild/bin/modules_example/module_d/d_interface.pcm' modules_example/d_implementation.cpp -o bazel-out/k8-fastbuild/bin/modules_example/module_d/d_impl
ementation.o)
# Configuration: 89ebe2beaed34419e8a3a142f17133d21d4bac949fea5ac4e8c46f4edc69163f
# Execution platform: @rules_ll~override//rbe/default/config:platform
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
fatal error: cannot open file '/home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11839/execroot/_main/external/llvm-project-overlay
~17-init-bcr.3~llvm_project_overlay~llvm-project/libcxx/include/ostream': No such file or directory
Details
ERROR: /home/aaron/aaronmondal/rules_ll/examples/modules_example/BUILD.bazel:28:11: LlCompileObject modules_example/a/a/a.interface.o failed: (Exit 1): clang++ failed: error
executing command (from target //modules_example:a)
(cd /home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11838/execroot/_main && \
exec env - \
LINK=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-link \
LLD=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll/ld.lld \
LLVM_SYMBOLIZER_PATH=bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/llvm/llvm-symbolizer \
PATH=bazel-out/k8-fastbuild/bin/external/rules_ll~override/ll:bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~
llvm-project/lld \
bazel-out/k8-fastbuild-ST-1b2103630309/bin/external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/clang++ -fcolor-diagnostics -Wdate-time -no-c
anonical-prefixes '-fdebug-compilation-dir=.' '-fcoverage-compilation-dir=.' -fno-omit-frame-pointer -Xarch_host -glldb -Xarch_host -gdwarf-5 -fprofile-instr-generate -fcover
age-mapping -c -fPIC -Xarch_host -MJbazel-out/k8-fastbuild/bin/modules_example/a/a/a.interface.cdf -nostdinc '--gcc-toolchain=NONE' '-resource-dir=bazel-out/k8-fastbuild/bin/
external/llvm-project-overlay~17-init-bcr.3~llvm_project_overlay~llvm-project/clang/staging' -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES -D_LIBCPP_NO_A
BI_TAG '-std=c++20' '-fmodule-file=b=bazel-out/k8-fastbuild/bin/modules_example/b/b.pcm' bazel-out/k8-fastbuild/bin/modules_example/a/a.pcm -o bazel-out/k8-fastbuild/bin/modu
les_example/a/a/a.interface.o)
# Configuration: 89ebe2beaed34419e8a3a142f17133d21d4bac949fea5ac4e8c46f4edc69163f
# Execution platform: @rules_ll~override//rbe/default/config:platform
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
fatal error: module file '/home/aaron/.cache/bazel/_bazel_aaron/57c173f35bd67c0c2e50f3183aeb3e4c/sandbox/linux-sandbox/11829/execroot/_main/bazel-out/k8-fastbuild/bin/modules
_example/b/b.pcm' not found: module file not found
note: imported by module 'a' in 'bazel-out/k8-fastbuild/bin/modules_example/a/a.pcm'
1 error generated.
Note how the information of no-longer-existent sandboxes still leaks to downstream actions. I think the I wasn't able to reproduce the behavior in a self-contained script, but it seems that the |
Ah, I was able to reproduce it. The issue arises when a BMI is imported via fatal error: module file '/some/absolute/path/to/sandbox3/mymodule.pcm' not found: module file not found
note: imported by module 'myothermodule' in 'myothermodule.pcm'
1 error generated.
mkdir include
echo "" > include/myheader.h
printf '
module;
#include "myheader.h"
export module mymodule;
' > mymodule.cppm
printf '
module;
export module myothermodule;
import mymodule;
' > myothermodule.cppm
printf '
import mymodule;
auto main() -> int { return 0; }
' > main.cpp
# Prepare first sandbox.
mkdir -p sandbox/include
cp mymodule.cppm sandbox/mymodule.cppm
cp include/myheader.h sandbox/include/myheader.h
# Simulate first sandboxed compilation.
cd sandbox
# Adding `-Xclang -fmodules-embed-all-files` makes the next compilation pass,
# but still leaves irreproducible, absolute paths in the pcm.
clang -v -std=c++20 -Iinclude --precompile mymodule.cppm -o mymodule.pcm
cp mymodule.pcm ..
cd ..
rm -rd sandbox
echo
# Prepare second sandbox.
mkdir sandbox2
cp mymodule.pcm sandbox2/mymodule.pcm
# Simulate second sandboxed compilation.
cd sandbox2
clang -v -std=c++20 -c mymodule.pcm -o mymodule.o
cp mymodule.o ..
cd ..
rm -rd sandbox2
# Prepare third sandbox.
mkdir sandbox3
cp mymodule.pcm sandbox3/mymodule.pcm
cp myothermodule.cppm sandbox3/myothermodule.cppm
# Simulate third sandboxed compilation.
cd sandbox3
clang -v -std=c++20 -fmodule-file=mymodule=mymodule.pcm --precompile myothermodule.cppm -o myothermodule.pcm
cp myothermodule.pcm ..
cd ..
rm -rd sandbox3
# Prepare fourth sandboxed compilation.
mkdir sandbox4
cp mymodule.pcm sandbox4/mymodule.pcm
cp myothermodule.pcm sandbox4/myothermodule.pcm
# Simulate fourth sandboxed compilation.
cd sandbox4
clang -v -std=c++20 -fmodule-file=mymodule=mymodule.pcm -c myothermodule.pcm -o myothermodule.o
cp myothermodule.o ..
cd ..
rm -rd sandbox4
# Print strings. If you used `-Xclang -fmodules-embed-all-files` this still contains
# irreproducible absolute paths to the no longer existent sandbox.
printf '
IRREPRODUCIBLE PATHS:
'
strings mymodule.pcm | grep sandbox
strings myothermodule.pcm | grep sandbox
# Cleanup.
rm -rd include
rm main.cpp mymodule.cppm myothermodule.cppm mymodule.pcm myothermodule.pcm mymodule.o |
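The `strings ... | grep sandbox` check at the end of the script can be turned into a small reusable check that a build rule or CI step could run on any produced BMI. The following is a minimal sketch under stated assumptions: `leaked_paths` is a hypothetical helper name, it treats the BMI as an opaque binary file, and the prefix is used as a basic regular expression, so a `.` in it matches any character.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct paths below a given prefix that are
# embedded in a binary file such as a BMI.
# Usage: leaked_paths <file> <absolute-path-prefix>
leaked_paths() {
    # -a makes grep treat the binary file as text and -o prints only the
    # matching part. The bracket expression ends the match at the first byte
    # that is unlikely to belong to a path.
    LC_ALL=C grep -a -o "$2[A-Za-z0-9_./-]*" "$1" | sort -u
}
```

For example, `leaked_paths mymodule.pcm "$PWD"` run inside the build directory prints every embedded path that starts with that directory. Empty output means no path with this prefix was found, which is necessary but not sufficient for the BMI to be relocatable.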
Oh, sorry for closing this prematurely. I thought the issue was about checking the source paths in the BMI, and that the paths to other BMIs stored in a BMI were a separate, independent problem. Given the discussion in https://discourse.llvm.org/t/c-20-modules-should-the-bmis-contain-paths-to-their-dependent-bmis/70422/2, it looks like we're going to remove all such links from BMIs, which will be a breaking change. Will such a change help your situation? Also, if you have opinions on this topic, you can comment on that link directly. |
Yes I think that would solve the issue. I've elaborated in the discussion. |
I created #62707. Let's track this in the new issue, since the current one is really long and hard for others to follow. Close the issue if you don't mind. |
@ChuanqiXu9 @mordante Sorry for creating this issue so late. This is what I mentioned in the last meeting.
Modules reference BMIs via absolute paths. This makes it impossible to, e.g., run module precompilation on remote executors. It makes all precompilations irreproducible to a degree that you couldn't reuse the same BMIs on two identically set up machines where just the username differs.
The workaround currently employed in rules_ll is to disable sandboxing and remote execution for precompilations, but that can significantly impact performance down the line. Irreproducibility is transitive. If a single module is used, all other targets depending on that can't be built remotely and are irreproducible across systems. Since libcxx is pretty much always at the root of any dependency graph, this means that any build using module std (https://reviews.llvm.org/D144994) will be fully irreproducible.
This doesn't only affect Bazel, it affects CMake as well but isn't noticeable during local builds since CMake doesn't use sandboxing in the same way.
For module std we're already telling users to build the BMIs themselves, but maybe as a warning to any distro maintainers experimenting with this: At the moment all precompilations leak absolute paths from the build machines into the BMIs.
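Why an embedded absolute path defeats remote caching can be shown with stand-in files. This is only a toy sketch: `fake_bmi` and `reproducible` are hypothetical helpers, and the "BMI" is a two-line text file rather than anything clang produces. The point is that a content-addressed cache can only share an artifact between machines if the bytes are identical.

```shell
#!/bin/sh
# Stand-in for a build output: fixed bytes plus the absolute path of the
# directory in which the "build" ran. Real BMIs are far more complex.
# Usage: fake_bmi <build directory to embed> <output file>
fake_bmi() {
    printf 'BMI-PAYLOAD\n%s/mymodule.cppm\n' "$1" > "$2"
}

# Succeeds only if both files are byte-identical, which is what a
# content-addressed cache needs in order to reuse one artifact for both.
# Usage: reproducible <file> <file>
reproducible() {
    cmp -s "$1" "$2"
}
```

Two stand-ins created with `fake_bmi /home/alice/project a.pcm` and `fake_bmi /home/bob/project b.pcm` fail the `reproducible a.pcm b.pcm` check even though the "source" is the same, whereas two builds from the same directory pass it.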