Description
It would be great if we could build libstd as well as dependent crates (in cargo miri) without codegen ("check" mode). This would (a) save time, (b) avoid having to generate MIR that works both in Miri and when compiled normally, and (c) hopefully drastically simplify cross-building.
@Aaron1011 looked into this mostly due to (b), as this turned into quite a challenge for unwinding. I am copying from his post:
The long-term solution is to switch to using cargo check. This completely disables codegen in a way that's built into the compiler. We will no longer need to worry about any assumptions made by the codegen backend, since that code will simply never be executed.
However, this turned out to be more complicated than I anticipated. There are several moving parts here:
- Making xargo run cargo check
Currently, xargo unconditionally uses cargo build when compiling libstd. I've opened a PR to add a cargo_mode = "check" option to Xargo.toml: japaric/xargo#267
- Making cargo-miri run cargo check.
This is complicated by the fact that we currently run cargo rustc in order to pass arguments to the last (and only to the last) invocation of rustc. Sadly, there appears to be no way to combine cargo check and cargo rustc - you can either use cargo rustc and have no way to skip codegen, or use cargo check and have no way to pass final-crate-specific rustc arguments.
To work around this issue, I've added a hack to serialize arguments to an environment variable. When we compile the final crate, we deserialize these arguments from the environment variable, and manually pass them to the rustc invocation.
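The environment-variable workaround can be sketched roughly as follows. This is only an illustration, not the code from the PR: the variable name EXAMPLE_MIRI_ARGS, the separator-based encoding, and all function names are assumptions made for this sketch.

```rust
use std::env;
use std::process::Command;

/// Hypothetical variable name, chosen for this sketch only.
const ARGS_VAR: &str = "EXAMPLE_MIRI_ARGS";

/// ASCII unit separator; assumed never to occur inside an argument.
const SEP: char = '\u{1f}';

/// Join the arguments into one string that fits in an environment variable.
/// Limitation of this encoding: a single empty argument is indistinguishable
/// from "no arguments".
fn encode_args(args: &[String]) -> String {
    args.join(&SEP.to_string())
}

/// Split a string produced by `encode_args` back into the argument list.
fn decode_args(encoded: &str) -> Vec<String> {
    if encoded.is_empty() {
        Vec::new()
    } else {
        encoded.split(SEP).map(str::to_owned).collect()
    }
}

/// Outer phase: prepare `cargo check`, with the final-crate arguments
/// attached to the child process environment.
fn cargo_check_command(final_crate_args: &[String]) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("check").env(ARGS_VAR, encode_args(final_crate_args));
    cmd
}

/// Wrapper phase: recover the arguments when compiling the final crate.
/// Returns an empty list if the variable is not set.
fn final_crate_args_from_env() -> Vec<String> {
    env::var(ARGS_VAR)
        .map(|encoded| decode_args(&encoded))
        .unwrap_or_default()
}
```

The wrapper would append the result of final_crate_args_from_env() to the rustc arguments only when it is handling the final crate, which mirrors what cargo rustc did for the last invocation.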
- Making build dependencies use the proper libstd.
This is by far the trickiest part of this entire PR. When Cargo invokes our cargo-miri wrapper, we have three cases to worry about:
- Build dependencies (including build scripts): We pass through all arguments completely unmodified to rustc. Miri does not interact in any way with build scripts, so we want to treat them as if we were doing a normal run of cargo.
- Normal dependencies: We add our custom sysroot, but still invoke rustc. Since we are in cargo check mode, this will cause rustc to produce metadata for the normal dependencies of our runtime crate, using our custom libstd.
- The target itself (e.g. a test or a binary): We invoke miri, and actually begin execution.
Handling these three cases ensures that build dependencies are built using the normal platform libstd (as if cargo-miri did not exist), while normal dependencies and our target crate are built against our custom libstd.
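The three cases map onto three different invocations. A minimal sketch of that dispatch is below; the CrateKind enum, the plan_invocation function, and the program names "rustc" and "miri" are hypothetical names used for illustration, and the real wrapper may locate and call these binaries differently. The only compiler flag used is rustc's --sysroot.

```rust
/// The three situations the wrapper has to tell apart (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrateKind {
    BuildDependency,
    NormalDependency,
    Target,
}

/// Decide which program to run and with which arguments.
/// Returns the program name and the full argument list.
fn plan_invocation(
    kind: CrateKind,
    cargo_args: &[String],
    miri_sysroot: &str,
) -> (&'static str, Vec<String>) {
    let mut args = cargo_args.to_vec();
    match kind {
        // Build scripts and their dependencies: behave exactly like plain Cargo.
        CrateKind::BuildDependency => ("rustc", args),
        // Normal dependencies: still rustc, but against the custom libstd.
        CrateKind::NormalDependency => {
            args.push("--sysroot".to_owned());
            args.push(miri_sysroot.to_owned());
            ("rustc", args)
        }
        // The target crate: hand over to the interpreter, same custom libstd.
        CrateKind::Target => {
            args.push("--sysroot".to_owned());
            args.push(miri_sysroot.to_owned());
            ("miri", args)
        }
    }
}
```

Returning plain data instead of spawning a process keeps the decision easy to test in isolation; the caller can turn the pair into a std::process::Command.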
Unfortunately, distinguishing between these three cases is a huge pain. I'm currently relying on the following tricks:
- Use CARGO_MANIFEST_DIR to detect when our target crate is being built.
The CARGO_MANIFEST_DIR variable is set by Cargo to the directory containing the manifest of the package currently being built. During the initial, non-wrapper invocation of cargo miri (e.g. when the user types cargo miri run on the command line), we determine the manifest directory for the crate they are building. We then compare this to CARGO_MANIFEST_DIR when we are being invoked by Cargo as a wrapper.
However, this fails to distinguish between a build.rs and the actual compilation, since both use the same manifest directory. This brings us to the second trick.
- Inspect the --emit= flag passed to rustc by Cargo.
This trick relies on the fact that we are using cargo check to build the crate. When Cargo compiles a build dependency, the --emit= flag will always contain link - this is because we always need to produce a runnable binary for build scripts.
When we are building normal dependencies, the --emit= flag will not contain link. This is how cargo check tells rustc not to perform codegen - the --emit= flag will be --emit=dep-info,metadata.
By checking for the presence of link, we can determine whether or not Cargo is trying to compile a build dependency. Note that the same crate could theoretically be built as both - e.g. you could add [dependencies] cc=x.y.z and [build-dependencies] cc=x.y.z to your Cargo.toml.
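Taken together, the two tricks amount to a small classification function. The sketch below follows the description above rather than the actual PR: CrateKind, emit_kinds, and classify are hypothetical names, and the caller is assumed to supply both the current CARGO_MANIFEST_DIR and the manifest directory recorded during the initial cargo miri invocation.

```rust
use std::path::Path;

/// The three situations the wrapper has to tell apart (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrateKind {
    BuildDependency,
    NormalDependency,
    Target,
}

/// Collect the output kinds requested via `--emit`, accepting both
/// `--emit=a,b` and `--emit a,b`, and dropping any `=path` suffix.
fn emit_kinds(args: &[String]) -> Vec<String> {
    let mut kinds = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        let value: Option<&str> = if let Some(v) = arg.strip_prefix("--emit=") {
            Some(v)
        } else if arg.as_str() == "--emit" {
            iter.next().map(String::as_str)
        } else {
            None
        };
        if let Some(v) = value {
            for kind in v.split(',') {
                // `link=out` requests the `link` output written to `out`.
                let name = kind.split('=').next().unwrap_or(kind);
                kinds.push(name.to_owned());
            }
        }
    }
    kinds
}

/// Apply both tricks: `link` in `--emit` means a build dependency (under
/// `cargo check`); otherwise the manifest directory decides between the
/// target crate and a normal dependency.
fn classify(args: &[String], manifest_dir: &Path, target_manifest_dir: &Path) -> CrateKind {
    if emit_kinds(args).iter().any(|k| k == "link") {
        CrateKind::BuildDependency
    } else if manifest_dir == target_manifest_dir {
        CrateKind::Target
    } else {
        CrateKind::NormalDependency
    }
}
```

Checking --emit first is what lets a build.rs in the target's own manifest directory be treated as a build dependency rather than as the target. Because the decision is made per rustc invocation, a crate that appears under both [dependencies] and [build-dependencies] is classified separately each time it is compiled.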
Adding more information to Cargo
While I believe the assumptions behind the above Cargo hacks are fairly sound, this is not really a viable long-term solution. For example, Cargo could choose to stop passing the --emit flag if rustc's default already matched what Cargo wanted.
Ideally, Cargo would set an environment variable to let us know which of the three cases we are in - target crate, build dependency, or normal dependency. I'm currently working on a PR that does just that.
Conclusion
I think our best path forward is to:
- Land cargo check support in xargo
- Merge some form of this PR, possibly in multiple pieces. Depending on how long review takes, it might make sense to merge the eh_catch_typeinfo hack into rustc so that nightly users can have a (somewhat) working Miri again
- Work with the Cargo team to expose the information we need in a cleaner way. I think this information could be of use to other Cargo wrappers. In particular, build dependencies have very different semantics from regular dependencies (e.g. they will be built for a different target during cross-compilation), but wrappers have no clean way of determining which they are building.