Build dependencies without codegen ("cargo check") where possible #1057

Closed
@RalfJung

It would be great if we could build libstd as well as dependent crates (in cargo miri) without codegen ("check" mode). This would (a) save time, (b) avoid having to generate MIR that works both in Miri and when compiled normally, and (c) hopefully drastically simplify cross-building.

@Aaron1011 looked into this mostly due to (b), as this turned into quite a challenge for unwinding. I am copying from his post:


The long-term solution is to switch to using cargo check. This completely disables codegen in a way that's built into the compiler. We will no longer need to worry about any assumptions made by the codegen backend, since that code will simply never be executed.

However, this turned out to be more complicated than I anticipated. There are several moving parts here:

  1. Making xargo run cargo check

Currently, xargo unconditionally uses cargo build when compiling libstd. I've opened a PR to add a cargo_mode = "check" option to Xargo.toml: japaric/xargo#267

  2. Making cargo-miri run cargo check.

This is complicated by the fact that we currently run cargo rustc in order to pass arguments to the last (and only to the last) invocation of rustc. Sadly, there appears to be no way to combine cargo check and cargo rustc - you can either use cargo rustc and have no way to skip codegen, or use cargo check and have no way to pass final-crate-specific rustc arguments.

To work around this issue, I've added a hack to serialize arguments to an environment variable. When we compile the final crate, we deserialize these arguments from the environment variable, and manually pass them to the rustc invocation.
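As a rough illustration of the environment-variable hack described above, here is a minimal sketch. It is not the code from the PR: the variable name `MIRI_SKETCH_ARGS`, the helper names, and the separator-based encoding are all assumptions made for this example, and it assumes no argument contains the separator character.

```rust
use std::env;
use std::process::Command;

/// Hypothetical variable name; the name used by the actual PR may differ.
const ARGS_VAR: &str = "MIRI_SKETCH_ARGS";

/// ASCII unit separator, assumed never to occur inside an argument.
const SEP: char = '\x1f';

/// Encode the arguments by terminating each one with `SEP`.
fn encode_args(args: &[String]) -> String {
    let mut out = String::new();
    for arg in args {
        out.push_str(arg);
        out.push(SEP);
    }
    out
}

/// Decode a string produced by `encode_args`.
fn decode_args(encoded: &str) -> Vec<String> {
    encoded.split_terminator(SEP).map(str::to_owned).collect()
}

/// Outer invocation: prepare a `cargo check` command that carries the
/// final-crate arguments in the environment. The command is not spawned here.
fn cargo_check_command(final_crate_args: &[String]) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("check").env(ARGS_VAR, encode_args(final_crate_args));
    cmd
}

/// Wrapper invocation: recover the arguments, or an empty list if unset.
fn recover_args() -> Vec<String> {
    env::var(ARGS_VAR).map(|s| decode_args(&s)).unwrap_or_default()
}
```

Terminating every argument with the separator (rather than joining with it) keeps an empty argument list distinguishable from a list holding one empty argument. A more robust encoding would be needed if arguments could contain arbitrary bytes.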

  3. Making build dependencies use the proper libstd.

This is by far the trickiest part of this entire PR. When Cargo invokes our cargo-miri wrapper, we have three cases to worry about:

  1. Build dependencies (including build scripts): We pass through all arguments completely unmodified to rustc. Miri does not interact in any way with build scripts, so we want to treat them as if we were doing a normal run of cargo.

  2. Normal dependencies: We add our custom sysroot, but still invoke rustc. Since we are in cargo check mode, this will cause rustc to produce metadata for the normal dependencies of our runtime crate, using our custom libstd.

  3. The target itself (e.g. a test or a binary): We invoke miri, and actually begin execution.

Handling these three cases ensures that build dependencies are built using the normal platform libstd (as if cargo-miri did not exist), while normal dependencies and our target crate are built against our custom libstd.
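The three cases can be sketched as a small dispatch function. This is only an illustration under stated assumptions: `CrateKind` and `plan_invocation` are hypothetical names, the driver binary is assumed to be called `miri`, and the sketch only computes the command line rather than running it.

```rust
/// The three situations in which Cargo may invoke the wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrateKind {
    BuildDependency,
    NormalDependency,
    Target,
}

/// Return the program to run and its full argument list.
/// `rustc_args` are the arguments Cargo passed to the wrapper;
/// `sysroot` is the path to the custom libstd sysroot.
fn plan_invocation(
    kind: CrateKind,
    rustc_args: &[String],
    sysroot: &str,
) -> (String, Vec<String>) {
    let mut args = rustc_args.to_vec();
    // Build dependencies keep the platform libstd, so their arguments stay
    // untouched. Everything else is pointed at the custom sysroot.
    if kind != CrateKind::BuildDependency {
        args.push("--sysroot".to_string());
        args.push(sysroot.to_string());
    }
    // Only the target itself is handed to Miri (assumed binary name).
    let program = match kind {
        CrateKind::Target => "miri",
        CrateKind::BuildDependency | CrateKind::NormalDependency => "rustc",
    };
    (program.to_string(), args)
}
```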

Unfortunately, distinguishing between these three cases is a huge pain. I'm currently relying on the following tricks:

  1. Use CARGO_MANIFEST_DIR to detect when our target crate is being built.

CARGO_MANIFEST_DIR is set by Cargo to the directory containing the manifest of the package currently being built. During the initial, non-wrapper invocation of cargo miri (e.g. when the user types cargo miri run on the command line), we determine the manifest directory for the crate they are building. We then compare this to CARGO_MANIFEST_DIR when we are being invoked by Cargo as a wrapper.

However, this fails to distinguish between a build.rs and the actual compilation, since both use the same manifest directory. This brings us to the second trick.

  2. Inspect the --emit= flag passed to rustc by Cargo.

This trick relies on the fact that we are using cargo check to build the crate. When Cargo compiles a build dependency, the --emit= flag will always contain link, because build scripts must always be compiled to a runnable binary.

When we are building normal dependencies, the --emit= flag will not contain link. This is how cargo check tells rustc not to perform codegen - the --emit= flag will be --emit=dep-info,metadata.

By checking for the presence of link, we can determine whether or not Cargo is trying to compile a build dependency. Note that the same crate could theoretically be built as both - e.g. you could add [dependencies] cc=x.y.z and [build-dependencies] cc=x.y.z to your Cargo.toml.
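Putting the two tricks together, the classification might look roughly like the sketch below. The names `CrateKind`, `emits_link` and `classify` are hypothetical and not taken from the PR, and the sketch assumes the wrapper has already read CARGO_MANIFEST_DIR and remembered the manifest directory of the user's crate.

```rust
use std::path::Path;

/// The three situations in which Cargo may invoke the wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrateKind {
    BuildDependency,
    NormalDependency,
    Target,
}

/// Does any `--emit` flag in the rustc arguments request `link` output?
/// Handles both `--emit=a,b` and `--emit a,b`.
fn emits_link(rustc_args: &[String]) -> bool {
    let mut iter = rustc_args.iter();
    while let Some(arg) = iter.next() {
        let kinds = if let Some(rest) = arg.strip_prefix("--emit=") {
            Some(rest)
        } else if arg.as_str() == "--emit" {
            iter.next().map(String::as_str)
        } else {
            None
        };
        if let Some(kinds) = kinds {
            // An output kind may carry a path, as in `link=some/file`.
            if kinds.split(',').any(|k| k.split('=').next() == Some("link")) {
                return true;
            }
        }
    }
    false
}

/// `manifest_dir` is the value of CARGO_MANIFEST_DIR for this invocation;
/// `target_manifest_dir` was recorded by the initial `cargo miri` invocation.
fn classify(
    rustc_args: &[String],
    manifest_dir: &Path,
    target_manifest_dir: &Path,
) -> CrateKind {
    if emits_link(rustc_args) {
        // Under `cargo check`, only build dependencies and build scripts link.
        CrateKind::BuildDependency
    } else if manifest_dir == target_manifest_dir {
        CrateKind::Target
    } else {
        CrateKind::NormalDependency
    }
}
```

Checking `link` first is what separates a build.rs from the target crate itself, since both share the same manifest directory.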

Adding more information to Cargo

While I believe the assumptions behind the above Cargo hacks are fairly sound, this is not really a viable long-term solution. For example, Cargo could choose to stop passing the --emit flag if rustc's default were changed to match what Cargo already wanted.

Ideally, Cargo would set an environment variable to let us know which of the three cases we are in - target crate, build dependency, or normal dependency. I'm currently working on a PR that does just that.

Conclusion

I think our best path forward is to:

  1. Land cargo check support in xargo
  2. Merge some form of this PR, possibly in multiple pieces. Depending on how long review takes, it might make sense to merge the eh_catch_typeinfo hack into rustc so that nightly users can have a (somewhat) working Miri again
  3. Work with the Cargo team to expose the information we need in a cleaner way. I think this information could be of use to other Cargo wrappers. In particular, build dependencies have very different semantics from regular dependencies (e.g. they will be built for a different target during cross-compilation), but wrappers have no clean way of determining which they are building.

Labels: A-cargo (Area: affects the cargo wrapper (cargo miri)); C-project (Category: a larger project is being tracked here, usually with checkmarks for individual steps)