bootstrap atop nix atop non-nix (old, Linux) OS fails with inscrutable errors #115073

Closed
@pnkfelix

After being introduced to Nix by a colleague, I have been trying to use the Nix package manager as a basis for rustc development.

However, if I install Nix atop an old Linux OS distribution (or, more precisely, atop a Linux OS that has a relatively old glibc version), I hit problems during the bootstrap step for building rustc locally.

For example, if I follow these steps:

  1. Set up a fresh Ubuntu 16 Linux machine. (It's possible that you might see this atop some newer Ubuntus; I just grabbed something I was confident was old enough (2019) to reproduce the problem at hand.)
  2. Install Nix on that machine, using e.g. https://github.com/DeterminateSystems/nix-installer#the-determinate-nix-installer
  3. Run nix develop nixpkgs#rustc, to establish a subshell that has a nix-based context with the dependencies necessary to build rustc (things like the C compiler, cmake, python3, etc)
  4. curl -O https://static.rust-lang.org/dist/rustc-1.71.1-src.tar.gz to download the source distribution from the project
  5. tar xzf rustc-1.71.1-src.tar.gz
  6. cd rustc-1.71.1-src/
  7. echo 'profile = "compiler"' > config.toml
  8. ./x.py build --stage 1

then, for me, that last command terminates with:

[...]
   Compiling clap_derive v4.2.0
   Compiling clap v4.2.4
error[E0519]: the current crate is indistinguishable from one of its dependencies: it has the same crate-name `clap_derive` and was compiled with the same `-C metadata` arguments. This will result in symbol conflicts between the two.
   --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap-4.2.4/src/lib.rs:101:9
    |
101 | pub use clap_derive::{self, *};
    |         ^^^^^^^^^^^

For more information about this error, try `rustc --explain E0519`.
error: could not compile `clap` (lib) due to previous error
warning: build failed, waiting for other jobs to finish...
failed to run: /home/ubuntu/scratch/rustc-1.71.1-src/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /home/ubuntu/scratch/rustc-1.71.1-src/src/bootstrap/Cargo.toml

There are a couple of different issues here.

  1. The error message above is pretty unfortunate. I believe it's due to a problem that was fixed by PR #111461, "Fix symbol conflict diagnostic mistakenly being shown instead of missing crate diagnostic" (and we're only seeing it here because we are bootstrapping 1.71 atop 1.70, and 1.70 didn't have that fix).
  2. Even with the fix provided by PR #111461, the error message is still going to be a bit frustrating. On the current rust repo, I instead see:
   Compiling clap_derive v4.2.0
   Compiling clap v4.2.4
error[E0463]: can't find crate for `clap_derive`
   --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap-4.2.4/src/lib.rs:101:9
    |
101 | pub use clap_derive::{self, *};
    |         ^^^^^^^^^^^ can't find crate

For more information about this error, try `rustc --explain E0463`.
  3. Older rustc versions may give error messages that provide a bit of a better clue as to what is going wrong here. E.g. when trying to bootstrap Rust 1.68.1, I see:
error: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by /home/ubuntu/scratch/rustc-1.68.1-src/build/bootstrap/debug/deps/libserde_derive-0df88016bb9bb232.so)
   --> /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.137/src/lib.rs:292:1
    |
292 | extern crate serde_derive;
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0432]: unresolved imports `self::__private`, `self::__private`
   --> /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.137/src/lib.rs:274:5
    |
274 | use self::__private as export;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^
275 | #[allow(unused_imports)]
276 | use self::__private as private;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0432`.
error: could not compile `serde` due to 2 previous errors
failed to run: /home/ubuntu/scratch/rustc-1.68.1-src/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /home/ubuntu/scratch/rustc-1.68.1-src/src/bootstrap/Cargo.toml
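
The `GLIBC_2.28' not found` line is the useful clue in that output. As a rough sketch (not part of bootstrap.py; the helper names `required_glibc`, `host_glibc` and `explains_failure` are made up for illustration), one could extract the required version from such a message and compare it against the glibc that the host actually provides:

```python
import platform
import re

# Matches the dynamic loader's complaint, e.g.
#   version `GLIBC_2.28' not found
_GLIBC_RE = re.compile(r"version `GLIBC_(\d+(?:\.\d+)+)' not found")


def required_glibc(message):
    """Return the glibc version demanded in `message` as a tuple, or None."""
    match = _GLIBC_RE.search(message)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def host_glibc():
    """Return the glibc version the running Python is linked against, or None."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None
    return tuple(int(part) for part in version.split("."))


def explains_failure(message, host):
    """True if `message` asks for a newer glibc than the `host` version tuple."""
    need = required_glibc(message)
    return need is not None and host is not None and host < need
```

For instance, `explains_failure(log_line, host_glibc())` would report whether the system glibc is older than what the freshly built proc-macro library was linked against. Note that `platform.libc_ver()` inspects the Python executable by default, so inside a Nix shell it may describe Nix's glibc rather than the host's; treat the result as a hint only.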


I now think I understand, in broad strokes, why this is happening.

It is happening because on Nix, we need to patch the binary's dynamic linker (.interp) and dynamic library search path (rpath/RUNPATH) so that they point at the Nix-specific values ...

    def fix_bin_or_dylib(self, fname):
        """Modifies the interpreter section of 'fname' to fix the dynamic linker,
        or the RPATH section, to fix the dynamic library search path

        This method is only required on NixOS and uses the PatchELF utility to
        change the interpreter/RPATH of ELF executables.

... and the logic in bootstrap.py that drives this choice is a heuristic that assumes if you're using Nix, it must be NixOS.

    def should_fix_bins_and_dylibs(self):
        """Whether or not `fix_bin_or_dylib` needs to be run; can only be True
        on NixOS.
        """
        if self._should_fix_bins_and_dylibs is not None:
            return self._should_fix_bins_and_dylibs

        def get_answer():
            default_encoding = sys.getdefaultencoding()
            try:
                ostype = subprocess.check_output(
                    ['uname', '-s']).strip().decode(default_encoding)
            except subprocess.CalledProcessError:
                return False
            except OSError as reason:
                if getattr(reason, 'winerror', None) is not None:
                    return False
                raise reason

            if ostype != "Linux":
                return False

            # If the user has asked binaries to be patched for Nix, then
            # don't check for NixOS or `/lib`.
            if self.get_toml("patch-binaries-for-nix", "build") == "true":
                return True
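
To check which dynamic linker a given binary actually requests (the .interp value discussed above), something like the following could be used. This is only a sketch under stated assumptions: it handles little-endian ELF64 images only, the function name `read_interp` is hypothetical, and it is not how bootstrap.py works (bootstrap.py delegates to the PatchELF utility).

```python
import struct

PT_INTERP = 3  # program header type whose contents name the dynamic linker

# ELF64 file header and program header layouts (little-endian).
_EHDR = "<16sHHIQQQIHHHHHH"
_PHDR = "<IIQQQQQQ"


def read_interp(data):
    """Return the PT_INTERP path stored in an ELF64 image, or None if absent.

    `data` is the raw bytes of the file. Raises ValueError for anything
    that is not a little-endian ELF64 image.
    """
    if len(data) < struct.calcsize(_EHDR) or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if data[4] != 2 or data[5] != 1:  # EI_CLASS == ELFCLASS64, EI_DATA == LSB
        raise ValueError("only little-endian ELF64 is handled in this sketch")

    header = struct.unpack_from(_EHDR, data, 0)
    phoff, phentsize, phnum = header[5], header[9], header[10]

    for index in range(phnum):
        entry = struct.unpack_from(_PHDR, data, phoff + index * phentsize)
        p_type, p_offset, p_filesz = entry[0], entry[2], entry[5]
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Calling `read_interp(open(path, "rb").read())` on a downloaded stage0 binary would show whether it still points at a conventional loader path or at one under the Nix store. A statically linked binary has no PT_INTERP entry and yields `None`.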

The problem is that some contributors are going to use Nix outside of NixOS, e.g. in the manner described by the steps above, and they need some kind of accommodation here.

(To be clear: Most people using Nix, inside or outside of NixOS, are not going to be using our distributed tarballs nor running the x.py in those tarballs at all. Most people using Nix are going to use Nix's own package management system, which has already incorporated its own logic for patching the binaries in the necessary manner here.)


So, action items:

  1. At bare minimum, the bootstrap config.toml should be slightly generalized, to provide a user-accessible key that will control that fix_bin_or_dylib patching behavior, where one can explicitly opt in, opt out, or fall back on whatever heuristic is currently in place to infer the right value here. (This exists, as the patch-binaries-for-nix key shown above, though it perhaps should be generalized slightly; see comment below.)
  2. After generalizing the config.toml, next is to try to generalize the aforementioned heuristic logic to cover this non-NixOS case.
  3. Finally, I would like to explore whether any other issues in the rust repo related to these various "cannot resolve crate" type errors are actually due to this kind of failure to patch the binary (i.e., failure to account for a mismatch between the dynamic linker and/or libc assumed at build time, vs the actual dynamic linker and/or libc we encounter at runtime). This step is a bit less concrete because I am currently not certain whether the other issues that I noted while looking at this actually are instances of such a mismatch, but if there is a chance that they are, then we should consider extending the rustc --version --verbose output to try to provide some hints that might tell us that is the underlying problem here.
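
The first two action items can be sketched as a tri-state decision: an explicit "true" or "false" wins, and only an absent key falls through to a heuristic. The code below is a hypothetical illustration, not the bootstrap.py implementation. It assumes the setting arrives as the strings "true"/"false" (as in the `get_toml` comparison quoted above) or `None` when unset, and the `/nix/store` check is just one guess at what a heuristic covering non-NixOS installs might look at.

```python
import os


def nix_store_present(path_exists=os.path.isdir):
    """Hypothetical heuristic: treat a /nix/store directory as a sign of Nix.

    This is an assumption of the sketch; it says nothing about whether the
    toolchain currently in use actually comes from Nix.
    """
    return path_exists("/nix/store")


def should_patch_binaries(setting, heuristic=nix_store_present):
    """Decide whether downloaded binaries should be patched.

    `setting` is "true" (explicit opt-in), "false" (explicit opt-out), or
    None (key absent, so defer to `heuristic`).
    """
    if setting == "true":
        return True
    if setting == "false":
        return False
    if setting is None:
        return bool(heuristic())
    raise ValueError("unexpected value for the patching key: %r" % (setting,))
```

Keeping the heuristic as a parameter means the inference logic can be widened later (item 2) without touching the explicit opt-in and opt-out paths (item 1).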

Labels: C-bug (Category: This is a bug.), T-bootstrap (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap))
