Skip to content

Rust 1.43 fails to compile itself #71502

Closed
@thomasjfox

Description

@thomasjfox

Hi all,

my usual test: Can Rust compile itself after a compiler upgrade.
Rust 1.43 dies at this stage:

Building stage0 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
running: "/datastore/dev/rust/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "32" "--release" "--frozen" "--features" " llvm" "--manifest-path" "/datastore/rpmbuild/BUILD/rustc-1.43.0-src/src/rustc/Cargo.toml" "--message-format" "json-render-diagnostics"
error: failed to run `rustc` to learn about target-specific information

Caused by:
  process didn't exit successfully: `/datastore/rpmbuild/BUILD/rustc-1.43.0-src/build/bootstrap/debug/rustc - --crate-name ___ --print=file-names --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=cfg` (exit code: 1)

--- stderr
error: character literal may only contain one codepoint
 --> <anon>:3:35
  |
3 |          messages will be sent to 'stderr'.
  |                                   ^^^^^^^^
  |
help: if you meant to write a `str` literal, use double quotes
  |
3 |          messages will be sent to "stderr".
  |                                   ^^^^^^^^

error: character literal may only contain one codepoint
 --> <anon>:6:35
  |
6 |          messages will be sent to 'stderr'.
  |                                   ^^^^^^^^
  |
help: if you meant to write a `str` literal, use double quotes
  |
6 |          messages will be sent to "stderr".
  |                                   ^^^^^^^^

error: unterminated character literal
 --> <anon>:9:20
  |
9 | FATAL: Cannot open '/dev/null' for writing.
  |                    ^

error: aborting due to 3 previous errors


command did not execute successfully: "/datastore/dev/rust/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "32" "--release" "--frozen" "--features" " llvm" "--manifest-path" "/datastore/rpmbuild/BUILD/rustc-1.43.0-src/src/rustc/Cargo.toml" "--message-format" "json-render-diagnostics"
expected success, got: exit code: 101

I didn't identify the problematic code yet, I just wanted to create a tracking issue first.
Rust 1.43 can compile other crates fine, I quickly tested rust-zip or rust-askama.

Activity

jonas-schievink

jonas-schievink commented on Apr 24, 2020

@jonas-schievink
Contributor

Do you need this for anything? This is not a supported configuration

thomasjfox

thomasjfox commented on Apr 24, 2020

@thomasjfox
Author

Do you need this for anything? This is not a supported configuration

yes, Linux distributions often need to recompile a (compiler) package with some patches applied or after a major LLVM / glibc / other major update.

See #69953 for the discussion on the Rust 1.42.0 release.

infinity0

infinity0 commented on Apr 24, 2020

@infinity0
Contributor

This is not a supported configuration

Really? I thought this has been an attempted guarantee since years ago. (Even though it was broken for various previous releases.) Debian certainly needs this guarantee and I will have to figure out how to fix this when packaging 1.43 for Debian.

This one seems like a new case not similar to the previous cases, gah.

jonas-schievink

jonas-schievink commented on Apr 24, 2020

@jonas-schievink
Contributor

Well, we don't test this at all, so it's not surprising that it breaks almost every release. Why can't Debian just use the previous release to bootstrap?

infinity0

infinity0 commented on Apr 24, 2020

@infinity0
Contributor

We do, but the porters (for tier-2 architectures like riscv64) and other people occasionally need to rebuild stuff afterwards, when the old version is no longer available (has been moved to archives).

Mark-Simulacrum

Mark-Simulacrum commented on Apr 24, 2020

@Mark-Simulacrum
Member

It's not really unsupported so much as not tested. I would ask that if Debian (or others) want this to work we find out before the stable release -- ideally when beta is cut, or earlier -- that way we can do much more to fix it.

I'll try to take a look at this specific failure soon and see if there's a simple patch.

est31

est31 commented on Apr 24, 2020

@est31
Member

Bootstrapping rustc twice would make merging PRs much slower than it's today. However, maybe one could add it only to the beta branch's CI. Or, alternatively, add rustc to crater. The "old state" would serve as check whether it can bootstrap normally, the "new state" would check whether it can bootstrap itself. There's a proposal to add servo already: rust-lang/crater#133

Mark-Simulacrum

Mark-Simulacrum commented on Apr 24, 2020

@Mark-Simulacrum
Member

I don't think there's been any proposal that we test this in PR CI? To be clear, I don't currently believe that's something we want to do, though we'd generally be willing to accept PRs that make self-bootstrapping possible, which is why I suggested testing on beta releases (instead of stable releases, where we can't do much so all patches must be downstream).

est31

est31 commented on Apr 24, 2020

@est31
Member

I don't think there's been any proposal that we test this in PR CI?

It wasn't in response to any proposal or such, just about how to stop regressions like this from happening. It has been mentioned in the thread that it's not being tested. So naturally one would wonder how the testing would look like so I wrote down my thoughts.

Also, yes, optimally the users would test beta releases (question to @thomasjfox have you tested 1.44 beta yet?), and I've filed a pr for rav1e today to do precisely this, but the entire premise of CI as well as crater is to be proactive about breakage.

infinity0

infinity0 commented on Apr 24, 2020

@infinity0
Contributor

Understood, I will make some more effort to test the beta releases! We had done so a few times in the past but not really made it a regular part of the process.

infinity0

infinity0 commented on Apr 24, 2020

@infinity0
Contributor

I don't think there's been any proposal that we test this in PR CI?

It wasn't in response to any proposal or such, just about how to stop regressions like this from happening.

I agree it would be too onerous for regular PRs, but perhaps it could be done when anointing nightly as beta, and then on all PRs to beta, that sounds not too expensive.

cuviper

cuviper commented on Apr 27, 2020

@cuviper
Member

@thomasjfox

I didn't identify the problematic code yet, I just wanted to create a tracking issue first.

Have you found this yet? Maybe something wrong with your /dev/null? I also don't see "messages will be sent to 'stderr'" anywhere in the Rust sources, but googling indicates this often comes from syslog. I don't know why that would be fed into your rustc stdin though...

FWIW, the automatic rebuild in Fedora rawhide was just fine:
https://koji.fedoraproject.org/koji/taskinfo?taskID=43860983

thomasjfox

thomasjfox commented on Apr 28, 2020

@thomasjfox
Author

Yesterday evening I continued with the investigation. I managed to recompile the Fedora 31 source rpm on my Fedora 31 workstation. So it must be something triggered by my "custom distro" setup.

The plan for today is to limit the number of parallel build jobs to one and run "forkstat" in the background. I want to isolate the exact command invocation that triggers the error.

The /dev/null device should be fine, the build process is not running inside a container or chroot.

thomasjfox

thomasjfox commented on Apr 28, 2020

@thomasjfox
Author

Have you found this yet? Maybe something wrong with your /dev/null?

I think we might be onto something here. After I had another failed Rust 1.43.0 re-compile attempt, I noticed /dev/null was broken:

# ll /dev/null 
-rw-r--r-- 1 root root 0 Apr 28 18:14 /dev/null

(tmux crashed on startup, that's how I noticed)

I've rebooted the devbox, /dev/null is fine again and I will now compile the previous Rust 1.42.0 twice. If /dev/null stays fine, I'll re-try with Rust 1.43.0 and see if /dev/null gets replaced.

thomasjfox

thomasjfox commented on Apr 28, 2020

@thomasjfox
Author

the only remotely related change to /dev/null handling from 1.42.0 to 1.43.0 is this c2bbe33
in src/libstd/sys/unix/process/process_common.rs. It looks totally unsuspicious to me.

cuviper

cuviper commented on Apr 28, 2020

@cuviper
Member

That kind of error usually occurs when one uses something like -o /dev/null, versus redirecting >/dev/null, since -o usually writes temp names first and then renames over the target. I did find this in one test, new in 1.42 via #67458:

// compile-flags: -o /dev/null

cuviper

cuviper commented on Apr 28, 2020

@cuviper
Member

When I run that test manually as root, I do indeed get a clobbered /dev/null:

# rustc -o /dev/null src/test/ui/non-ice-error-on-worker-io-fail.rs
warning: unused import: `crate::task::Waker`
  --> src/test/ui/non-ice-error-on-worker-io-fail.rs:34:13
   |
34 |         use crate::task::Waker;
   |             ^^^^^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

# ls -l /dev/null
-rw-r--r--. 1 root root 3562 Apr 28 09:48 /dev/null
Mark-Simulacrum

Mark-Simulacrum commented on Apr 28, 2020

@Mark-Simulacrum
Member

Running tests as root is indeed very trusting :)

But it seems like in this case we should probably write to e.g. /dev/rust-nonexistent-path instead of null to avoid the accidental breakage? Alternatively maybe we can move this test to a make test and chmod our permissions away...

cuviper

cuviper commented on Apr 28, 2020

@cuviper
Member

I don't know if a nonexistent path will exercise the intended failure mode, but a simple chmod is insufficient to stop root.

thomasjfox

thomasjfox commented on Apr 28, 2020

@thomasjfox
Author

There's the immutable xattr, even root can't delete those files. This is how Android malware persists across factory resets:
https://arstechnica.com/information-technology/2020/04/solved-how-android-backdoor-called-xhelper-survives-factory-resets/

But requiring filesystem xattrs just for one unit test is a no no.
Another option would be a read only bind mount, but again, the setup is overkill.

May be a read only place in proc like /proc/version would do:

[root@host ~]# rm -f /proc/version 
rm: cannot remove '/proc/version': Operation not permitted

[root@host ~]# echo "test" >>/proc/version 
-bash: echo: write error: Input/output error

I've started a build without the unit test in question for the night, results will be in tomorrow.

thomasjfox

thomasjfox commented on Apr 29, 2020

@thomasjfox
Author

I've started a build without the unit test in question for the night, results will be in tomorrow.

with the mentioned unit test disabled, I can finally build Rust 1.43.0 with itself. Rustception all the way, thanks!

I've verified again it didn't blow up with Rust 1.42.0, even though the unit test was part of it already.
May be the way rustc handles output file creation changed in 1.43.0, so these two in combination led to failure.

I've tested the /proc/version approach:

[root@host]# rustc -o /proc/version non-ice-error-on-worker-io-fail.rs 
warning: unused import: `crate::task::Waker`
  --> non-ice-error-on-worker-io-fail.rs:34:13
   |
34 |         use crate::task::Waker;
   |             ^^^^^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

error: failed to write bytecode to /proc/version.non_ice_error_on_worker_io_fail.3a1fbbbh-cgu.0.rcgu.bc.z: No such file or directory (os error 2)

error: aborting due to previous error

Using an existing directory as output file gives a different error message:

[root@host]# rustc -o existing-dir-instead-of-file non-ice-error-on-worker-io-fail.rs 
error: the generated executable for the input file "non-ice-error-on-worker-io-fail.rs" conflicts with the existing directory "existing-dir-instead-of-file"

error: aborting due to previous error
cuviper

cuviper commented on Apr 29, 2020

@cuviper
Member

I think -o /does-not-exist/output will be fine, even though it's a different IO error. It still triggers the old ICE on 1.41.0, and is caught as a proper fatal error on 1.42+, even as root.

Mark-Simulacrum

Mark-Simulacrum commented on Apr 30, 2020

@Mark-Simulacrum
Member

I'd be happy to review a PR here, FWIW, it sounds like it should be a pretty simple update to the UI test.

added a commit that references this issue on May 2, 2020

Rollup merge of rust-lang#71782 - cuviper:leave-dev-null-alone, r=Mar…

2ee49eb
thomasjfox

thomasjfox commented on May 15, 2020

@thomasjfox
Author

Just wanted to say thank you for the quick fix and happy birthday to Rust!

Also Rust 1.43.1 compiled itself without problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-bugCategory: This is a bug.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @cuviper@infinity0@thomasjfox@jonas-schievink@Mark-Simulacrum

      Issue actions

        Rust 1.43 fails to compile itself · Issue #71502 · rust-lang/rust