Description
I use a gitian build environment to compile Rust source in a well-defined, stable environment. When using Rust 1.32 (rust-1.32.0-1.el7.x86_64.rpm on CentOS 7), I can deterministically build objects and binaries with the help of RUSTFLAGS="--remap-path-prefix=%{_builddir}=BUILDDIR -C link-arg=-Wl,--build-id=0x%{githash},-S" while running inside an rpmbuild. The build process with 1.32 is so deterministic between runs that I can use diff (or cmp or sha256sum) to verify that two products/executables produced on different runs are identical.
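The check described above can be sketched as a small shell helper. This is only an illustration: the function name same_build is hypothetical, and the paths in the usage comment are the ones from the size listing in this report.

```shell
# Sketch: report whether two build outputs are byte-for-byte identical.
# same_build is a hypothetical helper name, not part of any tool in this thread.
same_build() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        # Print both digests so the mismatch can be recorded in a build log.
        sha256sum "$1" "$2"
        return 1
    fi
}

# Assumed usage:
#   same_build 20190329-float-1/build/usr/bin/program \
#              20190329-float-2/build/usr/bin/program
```

cmp -s gives the yes/no answer through its exit status; the digests are only printed on a mismatch.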
However, as of 1.33 (rust-1.33.0-2.el7.x86_64.rpm on CentOS 7), I get significant variation from one run to another:
$ size *float-[12]/build/usr/bin/program
   text    data     bss     dec     hex filename
6154818  157808     688 6313314  605562 20190329-float-1/build/usr/bin/program
6148249  157328     688 6306265  6039d9 20190329-float-2/build/usr/bin/program
Among other things, the layout of the address space seems to vary (sample):
$ sdiff <(objdump -d *float-1/build/usr/bin/program) <(objdump -d *float-2/build/usr/bin/program) | head -20
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx904e-20190329-float-1/bui | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx904e-20190329-float-2/bui
Disassembly of section .init: Disassembly of section .init:
00000000001438b0 <_init>: | 00000000001430b0 <_init>:
1438b0: 48 83 ec 08 sub $0x8,%rsp | 1430b0: 48 83 ec 08 sub $0x8,%rsp
1438b4: 48 8b 05 c5 19 6c 00 mov 0x6c19c5(%rip) | 1430b4: 48 8b 05 ed 00 6c 00 mov 0x6c00ed(%rip)
1438bb: 48 85 c0 test %rax,%rax | 1430bb: 48 85 c0 test %rax,%rax
1438be: 74 05 je 1438c5 <_init+ | 1430be: 74 05 je 1430c5 <_init+
1438c0: e8 53 01 00 00 callq 143a18 <.plt.g | 1430c0: e8 53 01 00 00 callq 143218 <.plt.g
1438c5: 48 83 c4 08 add $0x8,%rsp | 1430c5: 48 83 c4 08 add $0x8,%rsp
1438c9: c3 retq | 1430c9: c3 retq
Disassembly of section .plt: Disassembly of section .plt:
00000000001438d0 <.plt>: | 00000000001430d0 <.plt>:
1438d0: ff 35 ba 16 6b 00 pushq 0x6b16ba(%rip) | 1430d0: ff 35 42 ff 6a 00 pushq 0x6aff42(%rip)
1438d6: ff 25 bc 16 6b 00 jmpq *0x6b16bc(%rip | 1430d6: ff 25 44 ff 6a 00 jmpq *0x6aff44(%rip
Was something intentionally changed in 1.33 that might cause this behavior?
EDIT incorporating subsequent info:
I found a smallish open-source project that demonstrates the issue. Run build.sh.txt: under 1.32, I get all good; under 1.33, I get bad stuff.
Activity
jonas-schievink commented on Mar 30, 2019
AFAIK we have very few tests for reproducible builds, so this probably slipped in by accident.
cc @rust-lang/compiler - do we care enough about reproducible builds to consider this a stable-to-stable regression at this point? I know that quite a lot of people want reproducibility in their builds.
eddyb commented on Mar 30, 2019
This looks like a linker thing? Did we do anything wrt using lld or something like that?
cc @alexcrichton
eddyb commented on Mar 30, 2019
@jhfrontz Can you post a diff of symbol names and sizes? The assembly will ofc differ a lot, but I expect everything to be the same size, just shuffled around.
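One way to produce such a diff, sketched under the assumption that GNU nm is available: nm -S prints address, size, type and name for each sized symbol, so dropping the address column leaves only the parts that should match if the code was merely shuffled around. The helper name symbol_sizes is hypothetical.

```shell
# Sketch: reduce `nm -S` output to sorted "name size" pairs.
# symbol_sizes is a hypothetical helper; it reads nm output on stdin.
symbol_sizes() {
    # Sized symbols have four fields: address, size, type, name.
    # Undefined or unsized symbols have fewer fields and are skipped.
    awk 'NF == 4 { print $4, $2 }' | LC_ALL=C sort
}

# Assumed usage (bash, because of the process substitution):
#   diff <(nm -S *float-1/build/usr/bin/program | symbol_sizes) \
#        <(nm -S *float-2/build/usr/bin/program | symbol_sizes)
```

Sorting with LC_ALL=C keeps the ordering independent of the locale on the build machine.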
jhfrontz commented on Mar 31, 2019
@eddyb asks:
Are you wanting the output of nm on the two same-but-different binaries? Or maybe (since you said size) you're wanting the output of objdump -t? But since you said shuffled around, maybe you're wanting objdump -h?
I'll include the last (objdump -h): diff-h.txt
jhfrontz commented on Apr 1, 2019
@eddyb I got approval to include the full objdump -t output:
objdump-t-1.txt.gz
objdump-t-2.txt.gz
alexcrichton commented on Apr 1, 2019
This is most frequently an accidental regression in LLVM, although we've had bugs slip in with rustc as well! No major change to linker behavior in 1.32 -> 1.33.
@jhfrontz the best way to get this fixed is if a test case can be reduced to isolate the issue. Would it be possible to minimize this to extract a small piece of code which exhibits the reproducibility issue?
jhfrontz commented on Apr 12, 2019
@alexcrichton asks:
I haven't been able to. My simple "hello, world" toy programs always result in the same binary. It's only when I build production (proprietary) code that I'm seeing this. Full disclosure: I barely know enough Rust to be able to run cargo. If there is a model "major project" in Rust that I could/should run through my deterministic build environment, I'd be glad to try it and report back.
eddyb commented on Apr 13, 2019
Sorry, I didn't see the notification.
I was hoping for a diff that would be easy to spot the differences in, but I guess I'll download the two files and run kdiff3 on them, when I get to the office
eddyb commented on Apr 13, 2019
I just did this:
And... the hashes don't match. These are using the same rustc binary, right?
Can you run cargo with -v and grab the -C metadata arguments? If they differ between the two runs, then your problem is not even in rustc/LLVM.
If they're the same, try getting the sorted symbol list from a small rlib.
jhfrontz commented on Apr 16, 2019
I'm using the same binary (everything is the same, including the time -- via libfaketime).
I'm building with cargo --frozen so I'm pretty sure that, by definition, the actual contents of the metadata are the same (though cargo -v -C metadata yields error: Found argument '-C' which wasn't expected, or isn't valid in this context).
However, the ordering of the crates listed in the metadata from two builds is different. Also, the src_path entries are different (but I would expect that to be irrelevant in the presence of the --remap-path-prefix mentioned above). At least, the src_path is irrelevant when using 1.32.
Attached are the dumps from nm on two different runs' libuuid rlibs. They're both quite different.
3-libuuid-nm.txt
4-libuuid-nm.txt
Just to recall: this still works as expected (producing a deterministic binary) if I revert to rustc version 1.32. I ran it twice with 1.32 and the cargo -v metadata output was similarly differently ordered between the runs (but seemingly the same contents, with the exception of src_path).
eddyb commented on Apr 17, 2019
Ah, yeah, --remap-path-prefix isn't perfect, so this could be a regression related to that.
Also, by cargo -v and -C metadata I meant running whatever cargo command you're running with -v (i.e. verbose) and extracting the -C metadata from the rustc invocations it prints to stderr.
--frozen is for your dependencies, but if you're building in a different directory, that might still affect certain details.
Not sure what you mean by "metadata" here (I was referring to a hash Cargo passes to each rustc invocation).
eddyb commented on Apr 17, 2019
Nominating for discussion at the next @rust-lang/compiler meeting, as this seems more and more like it's a regression in how we handle source paths.
eddyb commented on May 5, 2019
apoelstra commented on May 5, 2019
eddyb commented on May 5, 2019
michaelwoerister commented on May 6, 2019
Hey, to re-iterate: There is no way to have reproducible builds via Cargo from different source directories if Cargo feeds the RUSTFLAGS value into the -Cmetadata argument to rustc. In my opinion this is a bug in Cargo. Either Cargo needs to provide a way to pass the --remap-path-prefix argument to rustc without RUSTFLAGS or the new behavior should be made optional.
eddyb commented on May 6, 2019
michaelwoerister commented on May 6, 2019
jhfrontz commented on May 7, 2019
OK, it turns out that it is cargo that is causing the issue (@eddyb found that I wasn't actually using the 1.32 version of cargo in my earlier experiments). Now, with cargo-1.32 and rust-1.34, I get deterministic output:
eddyb commented on May 7, 2019
I've just hidden a bunch of comments that were really confusing because of unrigorous debugging.
Also, as we've determined this was entirely caused by Cargo, I've opened rust-lang/cargo#6914 instead, and I'm closing this issue on the Rust side.
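The mechanism michaelwoerister describes above can be illustrated with a toy fingerprint. This is not Cargo's actual hashing code: sha256sum stands in for whatever hash Cargo uses, and flags_digest and the /builds/run1 and /builds/run2 directories are hypothetical.

```shell
# Illustration only: sha256sum stands in for whatever hash Cargo really uses.
# flags_digest is a hypothetical helper that fingerprints a RUSTFLAGS string.
flags_digest() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# The same source checked out in two different (made-up) build directories.
run1=$(flags_digest "--remap-path-prefix=/builds/run1=BUILDDIR")
run2=$(flags_digest "--remap-path-prefix=/builds/run2=BUILDDIR")

# Both flags remap to the same BUILDDIR, but the flag text itself differs,
# so any hash that covers the flag text differs between the two runs.
if [ "$run1" = "$run2" ]; then
    echo "same fingerprint"
else
    echo "different fingerprint"
fi
```

The remapping makes the paths inside the output identical, yet the flag that requests it still carries the original directory, which is why hashing RUSTFLAGS defeats it.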