regression in deterministic codegen from 1.32 to 1.33 when using --remap-path-prefix to compensate for change in source dir #59542

Closed
@jhfrontz

I use a Gitian build environment to compile Rust source in a well-defined, stable environment. With Rust 1.32 (rust-1.32.0-1.el7.x86_64.rpm on CentOS 7), I can build objects and binaries deterministically (with the help of RUSTFLAGS="--remap-path-prefix=%{_builddir}=BUILDDIR -C link-arg=-Wl,--build-id=0x%{githash},-S" while running inside an rpmbuild). The 1.32 build is deterministic enough between runs that I can use diff (or cmp or sha256sum) to verify that executables produced on different runs are identical.
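A minimal sketch of that kind of check, for readers who want to reproduce it. The helper names same_artifact and artifact_digest are made up for illustration; only cmp and sha256sum come from the description above.

```shell
# Succeeds (exit status 0) only if the two files are byte-for-byte identical.
same_artifact() {
    cmp -s "$1" "$2"
}

# Prints the SHA-256 digest of a file, without the trailing filename.
artifact_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# Example use (paths are placeholders for two independent build outputs):
#   same_artifact run-1/usr/bin/program run-2/usr/bin/program && echo identical
```

cmp -s is enough for a yes/no answer; the digest is handy when the two builds live on different machines and only the hashes can be exchanged.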

However, as of 1.33 (rust-1.33.0-2.el7.x86_64.rpm on CentOS 7), I get significant variation from one run to another:

$ size *float-[12]/build/usr/bin/program
   text	   data	    bss	    dec	    hex	filename
6154818	 157808	    688	6313314	 605562	20190329-float-1/build/usr/bin/program
6148249	 157328	    688	6306265	 6039d9	20190329-float-2/build/usr/bin/program

Among other things, the layout of the address space seems to vary (sample):

$ sdiff <(objdump -d *float-1/build/usr/bin/program) <(objdump -d *float-2/build/usr/bin/program) | head -20

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx904e-20190329-float-1/bui |	xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx904e-20190329-float-2/bui


Disassembly of section .init:					Disassembly of section .init:

00000000001438b0 <_init>:				      |	00000000001430b0 <_init>:
  1438b0:	48 83 ec 08          	sub    $0x8,%rsp      |	  1430b0:	48 83 ec 08          	sub    $0x8,%rsp
  1438b4:	48 8b 05 c5 19 6c 00 	mov    0x6c19c5(%rip) |	  1430b4:	48 8b 05 ed 00 6c 00 	mov    0x6c00ed(%rip)
  1438bb:	48 85 c0             	test   %rax,%rax      |	  1430bb:	48 85 c0             	test   %rax,%rax
  1438be:	74 05                	je     1438c5 <_init+ |	  1430be:	74 05                	je     1430c5 <_init+
  1438c0:	e8 53 01 00 00       	callq  143a18 <.plt.g |	  1430c0:	e8 53 01 00 00       	callq  143218 <.plt.g
  1438c5:	48 83 c4 08          	add    $0x8,%rsp      |	  1430c5:	48 83 c4 08          	add    $0x8,%rsp
  1438c9:	c3                   	retq   		      |	  1430c9:	c3                   	retq   

Disassembly of section .plt:					Disassembly of section .plt:

00000000001438d0 <.plt>:				      |	00000000001430d0 <.plt>:
  1438d0:	ff 35 ba 16 6b 00    	pushq  0x6b16ba(%rip) |	  1430d0:	ff 35 42 ff 6a 00    	pushq  0x6aff42(%rip)
  1438d6:	ff 25 bc 16 6b 00    	jmpq   *0x6b16bc(%rip |	  1430d6:	ff 25 44 ff 6a 00    	jmpq   *0x6aff44(%rip

Was something intentionally changed in 1.33 that might cause this behavior?

EDIT incorporating subsequent info:

I found a smallish open-source project that demonstrates the issue. Run
build.sh.txt -- under 1.32, I get all good; under 1.33, I get bad stuff.

Activity

added the labels A-codegen (Area: Code generation) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) on Mar 30, 2019
jonas-schievink

jonas-schievink commented on Mar 30, 2019

@jonas-schievink
Contributor

AFAIK we have very few tests for reproducible builds, so this probably slipped in by accident.

cc @rust-lang/compiler - do we care enough about reproducible builds to consider this a stable-to-stable regression at this point? I know that quite a lot of people want reproducibility in their builds.

eddyb

eddyb commented on Mar 30, 2019

@eddyb
Member

This looks like a linker thing? Did we do anything wrt using lld or something like that?
cc @alexcrichton

eddyb

eddyb commented on Mar 30, 2019

@eddyb
Member

@jhfrontz Can you post a diff of symbol names and sizes? The assembly will ofc differ a lot, but I expect everything to be the same size, just shuffled around.

jhfrontz

jhfrontz commented on Mar 31, 2019

@jhfrontz
Author

@eddyb asks:

> Can you post a diff of symbol names and sizes?

Are you wanting the output of nm on the two same-but-different binaries? Or maybe (since you said size) you're wanting the output of objdump -t? But since you said shuffled around, maybe you're wanting objdump -h?

I'll include the last (objdump -h): diff-h.txt

jhfrontz

jhfrontz commented on Apr 1, 2019

@jhfrontz
Author

@eddyb I got approval to include the full objdump -t output:
objdump-t-1.txt.gz
objdump-t-2.txt.gz

alexcrichton

alexcrichton commented on Apr 1, 2019

@alexcrichton
Member

This is most frequently an accidental regression in LLVM, although we've had bugs slip in with rustc as well! No major change to linker behavior in 1.32 -> 1.33.

@jhfrontz the best way to get this fixed is if a test case can be reduced to isolate the issue. Would it be possible to minimize this to extract a small piece of code which exhibits the reproducibility issue?

jhfrontz

jhfrontz commented on Apr 12, 2019

@jhfrontz
Author

@alexcrichton asks:

> Would it be possible to minimize this to extract a small piece of code which exhibits the reproducibility issue?

I haven't been able to -- my simple "hello, world" toy programs always result in the same binary. It's only when I build production (proprietary) code that I'm seeing this. Full disclosure: I barely know enough rust to be able to run cargo. If there is a model "major project" in rust that I could/should run through my deterministic build environment, I'd be glad to try it and report back.

eddyb

eddyb commented on Apr 13, 2019

@eddyb
Member

Sorry, I didn't see the notification.
I was hoping for a diff that would be easy to spot the differences in, but I guess I'll download the two files and run kdiff3 on them when I get to the office.

eddyb

eddyb commented on Apr 13, 2019

@eddyb
Member

I just did this:

gunzip ~/Downloads/objdump-t-{1,2}.txt.gz
cat ~/Downloads/objdump-t-1.txt | grep _ZN | sed -E 's/^.* +([^ ]+)$/\1/' | sort > sym1
cat ~/Downloads/objdump-t-2.txt | grep _ZN | sed -E 's/^.* +([^ ]+)$/\1/' | sort > sym2

And... the hashes don't match. These are using the same rustc binary, right?

Can you run cargo with -v and grab the -C metadata arguments? If they differ between the two runs, then your problem is not even in rustc/LLVM.
If they're the same, try getting the sorted symbol list from a small rlib.
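One way to take the sym1/sym2 comparison a step further is to check whether the two lists differ only in their hash suffixes. The sketch below assumes the legacy Rust mangling, where a symbol ends in 17h followed by 16 hex digits and E; symbols with other suffixes (for example an extra .llvm.N tail) are left untouched. The function names are made up for illustration.

```shell
# Replace the hash suffix of a legacy-mangled Rust symbol
# (17h + 16 hex digits + E) with a fixed placeholder.
# Reads symbol names on stdin, writes them to stdout.
strip_symbol_hash() {
    sed -E 's/17h[0-9a-f]{16}E$/17hHASHE/'
}

# Print the symbols whose hash-stripped names appear in only one of the
# two lists. Empty output means the lists differ, at most, in their hashes.
symbols_differing_beyond_hash() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    strip_symbol_hash < "$1" | sort > "$tmp1"
    strip_symbol_hash < "$2" | sort > "$tmp2"
    comm -3 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}

# Example use with the files produced above:
#   symbols_differing_beyond_hash sym1 sym2
```

If the output is empty, the same items were compiled in both runs and only the per-symbol hashes changed, which points at the inputs to the hash rather than at codegen itself.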

jhfrontz

jhfrontz commented on Apr 16, 2019

@jhfrontz
Author

I'm using the same binary (everything is the same, including the time -- via libfaketime).

---> Package rust.x86_64 0:1.33.0-2.el7 will be installed
--> Processing Dependency: rust-std-static(x86-64) = 1.33.0-2.el7 for package: rust-1.33.0-2.el7.x86_64
--> Processing Dependency: libLLVM-7.so(LLVM_7)(64bit) for package: rust-1.33.0-2.el7.x86_64
--> Processing Dependency: libLLVM-7.so()(64bit) for package: rust-1.33.0-2.el7.x86_64

I'm building with cargo --frozen so, I'm pretty sure by definition the actual contents metadata is the same (though cargo -v -C metadata yields error: Found argument '-C' which wasn't expected, or isn't valid in this context).

However, the ordering of the crates listed in the metadata from two builds is different. Also, the src_path entries are different (but I would expect that to be irrelevant in the presence of the --remap-path-prefix mentioned above). At least, the src_path is irrelevant when using 1.32.

Attached are the dumps from nm on two different runs' libuuid rlibs. They're both quite different.

3-libuuid-nm.txt
4-libuuid-nm.txt

Just to recall -- this still works as expected (producing deterministic binary) if I revert to rustc version 1.32. I ran it twice with 1.32 and

  • the cargo -v metadata output was similarly differently ordered between the runs (but seemingly the same contents, with the exception of src_path).
  • the libuuid rlib files were identical (filenames, contents, symbol tables).
  • the resulting binaries were identical.
eddyb

eddyb commented on Apr 17, 2019

@eddyb
Member

Ah, yeah, --remap-path-prefix isn't perfect, so this could be a regression related to that.

Also, by cargo -v and -C metadata I meant running whatever cargo command you're running with -v (i.e. verbose) and extracting the -C metadata from the rustc invocations it prints to stderr.
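A rough sketch of that extraction, assuming the verbose output contains rustc command lines with a --crate-name argument followed later by -C metadata=<hex>. The function name extract_metadata and the run1.meta/run2.meta file names are made up for illustration.

```shell
# Read verbose cargo output on stdin and print one
# "<crate-name> <metadata-hash>" line per rustc invocation, sorted.
# Lines without both arguments are ignored.
extract_metadata() {
    sed -nE 's/.*rustc.*--crate-name ([^ ]+).* -C metadata=([0-9a-f]+).*/\1 \2/p' | sort
}

# Example use, once per build directory (cargo prints the command lines on stderr):
#   cargo build --frozen -v 2>&1 | extract_metadata > run1.meta
#   diff run1.meta run2.meta
```

Sorting makes the two lists comparable even when the crates are built in a different order between runs.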

> I'm building with cargo --frozen so, I'm pretty sure by definition the actual contents metadata is the same

--frozen is for your dependencies, but if you're building in a different directory, that might still affect certain details.

> However, the ordering of the crates listed in the metadata from two builds is different.

Not sure what you mean by "metadata" here (I was referring to a hash Cargo passes to each rustc invocation).

eddyb

eddyb commented on Apr 17, 2019

@eddyb
Member

Nominating for discussion at the next @rust-lang/compiler meeting, as this seems more and more like it's a regression in how we handle source paths.

michaelwoerister

michaelwoerister commented on May 6, 2019

@michaelwoerister
Member

Hey, to re-iterate: There is no way to have reproducible builds via Cargo from different source directories if Cargo feeds the RUSTFLAGS value into the -Cmetadata argument to rustc. In my opinion this is a bug in Cargo. Either Cargo needs to provide a way to pass the --remap-path-prefix argument to rustc without RUSTFLAGS or the new behavior should be made optional.
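To make the mechanism concrete, here is a toy illustration. It is not Cargo's actual hashing, which takes more inputs; it only shows that any value derived from the full RUSTFLAGS string changes when the build directory embedded in --remap-path-prefix changes, even though both directories are remapped to the same BUILDDIR. The names toy_metadata and flags_for and the /build/run-N paths are hypothetical.

```shell
# Toy stand-in for a metadata hash: the first 16 hex digits of the
# SHA-256 of the flag string. This is NOT Cargo's real algorithm.
toy_metadata() {
    printf '%s' "$1" | sha256sum | cut -c1-16
}

# Build the remap flag for a given (hypothetical) build directory.
flags_for() {
    printf '%s' "--remap-path-prefix=$1=BUILDDIR"
}

# Example use: the two values differ because the flag text differs.
#   toy_metadata "$(flags_for /build/run-1)"
#   toy_metadata "$(flags_for /build/run-2)"
```

The remapping makes the paths inside the artifacts identical, but the flag that requests the remapping still carries the original directory, so anything hashed from the flag text inherits the difference.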

jhfrontz

jhfrontz commented on May 7, 2019

@jhfrontz
Author

OK, it turns out that it is cargo that is causing the issue (@eddyb found that I wasn't actually using the 1.32 version of cargo in my earlier experiments). Now, with cargo-1.32 and rust-1.34, I get deterministic output:

$ find . -name create_test_daemon_conf | grep build/usr/bin | xargs sha256sum
b0d3d894800eb6c80286e5830a6f9c28394f9bb0ed6d1a52827bd813e8848e6a  ./artifacts-2/build/usr/bin/create_test_daemon_conf
b0d3d894800eb6c80286e5830a6f9c28394f9bb0ed6d1a52827bd813e8848e6a  ./artifacts/build/usr/bin/create_test_daemon_conf
eddyb

eddyb commented on May 7, 2019

@eddyb
Member

I've just hidden a bunch of comments that were really confusing because of unrigorous debugging.

Also, as we've determined this was entirely caused by Cargo, I've opened rust-lang/cargo#6914 instead, and I'm closing this issue on the Rust side.

added T-cargo (Relevant to the cargo team, which will review and decide on the PR/issue.) and removed T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) on Aug 14, 2024
