Description
During compiler development, I periodically take snapshots of my git working tree, copy them to my workstation, and then run a bunch of tests. The script I run on the workstation currently looks like this:
cd $GOROOT/test
git checkout $TRYREV
go install -toolexec=toolstash cmd/compile
go build -a -v -gcflags=-d=inlfuncswithclosures=0 -toolexec='toolstash -cmp' std cmd
go install -v std cmd/cgo cmd/go cmd/link
go install -v -x cmd/cgo cmd/go cmd/link # why??
go run run.go
go test -short std cmd
I've added the second go install -v -x cmd/cgo cmd/go cmd/link
invocation because otherwise tests like cmd/link's TestDWARF fail because cmd/go says cmd/link is stale. I would expect the one go install
invocation to be sufficient, but the go install -x
output (e.g., https://gist.github.com/mdempsky/bdfc2247df2bba822b833b78309660e9 has the output from my most recent script run) confirms that cmd/cgo, cmd/go, and cmd/link are all being relinked and reinstalled.
The staleness issue bites me quite frequently during my workflow, but I haven't managed to isolate steps that reliably reproduce the failure.
/cc @bcmills
Activity
ianlancetaylor commentedon May 26, 2021
This is my understanding.
Running
go install cmd/link
will update the installedlink
tool if the hash code has changed. The hash code of linking a program includes the hash code ofcmd/link
itself. So the first install ofcmd/link
gives you alink
tool with a hash code that records thelink
tool used to build it. When you rungo install
again, the link tool has changed, so the hash code has changed, so the linker gets rebuilt and reinstalled. This time, it won't actually change, so anothergo install cmd/link
will see that hash code matches and it won't do anything.This is basically why
make.bash
builds the tools three times. See this comment from cmd/dist/build.go:At least, that is my understanding.
thanm commentedon May 26, 2021
It is interesting to get some insight into this problem.
I run the cmd/link TestDWARF tests by hand a good deal (since I spend a fair amount of time making changes to the dwarf code), and it is frustrating to have the the test inform me that the linker is stale right after I just did "go install cmd/link".
Maybe we need a new "go installthreetimes cmd/link" or something like that.
mdempsky commentedon May 26, 2021
I see. I knew bootstrapping reinstalls multiple times, but I guess I assumed cmd/go would handle the details after that.
If it stays the user's responsibility to run
go install
to fixpoint and things are working as intended, then that's surprising, but okay.rsc commentedon May 26, 2021
Re "the user's responsiblity", this is not something that affects ordinary users. It is something that affects only people who install the tools that the go command uses to build things, namely 'cmd'.
The go command should not go out of its way to add special cases for developing the toolchain itself. That is the job of cmd/dist.
Apologies for needing to run 'go install' multiple times to reach a fixed point, but this is a fundamental consequence of the interaction between (1) cmd/link essentially depending on itself and (2) content-hash-based staleness.
mvdan commentedon May 26, 2021
For this reason I use
toolstash restore && go install cmd/xxx
if I'm hacking on the toolchain itself. This speeds up the iteration by making sure that I always build with the master version (stored withtoolstash save
), so I don't need to rebuild half of std and cmd again.If we agree that this is something we should simply document better, maybe we can add it to #30074.
mdempsky commentedon May 26, 2021
Ack, I agree this issue has limited scope of people affected. But as someone who works primarily on cmd/compile, I still consider myself a user of cmd/go. That's all I meant.
I'm not very familiar with cmd/dist's full functionality. After I make changes to cmd/compile, is there a cmd/dist invocation to bring things back to a fixed point with the minimal amount of redundant work?
I generally avoid running the full make.bash/all.bash script because I like the flexibility of tweaking my test script to reorder tests to prioritize whatever particular corner case I'm actively dealing with. Waiting for the complete triple-bootstrap really slows down my test cycles when I'm trying to figure out why my change broke fixedbugs/issue12345.go.
Yeah, I'm thinking I should use
toolstash restore
in my script instead of-toolexec=toolstash
. The latter is just muscle memory from doing cmd/compile-only hacking.But does that actually address the fixpoint issue?
mdempsky commentedon May 26, 2021
Edit: Nevermind, this was likely my fault. I forgot that I'd changed ~/pkg/toolstash/compile to a symlink so I could easily switch between a few stashed compilers. I guess that worked with
toolstash save
but not withtoolstash restore
.My build script is currently:But I keep running into a situation where the initialgo install cmd/compile
command seems to try recompiling cmd/compile with a broken compiler instead of the known-good version that I stashed and restored.For example, right now I have this reliable behavior from cmd/go:Why is cmd/go giving me different behavior depending on whether I use-toolexec=toolstash
ortoolstash restore
immediately before?(I'm not surprised about the compiler error: that's explained by a previous change I made. But I've reverted it in my source tree, and it wasn't in the stashed compiler, so I don't understand why the failure is persisting.)rsc commentedon May 27, 2021
If you want a complete fixed-point where nothing at all in the tree appears to be stale after modifying the sources to cmd/compile or cmd/link, then the minimum is
Looking at the inputs and outputs of each (the input to all steps is the same modified source code, so I'm only looking at the toolchain inputs, namely which toolchain is being used):
Inputs = original toolchain.
Outputs = new toolchain as compiled by original.
Inputs = new toolchain as compiled by original.
Outputs = new toolchain as compiled by (new toolchain as compiled by original).
Inputs = new toolchain as compiled by (new toolchain as compiled by original).
Outputs = new toolchain as compiled by (new toolchain as compiled by (new toolchain as compiled by original)).
Because we are using content-based (as opposed to mtime-based) staleness, a particular build result is indexed by the content hashes of the input sources (not changing here) and the binaries being used.
The binaries used as inputs in steps 1 and 2 are certainly different, since the one in step 2 has the modifications compiled into it.
The binaries used as inputs in steps 2 and 3 are different as well, at least when the modifications affect generated code.
Consider the case where the modifications contain an optimization that applies to some part of the compiler.
The output of step 1 = input of step 2 is an unoptimized compilation of the modified compiler sources.
The output of step 2 = input of step 3 is the optimized compilation of the modified compiler sources.
In the absence of compiler bugs, these two toolchains, although different in binary form, should be semantically equivalent.
Assuming those two toolchains really are semantically equivalent, they should produce identical binaries as output, except for the fact that the input hashes recorded in the binaries are different. We can't just rewrite the hashes, because maybe the toolchains aren't semantically equivalent. That is, maybe there's a bug in the compiler (it happens!).
Assuming the two toolchains really behave identically, then step 3 is a semantic repeat of step 2, just using a toolchain with different binaries. So it should produce the same outputs. But these outputs will have been compiled with themselves as their own input hashes, proving that we really have reached a fixed point and that the toolchains really are semantically equivalent, at least as applied to their own source code.
If go list says any part of cmd is stale after step 3, then there is a bug in the toolchain that is causing it not to produce deterministic outputs. This is what the staleness check in cmd/dist looks for.
Finally we need to recompile the standard library with the fixed-point compilers, which is step 4.
All this only matters if you are trying to make the compilers themselves not be reported as stale. The go command has no problem invoking stale compilers, so when I'm working on the toolchain, I just do one
go install cmd
(orgo install cmd/link
etc, whatever I'm changing) and then immediately move on togo test -short std
. If you trust that the existing compiler works well enough to compile your changes, then there's no problem not forcing the fixed point. You're just trusting instead of verifying. At that point the only tests that fail are the ones that are picky about staleness.bcmills commentedon May 27, 2021
For what it's worth, I just made a lot of those tests a lot less picky about staleness for #46347; see CL 322629.
rsc commentedon May 27, 2021
Should this issue be closed?