Closed
Description
The scaleway builders are all saying:
root@scw-e8738f:~# journalctl -f -u rundockerbuildlet.service
...
Nov 15 20:41:28 scw-e8738f rundockerbuildlet[4707]: 2017/11/15 20:41:28 Creating scaleway-prod-05 ...
Nov 15 20:41:28 scw-e8738f rundockerbuildlet[4707]: 2017/11/15 20:41:28 Error creating scaleway-prod-05: exit status 125, docker: Error response from daemon: Conflict. The name "/scaleway-prod-05" is already in use by container e1a1f4e89f3a073402f2ae513753db661c41a4727e808215758364c313850357. You have to remove (or rename) that container to be able to reuse that name..
Nov 15 20:41:28 scw-e8738f rundockerbuildlet[4707]: See 'docker run --help'.
Nov 15 20:41:29 scw-e8738f rundockerbuildlet[4707]: 2017/11/15 20:41:29 Creating scaleway-prod-05 ...
Nov 15 20:41:29 scw-e8738f rundockerbuildlet[4707]: 2017/11/15 20:41:29 Error creating scaleway-prod-05: exit status 125, docker: Error response from daemon: Conflict. The name "/scaleway-prod-05" is already in use by container e1a1f4e89f3a073402f2ae513753db661c41a4727e808215758364c313850357. You have to remove (or rename) that container to be able to reuse that name..
Nov 15 20:41:29 scw-e8738f rundockerbuildlet[4707]: See 'docker run --help'.
Activity
gopherbot commented on Nov 15, 2017
Change https://golang.org/cl/78032 mentions this issue:
dashboard: remove linux-arm as a trybot
bradfitz commented on Nov 16, 2017
I nuked all 50 instances and recreated them, and now I see them just deleting & restarting the containers:
Why? Maybe because #22749 is making the containers start up too slowly?
bradfitz commented on Nov 16, 2017
Hm... archive/tar changes?
The docker logs are saying:
Note the 'tar file contained invalid name ""'.
I probably built the linux-arm binary with Go tip when I addressed #21839, and now the new tip-built buildlet can't parse the coordinator's Go 1.8 tar files.
/cc @dsnet
dsnet commented on Nov 16, 2017
What's the code that generates the tar file (or reads it)? The "0 files, 0 dirs" is also interesting.
bradfitz commented on Nov 16, 2017
The "0 files, 0 dirs" just means it's failing on the first entry and the counters for number of files and number of dirs haven't been incremented yet.
The tar.gz comes from x/build/cmd/coordinator which gets it either from gitmirror (which just does exec.CommandContext(ctx, "git", "archive", "--format=tgz", rev)) or it gets it directly from Gerrit's equivalent export-a-tarball handler.
There's only one use of tar.NewWriter that we're using in the coordinator (to generate a VERSION file), but that push works fine. You can see it in the log above (the first one, extracted tarball into /workdir/go: 1 files, 1 dirs (1.338425ms)).
bradfitz commented on Nov 16, 2017
Btw, I re-pushed a new buildlet binary built at Go 1.9 and they're all working again. So it does seem to be something at tip.
dsnet commented on Nov 16, 2017
Is there any way to get access to the tar.gz file being passed in?
bradfitz commented on Nov 16, 2017
With some work I suppose.
dsnet commented on Nov 16, 2017
I have a theory.
I added support for parsing the PAX records for an esoteric feature of tar, which are global PAX records, denoted by a typeflag of TypeXGlobalHeader. The relevant new code may return a header without a filename (since they don't make sense for global headers).
Assuming this is the issue, you probably want to either push the validRelPath check to be within each case of the switch statement or check the Header.Typeflag up front.
dsnet commented on Nov 16, 2017
Confirmed. I ran
git archive --format=tgz ee6321b5405504f1d090d01a4703ec9ff6b218ea | gzip -d | head -c 512 | hexdump -C
and got:
You can see a typeflag of 'g' at offset 0x9c.
I'm rather surprised that git outputs archives using this esoteric feature.
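The same check can be reproduced without git. The following is a minimal sketch, assuming Go 1.10 or later (where archive/tar can compose TypeXGlobalHeader entries). It builds a small in-memory archive that starts with a global PAX header and reads the typeflag byte at offset 0x9c of the first 512-byte block. The helper names and the "comment" record value are made up for illustration; they are not taken from git or the buildlet.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
)

// typeflagOffset is the position of the one-byte typeflag field inside a
// 512-byte tar header block (hex 0x9c, decimal 156).
const typeflagOffset = 0x9c

// firstTypeflag returns the typeflag byte of the first header block of an
// uncompressed tar stream, or 0 if the stream is shorter than one block.
func firstTypeflag(raw []byte) byte {
	if len(raw) < 512 {
		return 0
	}
	return raw[typeflagOffset]
}

// archiveWithGlobalHeader (hypothetical helper) writes a global PAX header
// followed by one regular file, roughly the shape described in the thread.
func archiveWithGlobalHeader() ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	// Only Typeflag and PAXRecords are set; the record value is a placeholder.
	if err := tw.WriteHeader(&tar.Header{
		Typeflag:   tar.TypeXGlobalHeader,
		PAXRecords: map[string]string{"comment": "hypothetical-commit-id"},
	}); err != nil {
		return nil, err
	}
	body := []byte("placeholder version\n")
	if err := tw.WriteHeader(&tar.Header{
		Typeflag: tar.TypeReg,
		Name:     "VERSION",
		Mode:     0644,
		Size:     int64(len(body)),
	}); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	raw, err := archiveWithGlobalHeader()
	if err != nil {
		panic(err)
	}
	fmt.Printf("typeflag at offset %#x: %q\n", typeflagOffset, firstTypeflag(raw))
}
```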
bradfitz commented on Nov 16, 2017
Do you think the Go tar reader should expose these records by default, or do you think they should be opt-in?
dsnet commented on Nov 16, 2017
They were actually always returned in Go1.9 and earlier, but as a bogus file containing the raw contents of the PAX headers.
The buildlet logic actually happened to work by chance. There aren't any hard restrictions on what filename to use for global headers, so the one used by git just happened to work for validRelPath. So if you looked at the filesystem of the builders, you would have found a bogus file called "pax_global_header".
I was actually surprised that the buildlet worked at all in Go1.9 and earlier. The surprising behavior is that (&tar.Header{Typeflag: tar.TypeXGlobalHeader}).FileInfo().Mode().IsRegular() reports true, when I would have expected false. Had this been false, then it would hit the default case in the logic, making the problem obvious.
Unfortunately, I don't know how to fix tar.Header.FileInfo.Mode to report false here. The problem is that os.FileMode.IsRegular has no way to indicate that something is not a regular file without saying that it is something else, since it only checks that no file-type mode bits are set.
gopherbot commented on Nov 16, 2017
Change https://golang.org/cl/78355 mentions this issue:
archive/tar: use placeholder name for global PAX records
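The IsRegular behavior dsnet describes can be checked with a few lines. This is a minimal sketch under the assumption that the Mode logic in archive/tar still has no case for the global header typeflag. isFileEntry is a hypothetical stricter check, not part of the buildlet or the standard library.

```go
package main

import (
	"archive/tar"
	"fmt"
)

// globalHeaderLooksRegular reports what IsRegular says about a header whose
// only populated field is the global PAX typeflag. No file-type bits are set
// in the resulting mode, so the entry is indistinguishable from a plain file.
func globalHeaderLooksRegular() bool {
	hdr := &tar.Header{Typeflag: tar.TypeXGlobalHeader}
	return hdr.FileInfo().Mode().IsRegular()
}

// isFileEntry is a hypothetical alternative that inspects the typeflag
// directly instead of relying on the file mode. '\x00' is the legacy
// encoding of a regular file used by some older archives.
func isFileEntry(typeflag byte) bool {
	return typeflag == tar.TypeReg || typeflag == '\x00'
}

func main() {
	fmt.Println("IsRegular for a global header:", globalHeaderLooksRegular())
	fmt.Println("isFileEntry for a global header:", isFileEntry(tar.TypeXGlobalHeader))
}
```

Checking the typeflag sidesteps the limitation that os.FileMode cannot express "not a regular file" without naming some other file type.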
gopherbot commented on Nov 18, 2017
Change https://golang.org/cl/78546 mentions this issue:
cmd/buildlet: ignore pax global header when untarring
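For readers who want the shape of the "check the Header.Typeflag up front" approach, here is a rough sketch, assuming Go 1.10 or later. It is not the code from either CL. validRelPathSketch is a hypothetical stand-in for the buildlet's validRelPath, and listEntries only collects names instead of writing files.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// validRelPathSketch is a hypothetical stand-in for the buildlet's
// validRelPath: it rejects empty, absolute, and parent-escaping names.
func validRelPathSketch(p string) bool {
	if p == "" || strings.HasPrefix(p, "/") || strings.Contains(p, `\`) || strings.Contains(p, "../") {
		return false
	}
	return true
}

// listEntries walks a tar stream and returns the entry names. Global PAX
// headers are skipped before the name check, since they may carry no
// meaningful filename.
func listEntries(r io.Reader) ([]string, error) {
	var names []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag == tar.TypeXGlobalHeader {
			continue // records only, no file to extract
		}
		if !validRelPathSketch(hdr.Name) {
			return nil, fmt.Errorf("tar file contained invalid name %q", hdr.Name)
		}
		names = append(names, hdr.Name)
	}
}

// listSampleEntries (hypothetical helper) builds an archive that begins with
// a global PAX header, then lists it with listEntries.
func listSampleEntries() ([]string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	if err := tw.WriteHeader(&tar.Header{
		Typeflag:   tar.TypeXGlobalHeader,
		PAXRecords: map[string]string{"comment": "hypothetical-commit-id"},
	}); err != nil {
		return nil, err
	}
	body := []byte("placeholder version\n")
	if err := tw.WriteHeader(&tar.Header{Typeflag: tar.TypeReg, Name: "VERSION", Mode: 0644, Size: int64(len(body))}); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return listEntries(bytes.NewReader(buf.Bytes()))
}

func main() {
	names, err := listSampleEntries()
	fmt.Println(names, err)
}
```

Skipping by typeflag keeps the name validation strict for real entries while tolerating archives such as those produced by git archive.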