building on illumos gets stuck occasionally #35261

Closed
jclulow opened this issue Oct 31, 2019 · 7 comments
Comments

@jclulow
Contributor

jclulow commented Oct 31, 2019

As I've been trying to look into #35085, I've noticed the builder getting stuck a couple of times. The most recent time, I dug in to see where.

The last build action:

2019/10/31 01:59:08 [0xc0000b8580] Running /var/tmp/workdir-host-illumos-amd64-jclulow/go/src/make.bash with args ["/var/tmp/workdir-host-illumos-amd64-jclulow/go/src/make.bash"] and env ["GOMAXPROCS=4" "TMPDIR=/var/tmp" "USER=gobuild" "GOROOT_BOOTSTRAP=/var/tmp/workdir-host-illumos-amd64-jclulow/go1.4" "SMF_FMRI=svc:/site/buildlet:default" "A__z=\"*SHLVL" "PATH=/usr/bin:/usr/sbin:/sbin:/opt/local/bin:/opt/local/sbin:/opt/go/bootstrap/bin" "GO_BUILDER_ENV=host-illumos-amd64-jclulow" "SMF_RESTARTER=svc:/system/svc/restarter:default" "PWD=/home/gobuild" "LANG=en_US.UTF-8" "TZ=UTC" "SMF_ZONENAME=09a56fd3-94eb-ee7b-d03c-9bfd8c9b7619" "SHLVL=2" "HOME=/home/gobuild" "LOGNAME=gobuild" "SMF_METHOD=start" "_=/usr/bin/ctrun" "GO_STAGE0_NET_DELAY=400ms" "GO_STAGE0_DL_DELAY=200ms" "WORKDIR=/var/tmp/workdir-host-illumos-amd64-jclulow" "GO_BUILDER_NAME=illumos-amd64" "GOBIN=" "TMPDIR=/var/tmp/workdir-host-illumos-amd64-jclulow/tmp" "GOCACHE=/var/tmp/workdir-host-illumos-amd64-jclulow/gocache" "GOROOT_BOOTSTRAP=/opt/go/bootstrap"] in dir /var/tmp/workdir-host-illumos-amd64-jclulow/go/src

This has been running for some time:

[root@gobuild3 ~]# date -uR
Thu, 31 Oct 2019 02:50:26 +0000

The process tree:

      77885 /opt/go/build/bin/stage0
        77886 ./buildlet.exe --halt=false --reverse-type=host-illumos-amd64-jclulow --coo
          77887 bash /var/tmp/workdir-host-illumos-amd64-jclulow/go/src/make.bash
            78001 ./cmd/dist/dist bootstrap -a
              78401 /var/tmp/workdir-host-illumos-amd64-jclulow/go/pkg/tool/illumos_amd64
[root@gobuild3 ~]# pargs 78401
78401:	/var/tmp/workdir-host-illumos-amd64-jclulow/go/pkg/tool/illumos_amd64/go_bootst
argv[0]: /var/tmp/workdir-host-illumos-amd64-jclulow/go/pkg/tool/illumos_amd64/go_bootstrap
argv[1]: install
argv[2]: -gcflags=all=
argv[3]: -ldflags=all=
argv[4]: -i
argv[5]: cmd/asm
argv[6]: cmd/cgo
argv[7]: cmd/compile
argv[8]: cmd/link

There's an OS thread possibly stuck in the allocator?

--------------------- thread# 1 / lwp# 1 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affbb0) + 13
 fffffc7fef275ef8 sem_wait (affbb0) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 0000000000438881 runtime.exitsyscall0 () + 111
 000000000045b5e4 runtime.mcall () + 64
 0000000000416c05 runtime.(*mcache).refill () + 85
 000000000040b7e7 runtime.(*mcache).nextFree () + 87
 000000000040c123 runtime.mallocgc () + 793
 0000000000444fdc runtime.makeslice () + 6c
 00000000004eacc3 bytes.makeSlice () + 73
 00000000004ea60b bytes.(*Buffer).grow () + 15b
 00000000004eaab8 bytes.(*Buffer).ReadFrom () + 48
 000000000049a87c io.copyBuffer () + 2fc
 00000000004fa653 os/exec.(*Cmd).writerDescriptor.func1 () + 63
 00000000004fa6d7 os/exec.(*Cmd).Start.func1 () + 27
 000000000045d5e1 runtime.goexit () + 1
--------------------- thread# 2 / lwp# 2 ---------------------
 fffffc7fef2905f7 lwp_park (0, fffffc7fee7ffdc0, 0)
 fffffc7fef2828b8 sema_reltimedwait (affdb0, c000038358) + 28
 fffffc7fef275fdb sem_reltimedwait_np (affdb0, c000038358) + 4b
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a84d runtime.notetsleep_internal () + 10d
 000000000040aa08 runtime.notetsleep () + 58
 000000000043b93f runtime.sysmon () + 3bf
 00000000004338d3 runtime.mstart1 () + c3
 00000000004337f6 runtime.mstart () + 66
 000000000045efa2 runtime.tstart_sysvicall () + 42
 fffffc7fef2905b0 _lwp_start ()
--------------------- thread# 3 / lwp# 3 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affc30) + 13
 fffffc7fef275ef8 sem_wait (affc30) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 000000000041b7ff runtime.gcBgMarkWorker () + ff
 000000000045d5e1 runtime.goexit () + 1
--------------------- thread# 4 / lwp# 4 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affbf0) + 13
 fffffc7fef275ef8 sem_wait (affbf0) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 000000000041b7ff runtime.gcBgMarkWorker () + ff
 000000000045d5e1 runtime.goexit () + 1
--------------------- thread# 5 / lwp# 5 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affc70) + 13
 fffffc7fef275ef8 sem_wait (affc70) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
--------------------- thread# 6 / lwp# 6 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affcb0) + 13
 fffffc7fef275ef8 sem_wait (affcb0) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 0000000000421753 runtime.bgscavenge () + 3b3
 000000000045d5e1 runtime.goexit () + 1
--------------------- thread# 7 / lwp# 7 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affcf0) + 13
 fffffc7fef275ef8 sem_wait (affcf0) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 0000000000421e81 runtime.bgsweep () + 131
 000000000045d5e1 runtime.goexit () + 1
--------------------- thread# 8 / lwp# 8 ---------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affd30) + 13
 fffffc7fef275ef8 sem_wait (affd30) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 0000000000438881 runtime.exitsyscall0 () + 111
 000000000045b5e4 runtime.mcall () + 64
 0000000000000000 ???????? ()
 0000c000142e5700 ???????? ()
--------------------- thread# 9 / lwp# 9 ---------------------
 fffffc7fef296a6a portfs   (6, 4, fffffc7fea1ff200, 80, 1, 0)
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000042ad55 runtime.netpoll () + c5
 0000000000435f8b runtime.findrunnable () + 72b
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 0000000000421753 runtime.bgscavenge () + 3b3
 000000000045d5e1 runtime.goexit () + 1
-------------------- thread# 10 / lwp# 10 --------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affdf0) + 13
 fffffc7fef275ef8 sem_wait (affdf0) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
 000000000042136d runtime.scavengeSleep () + ed
 0000000000421753 runtime.bgscavenge () + 3b3
 000000000045d5e1 runtime.goexit () + 1
-------------------- thread# 11 / lwp# 11 --------------------
 fffffc7fef2905f7 lwp_park (0, 0, 0)
 fffffc7fef282883 sema_wait (affe30) + 13
 fffffc7fef275ef8 sem_wait (affe30) + 38
 000000000045ef1a runtime.asmsysvicall6 () + 5a
 000000000040a6e0 runtime.notesleep () + e0
 0000000000434ca0 runtime.stopm () + c0
 000000000043626d runtime.findrunnable () + a0d
 0000000000436da5 runtime.schedule () + 2f5
 00000000004371ed runtime.park_m () + 9d
 000000000045b5e4 runtime.mcall () + 64
@jclulow
Contributor Author

jclulow commented Oct 31, 2019

I believe the build in progress is:

  builder: illumos-amd64
      rev: 6becb033341602f2df9d7c55cc23e64b925bbee2
 buildlet: http://gobuild3 reverse peer gobuild3/107.3.176.60:62245 for host type host-illumos-amd64-jclulow
  started: 2019-10-30 21:44:44.273031912 +0000 UTC m=+25661.982251605
   status: still running

@jclulow
Contributor Author

jclulow commented Oct 31, 2019

It seems like this might have been caused by the integration of 6becb03. If I try to build that revision, we seem to hit a reproducible hang at the "Building Go toolchain2 using go_bootstrap and Go toolchain1." build step. If I try the parent commit, it seems to work every time.

@ianlancetaylor Sorry to bug you, but do you have any thoughts?

@ianlancetaylor
Contributor

When this happens again, can you send a SIGQUIT to the process and attach the Go stack dump? That should show the arguments to the functions, which will help. Thanks.
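
For readers unfamiliar with the dump being requested: by default a Go program that receives SIGQUIT prints the stacks of its goroutines and exits. A broadly similar per-goroutine text can be collected from inside a running program with `runtime.Stack`. The sketch below is only an illustration; the helper name `allGoroutineStacks` is made up for this example, and its output lacks the register and thread details that a real SIGQUIT dump includes.

```go
package main

import (
	"fmt"
	"runtime"
)

// allGoroutineStacks is a hypothetical helper (not part of the Go tree).
// It returns a text dump of every goroutine's stack, growing the buffer
// until the whole dump fits.
func allGoroutineStacks() string {
	buf := make([]byte, 64<<10)
	for {
		// The second argument asks for all goroutines, not just the caller.
		n := runtime.Stack(buf, true)
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	fmt.Print(allGoroutineStacks())
}
```

Unlike SIGQUIT, this does not terminate the process, so it only helps when the program can still run code; for a hung build like this one, the signal is the practical route.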

@jclulow
Contributor Author

jclulow commented Nov 1, 2019

I waited until the thing was stuck for at least a minute with no child processes and then I dropped SIGQUIT on it:

$ GOROOT_BOOTSTRAP=/opt/go/1.13.1 ./make.bash
Building Go cmd/dist using /opt/go/1.13.1.
Building Go toolchain1 using /opt/go/1.13.1.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
SIGQUIT: quit (ASCII FS)
PC=0xfffffd7feedc3cb7 m=0 sigcode=0

goroutine 0 [idle]:
runtime: unknown pc 0xfffffd7feedc3cb7
stack: frame={sp:0xfffffd7fffdfe738, fp:0x0} stack=[0xfffffd7fffdee9a8,0xfffffd7fffdfe940)
fffffd7fffdfe638:  0000000000000004  fffffd7fffdfe700 
fffffd7fffdfe648:  000000000045f0aa <runtime.asmsysvicall6+90>  fffffd7fffdfe6c8 
fffffd7fffdfe658:  fffffd7fef1d2a40  fffffd7fffdfe6a0 
fffffd7fffdfe668:  fffffd7feedb917f  0000000000179000 
fffffd7fffdfe678:  00000000019222e0  00000000019222e0 
fffffd7fffdfe688:  fffffd7fee704f00  0000000000000000 
fffffd7fffdfe698:  fffffd7fef1d2a40  fffffd7fffdfe6e0 
fffffd7fffdfe6a8:  fffffd7feedb917f  fffffd7fffdfe6f0 
fffffd7fffdfe6b8:  0000000001922020  fffffd7fee702300 
fffffd7fffdfe6c8:  fffffd7fef1d2a40  fffffd7fffdfe6f0 
fffffd7fffdfe6d8:  fffffd7fef1d2a40  fffffd7fffdfe710 
fffffd7fffdfe6e8:  fffffd7feedb923a  fffffd7fffdfe750 
fffffd7fffdfe6f8:  fffffd7fef1d2a40  fffffd7fffdfe740 
fffffd7fffdfe708:  fffffd7fee702300  fffffd7fffdfe730 
fffffd7fffdfe718:  fffffd7feedb93be  0000000000000001 
fffffd7fffdfe728:  fffffd7fee702300  fffffd7fffdfe7a0 
fffffd7fffdfe738: <fffffd7feedb61d8  0000000000000000 
fffffd7fffdfe748:  0000000000000000  fffffd7fffdfe780 
fffffd7fffdfe758:  fffffd7feedaa1d2  000000000000001e 
fffffd7fffdfe768:  0000000001922020  00000000000000f2 
fffffd7fffdfe778:  0000000000000000  000000000089af4e 
fffffd7fffdfe788:  0000000000000000  0000000000000000 
fffffd7fffdfe798:  0000000001922020  fffffd7fffdfe7c0 
fffffd7fffdfe7a8:  fffffd7feedb64c0  0000000000000000 
fffffd7fffdfe7b8:  0000000001922020  fffffd7fffdfe7f0 
fffffd7fffdfe7c8:  fffffd7feedaa222  0000000000000000 
fffffd7fffdfe7d8:  0000000000ae1080  000000000042c7f9 <runtime.sysvicall1+169> 
fffffd7fffdfe7e8:  0000000001922020  fffffd7fffdfe878 
fffffd7fffdfe7f8:  000000000045f0aa <runtime.asmsysvicall6+90>  0000000000ae1340 
fffffd7fffdfe808:  000000000045d03f <runtime.asmcgocall+191>  0000000000000000 
fffffd7fffdfe818:  0000000000000000  ffffff0e3bd85c20 
fffffd7fffdfe828:  0000000000000000  0000000000000003 
runtime: unknown pc 0xfffffd7feedc3cb7
stack: frame={sp:0xfffffd7fffdfe738, fp:0x0} stack=[0xfffffd7fffdee9a8,0xfffffd7fffdfe940)
fffffd7fffdfe638:  0000000000000004  fffffd7fffdfe700 
fffffd7fffdfe648:  000000000045f0aa <runtime.asmsysvicall6+90>  fffffd7fffdfe6c8 
fffffd7fffdfe658:  fffffd7fef1d2a40  fffffd7fffdfe6a0 
fffffd7fffdfe668:  fffffd7feedb917f  0000000000179000 
fffffd7fffdfe678:  00000000019222e0  00000000019222e0 
fffffd7fffdfe688:  fffffd7fee704f00  0000000000000000 
fffffd7fffdfe698:  fffffd7fef1d2a40  fffffd7fffdfe6e0 
fffffd7fffdfe6a8:  fffffd7feedb917f  fffffd7fffdfe6f0 
fffffd7fffdfe6b8:  0000000001922020  fffffd7fee702300 
fffffd7fffdfe6c8:  fffffd7fef1d2a40  fffffd7fffdfe6f0 
fffffd7fffdfe6d8:  fffffd7fef1d2a40  fffffd7fffdfe710 
fffffd7fffdfe6e8:  fffffd7feedb923a  fffffd7fffdfe750 
fffffd7fffdfe6f8:  fffffd7fef1d2a40  fffffd7fffdfe740 
fffffd7fffdfe708:  fffffd7fee702300  fffffd7fffdfe730 
fffffd7fffdfe718:  fffffd7feedb93be  0000000000000001 
fffffd7fffdfe728:  fffffd7fee702300  fffffd7fffdfe7a0 
fffffd7fffdfe738: <fffffd7feedb61d8  0000000000000000 
fffffd7fffdfe748:  0000000000000000  fffffd7fffdfe780 
fffffd7fffdfe758:  fffffd7feedaa1d2  000000000000001e 
fffffd7fffdfe768:  0000000001922020  00000000000000f2 
fffffd7fffdfe778:  0000000000000000  000000000089af4e 
fffffd7fffdfe788:  0000000000000000  0000000000000000 
fffffd7fffdfe798:  0000000001922020  fffffd7fffdfe7c0 
fffffd7fffdfe7a8:  fffffd7feedb64c0  0000000000000000 
fffffd7fffdfe7b8:  0000000001922020  fffffd7fffdfe7f0 
fffffd7fffdfe7c8:  fffffd7feedaa222  0000000000000000 
fffffd7fffdfe7d8:  0000000000ae1080  000000000042c7f9 <runtime.sysvicall1+169> 
fffffd7fffdfe7e8:  0000000001922020  fffffd7fffdfe878 
fffffd7fffdfe7f8:  000000000045f0aa <runtime.asmsysvicall6+90>  0000000000ae1340 
fffffd7fffdfe808:  000000000045d03f <runtime.asmcgocall+191>  0000000000000000 
fffffd7fffdfe818:  0000000000000000  ffffff0e3bd85c20 
fffffd7fffdfe828:  0000000000000000  0000000000000003 

goroutine 1 [semacquire, 1 minutes]:
sync.runtime_Semacquire(0xc000683848)
        /ws/go/master/src/runtime/sema.go:56 +0x42
sync.(*WaitGroup).Wait(0xc000683840)
        /ws/go/master/src/sync/waitgroup.go:130 +0x64
cmd/go/internal/work.(*Builder).Do(0xc0001d8b40, 0xc0001492c0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:187 +0x3ae
cmd/go/internal/work.InstallPackages(0xc000096050, 0x4, 0x4, 0xc000142500, 0x4, 0x4)
        /ws/go/master/src/cmd/go/internal/work/build.go:605 +0xb2d
cmd/go/internal/work.runInstall(0xad7f80, 0xc000096050, 0x4, 0x4)
        /ws/go/master/src/cmd/go/internal/work/build.go:516 +0x66
main.main()
        /ws/go/master/src/cmd/go/main.go:189 +0x569

goroutine 877 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef18a928, 0x72, 0xc0002d6200)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc000638378, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000638360, 0xc0002d6200, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc00027c4b0, 0xc0002d6200, 0x200, 0x200, 0xc0005f2e90, 0x440e52, 0xc0005f2ea0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc00058c180, 0x8a5be0, 0xc00027c4b0, 0xfffffd7fef0777e8, 0xc00058c180, 0xc0005f2f01)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc00058c180, 0x8a5be0, 0xc00027c4b0, 0x0, 0x0, 0x0, 0x4049e5, 0xc0005783c0, 0xc0005f2fb0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0xc0005783c0, 0xc0005f2fb0)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc00042c000, 0xc00057c380)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 993 [select]:
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:168 +0xed
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 831 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef189b28, 0x72, 0xc0003a8c00)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc0003e5158, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc0003e5140, 0xc0003a8c00, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc0002f4638, 0xc0003a8c00, 0x200, 0x200, 0x0, 0x0, 0xc0006736a0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc0004fc540, 0x8a5be0, 0xc0002f4638, 0xfffffd7fef0777e8, 0xc0004fc540, 0x1)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc0004fc540, 0x8a5be0, 0xc0002f4638, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc00042c2c0, 0xc000442680)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 944 [chan receive, 1 minutes]:
os/exec.(*Cmd).Wait(0xc0006c42c0, 0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:509 +0x125
os/exec.(*Cmd).Run(0xc0006c42c0, 0xc00049a510, 0x37)
        /ws/go/master/src/os/exec/exec.go:341 +0x5c
cmd/go/internal/work.(*Builder).toolID(0xc0001d8b40, 0x7fc8e2, 0x7, 0xb, 0xc0005133d0)
        /ws/go/master/src/cmd/go/internal/work/buildid.go:192 +0x44d
cmd/go/internal/work.(*Builder).buildActionID(0xc0001d8b40, 0xc0001f77c0, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:243 +0xd54
cmd/go/internal/work.(*Builder).build(0xc0001d8b40, 0xc0001f77c0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:398 +0x50f0
cmd/go/internal/work.(*Builder).Do.func2(0xc0001f77c0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:118 +0x358
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:178 +0x76
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 941 [select]:
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:168 +0xed
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 830 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef184908, 0x72, 0xc0003a8a00)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc000447158, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000447140, 0xc0003a8a00, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc000150688, 0xc0003a8a00, 0x200, 0x200, 0x0, 0x0, 0xc000672ea0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc000586f30, 0x8a5be0, 0xc000150688, 0xfffffd7fef0777e8, 0xc000586f30, 0x1)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc000586f30, 0x8a5be0, 0xc000150688, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc0001a46e0, 0xc0000b0ea0)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 942 [chan receive]:
os/exec.(*Cmd).Wait(0xc00042c2c0, 0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:509 +0x125
os/exec.(*Cmd).Run(0xc00042c2c0, 0x115ac288, 0xadfe60)
        /ws/go/master/src/os/exec/exec.go:341 +0x5c
cmd/go/internal/work.(*Builder).runOut(0xc0001d8b40, 0xc0001f7cc0, 0xc0006bd470, 0x27, 0x0, 0x0, 0x0, 0xc0003d8280, 0xf, 0x14, ...)
        /ws/go/master/src/cmd/go/internal/work/exec.go:1919 +0x5bc
cmd/go/internal/work.gcToolchain.gc(0xc0001d8b40, 0xc0001f7cc0, 0xc000575530, 0x23, 0xc0000f20e0, 0x68, 0xd8, 0x0, 0x0, 0xc000503500, ...)
        /ws/go/master/src/cmd/go/internal/work/gc.go:142 +0xe03
cmd/go/internal/work.(*Builder).build(0xc0001d8b40, 0xc0001f7cc0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:674 +0x1715
cmd/go/internal/work.(*Builder).Do.func2(0xc0001f7cc0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:118 +0x358
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:178 +0x76
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 730 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef184708, 0x72, 0xc0002d6000)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc000080258, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000080240, 0xc0002d6000, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc0000840b0, 0xc0002d6000, 0x200, 0x200, 0x687940, 0x6b5768, 0xc0001ff6a0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc00049a4b0, 0x8a5be0, 0xc0000840b0, 0xfffffd7fef0777e8, 0xc00049a4b0, 0x7b7701)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc00049a4b0, 0x8a5be0, 0xc0000840b0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0xc000683840)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc0006c42c0, 0xc00052f080)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 1013 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef184308, 0x72, 0xc0002ba400)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc000396858, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000396840, 0xc0002ba400, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc00025d730, 0xc0002ba400, 0x200, 0x200, 0x0, 0x0, 0xc0005f36a0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc00049a540, 0x8a5be0, 0xc00025d730, 0xfffffd7fef0777e8, 0xc00049a540, 0x1)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc00049a540, 0x8a5be0, 0xc00025d730, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc0006c4580, 0xc00043a280)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 943 [chan receive]:
os/exec.(*Cmd).Wait(0xc0001a46e0, 0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:509 +0x125
os/exec.(*Cmd).Run(0xc0001a46e0, 0x11560ed3, 0xadfe60)
        /ws/go/master/src/os/exec/exec.go:341 +0x5c
cmd/go/internal/work.(*Builder).runOut(0xc0001d8b40, 0xc00013fcc0, 0xc0004e1230, 0x24, 0x0, 0x0, 0x0, 0xc000148f00, 0xf, 0x14, ...)
        /ws/go/master/src/cmd/go/internal/work/exec.go:1919 +0x5bc
cmd/go/internal/work.gcToolchain.gc(0xc0001d8b40, 0xc00013fcc0, 0xc00014a540, 0x23, 0xc0002566c0, 0x10, 0x40, 0x0, 0x0, 0xc000517500, ...)
        /ws/go/master/src/cmd/go/internal/work/gc.go:142 +0xe03
cmd/go/internal/work.(*Builder).build(0xc0001d8b40, 0xc00013fcc0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:674 +0x1715
cmd/go/internal/work.(*Builder).Do.func2(0xc00013fcc0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:118 +0x358
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:178 +0x76
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 878 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0xfffffd7fef189228, 0x72, 0xc0002d6400)
        /ws/go/master/src/runtime/netpoll.go:196 +0x6d
internal/poll.(*pollDesc).wait(0xc000522738, 0x72, 0x201, 0x200, 0xffffffffffffffff)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /ws/go/master/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000522720, 0xc0002d6400, 0x200, 0x200, 0x0, 0x0, 0x0)
        /ws/go/master/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
        /ws/go/master/src/os/file_unix.go:263
os.(*File).Read(0xc0000101a0, 0xc0002d6400, 0x200, 0x200, 0x0, 0x0, 0xc000676ea0)
        /ws/go/master/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc00058c1b0, 0x8a5be0, 0xc0000101a0, 0xfffffd7fef0777e8, 0xc00058c1b0, 0x1)
        /ws/go/master/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x8a5480, 0xc00058c1b0, 0x8a5be0, 0xc0000101a0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/io/io.go:391 +0x2fc
io.Copy(...)
        /ws/go/master/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc00042c000, 0xc0005ee4c0)
        /ws/go/master/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
        /ws/go/master/src/os/exec/exec.go:434 +0x608

goroutine 994 [chan receive, 1 minutes]:
os/exec.(*Cmd).Wait(0xc00042c000, 0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:509 +0x125
os/exec.(*Cmd).Run(0xc00042c000, 0xc00058c1b0, 0x37)
        /ws/go/master/src/os/exec/exec.go:341 +0x5c
cmd/go/internal/work.(*Builder).toolID(0xc0001d8b40, 0x7fb091, 0x3, 0x11, 0xc000541420)
        /ws/go/master/src/cmd/go/internal/work/buildid.go:192 +0x44d
cmd/go/internal/work.(*Builder).buildActionID(0xc0001d8b40, 0xc000440b40, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:245 +0x1643
cmd/go/internal/work.(*Builder).build(0xc0001d8b40, 0xc000440b40, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:398 +0x50f0
cmd/go/internal/work.(*Builder).Do.func2(0xc000440b40)
        /ws/go/master/src/cmd/go/internal/work/exec.go:118 +0x358
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:178 +0x76
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 995 [select]:
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:168 +0xed
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

goroutine 996 [chan receive, 1 minutes]:
os/exec.(*Cmd).Wait(0xc0006c4580, 0x0, 0x0)
        /ws/go/master/src/os/exec/exec.go:509 +0x125
os/exec.(*Cmd).Run(0xc0006c4580, 0xc00049a5d0, 0x37)
        /ws/go/master/src/os/exec/exec.go:341 +0x5c
cmd/go/internal/work.(*Builder).toolID(0xc0001d8b40, 0x7fc8e2, 0x7, 0xb, 0xc0004d73d0)
        /ws/go/master/src/cmd/go/internal/work/buildid.go:192 +0x44d
cmd/go/internal/work.(*Builder).buildActionID(0xc0001d8b40, 0xc0001f7180, 0x0, 0x0, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:243 +0xd54
cmd/go/internal/work.(*Builder).build(0xc0001d8b40, 0xc0001f7180, 0x0, 0x0)
        /ws/go/master/src/cmd/go/internal/work/exec.go:398 +0x50f0
cmd/go/internal/work.(*Builder).Do.func2(0xc0001f7180)
        /ws/go/master/src/cmd/go/internal/work/exec.go:118 +0x358
cmd/go/internal/work.(*Builder).Do.func3(0xc000683840, 0xc0001d8b40, 0xc00052f040)
        /ws/go/master/src/cmd/go/internal/work/exec.go:178 +0x76
created by cmd/go/internal/work.(*Builder).Do
        /ws/go/master/src/cmd/go/internal/work/exec.go:165 +0x38a

rax    0x5b
rbx    0xfffffd7fef1d2a40
rcx    0xfffffd7feec7a000
rdx    0x0
rdi    0x0
rsi    0x0
rbp    0xfffffd7fffdfe7a0
rsp    0xfffffd7fffdfe738
r8     0x0
r9     0xfffffd7fee702340
r10    0xfffffd7feec7a000
r11    0xc0003ea000
r12    0x1922020
r13    0xfffffd7fee702300
r14    0x89af4e
r15    0x0
rip    0xfffffd7feedc3cb7
rflags 0x247
cs     0x53
fs     0x0
gs     0x0
go tool dist: FAILED: /ws/go/master/pkg/tool/illumos_amd64/go_bootstrap install -gcflags=all= -ldflags=all= -i cmd/asm cmd/cgo cmd/compile cmd/link: exit status 2

@jclulow
Contributor Author

jclulow commented Nov 1, 2019

Alright I think I have figured it out:

diff --git a/src/runtime/netpoll_solaris.go b/src/runtime/netpoll_solaris.go
index 26bbe38d86..c05c2a2a7c 100644
--- a/src/runtime/netpoll_solaris.go
+++ b/src/runtime/netpoll_solaris.go
@@ -230,7 +230,13 @@ func netpoll(delay int64) gList {
 retry:
        var n uint32 = 1
        if port_getn(portfd, &events[0], uint32(len(events)), &n, wait) < 0 {
-               if e := errno(); e != _EINTR && e != _ETIME {
+               e := errno()
+               // As per port_getn(3C), an ETIME failure does not preclude the
+               // delivery of some number of events.
+               if e == _ETIME && n > 0 {
+                       goto process
+               }
+               if e != _EINTR && e != _ETIME {
                        print("runtime: port_getn on fd ", portfd, " failed (errno=", e, ")\n")
                        throw("runtime: netpoll failed")
                }
@@ -242,6 +248,7 @@ retry:
                goto retry
        }
 
+process:
        var toRun gList
        for i := 0; i < int(n); i++ {
                ev := &events[i]

I'll submit a CL. Sorry for the bother.

@ianlancetaylor
Contributor

Thanks for figuring this out!

(Probably the real CL should not use a goto statement like that.)
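
One goto-free shape for that check might look like the sketch below. It is not the runtime code and not the submitted CL: `getnFunc`, `pollOnce`, `errEINTR` and `errETIME` are hypothetical stand-ins for `port_getn` and the errno values, and the handling of a timeout with no events is simplified to a plain return. The point it illustrates is the one from the diff above: an ETIME failure with `n > 0` still carries events that need to be processed.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the errno values the runtime checks.
var (
	errEINTR = errors.New("EINTR")
	errETIME = errors.New("ETIME")
)

// getnFunc is a hypothetical stand-in for port_getn: it fills events and
// reports how many entries are valid, along with any error.
type getnFunc func(events []int) (n int, err error)

// pollOnce treats ETIME with n > 0 as a successful retrieval instead of
// discarding the delivered events. Simplification: a timeout with no
// events just returns an empty result.
func pollOnce(getn getnFunc, events []int) ([]int, error) {
	for {
		n, err := getn(events)
		switch {
		case err == nil:
			return events[:n], nil
		case errors.Is(err, errETIME) && n > 0:
			// Timed out, but some events were still delivered.
			return events[:n], nil
		case errors.Is(err, errETIME):
			return nil, nil // timed out with nothing delivered
		case errors.Is(err, errEINTR):
			continue // interrupted; try again
		default:
			return nil, err // unexpected failure
		}
	}
}

func main() {
	calls := 0
	// Fake source: interrupted once, then times out while delivering two events.
	fake := func(events []int) (int, error) {
		calls++
		if calls == 1 {
			return 0, errEINTR
		}
		events[0], events[1] = 7, 9
		return 2, errETIME
	}
	got, err := pollOnce(fake, make([]int, 4))
	fmt.Println(got, err)
}
```

Folding the cases into one switch keeps the "events were delivered" path and the normal success path on the same return, which is what the `process:` label achieved in the diff.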

@gopherbot
Contributor

Change https://golang.org/cl/204801 mentions this issue: runtime: check for events when port_getn fails with ETIME

@golang golang locked and limited conversation to collaborators Nov 1, 2020