Commit 0c942e8

runtime: avoid incorrect panic when a signal arrives during STW
Stop-the-world and freeze-the-world (used for unhandled panics) are currently not safe to do at the same time. While a regular unhandled panic can't happen concurrently with STW (if the P hasn't been stopped, then the panic blocks the STW), a panic from a _SigThrow signal can happen on an already-stopped P, racing with STW. When this happens, freezetheworld sets sched.stopwait to 0x7fffffff and stopTheWorldWithSema panics because sched.stopwait != 0.

Fix this by detecting when freeze-the-world happens before stop-the-world has completely stopped the world and freeze the STW operation rather than panicking.

Fixes #17442.

Change-Id: I646a7341221dd6d33ea21d818c2f7218e2cb7e20
Reviewed-on: https://go-review.googlesource.com/34611
Run-TryBot: Austin Clements <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Russ Cox <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
1 parent 860c9c0 commit 0c942e8

1 file changed: +26 -6 lines changed

src/runtime/proc.go

@@ -632,10 +632,15 @@ func helpgc(nproc int32) {
 // sched.stopwait to in order to request that all Gs permanently stop.
 const freezeStopWait = 0x7fffffff
 
+// freezing is set to non-zero if the runtime is trying to freeze the
+// world.
+var freezing uint32
+
 // Similar to stopTheWorld but best-effort and can be called several times.
 // There is no reverse operation, used during crashing.
 // This function must not lock any mutexes.
 func freezetheworld() {
+	atomic.Store(&freezing, 1)
 	// stopwait and preemption requests can be lost
 	// due to races with concurrently executing threads,
 	// so try several times
@@ -1018,15 +1023,30 @@ func stopTheWorldWithSema() {
 			preemptall()
 		}
 	}
+
+	// sanity checks
+	bad := ""
 	if sched.stopwait != 0 {
-		throw("stopTheWorld: not stopped")
-	}
-	for i := 0; i < int(gomaxprocs); i++ {
-		p := allp[i]
-		if p.status != _Pgcstop {
-			throw("stopTheWorld: not stopped")
+		bad = "stopTheWorld: not stopped (stopwait != 0)"
+	} else {
+		for i := 0; i < int(gomaxprocs); i++ {
+			p := allp[i]
+			if p.status != _Pgcstop {
+				bad = "stopTheWorld: not stopped (status != _Pgcstop)"
+			}
 		}
 	}
+	if atomic.Load(&freezing) != 0 {
+		// Some other thread is panicking. This can cause the
+		// sanity checks above to fail if the panic happens in
+		// the signal handler on a stopped thread. Either way,
+		// we should halt this thread.
+		lock(&deadlock)
+		lock(&deadlock)
+	}
+	if bad != "" {
+		throw(bad)
+	}
 }
 
 func mhelpgc() {
