runtime: goroutine starvation due to Gosched #13546

Closed
@dvyukov

Description

The following program hangs:

package main

import (
    "runtime"
    "sync/atomic"
)

func main() {
    const P = 4
    runtime.GOMAXPROCS(P)
    x := uint32(0)
    for p := 0; p < P; p++ {
        go func() {
            atomic.AddUint32(&x, 1)
            for atomic.LoadUint32(&x) != P {
            }
        }()
    }
    for atomic.LoadUint32(&x) != P {
        runtime.Gosched()
    }
}

SIGABRT: abort
PC=0x44dae0 m=0

goroutine 21 [running]:
sync/atomic.LoadUint32(0xc8200b6000)
    src/sync/atomic/asm_amd64.s:92 fp=0xc820060798 sp=0xc820060790
main.main.func1(0xc8200b6000)
    /tmp/gosched.go:15 +0x37 fp=0xc8200607b8 sp=0xc820060798
runtime.goexit()
    src/runtime/asm_amd64.s:1998 +0x1 fp=0xc8200607c0 sp=0xc8200607b8
created by main.main
    /tmp/gosched.go:17 +0x78

goroutine 1 [runnable]:
runtime.Gosched()
    src/runtime/proc.go:235 +0x14
main.main()
    /tmp/gosched.go:20 +0xa7

goroutine 18 [runnable]:
main.main.func1(0xc8200b6000)
    /tmp/gosched.go:13
created by main.main
    /tmp/gosched.go:17 +0x78

goroutine 19 [running]:
    goroutine running on other thread; stack unavailable
created by main.main
    /tmp/gosched.go:17 +0x78

goroutine 20 [running]:
    goroutine running on other thread; stack unavailable
created by main.main
    /tmp/gosched.go:17 +0x78

One goroutine constantly calls runtime.Gosched but another runnable goroutine is starved.
The root cause: Gosched puts the current goroutine onto the global run queue, then the thread checks its local run queue (empty), then checks the global run queue and picks up the same goroutine again. Meanwhile, another runnable goroutine sits in a remote per-P queue.
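
To make the sequence concrete, here is a toy model of that pick order. It is not the runtime's code: the sched type, findRunnable, gosched, simulate, and the goroutine names "main" and "worker" are all made up for illustration, and the remote P is assumed to be stuck in a spin loop so nothing else drains its queue.

```go
package main

import "fmt"

// sched is a toy stand-in for the scheduler state seen by one P.
// All names here are hypothetical; this is not the runtime's data layout.
type sched struct {
	local  []string // this P's local run queue
	global []string // global run queue
	remote []string // another P's local queue; its owner is spinning
}

// pop removes and returns the head of q, if any.
func pop(q *[]string) (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// findRunnable checks the local queue, then the global queue, and only
// then looks at the remote queue, as described in the issue.
func (s *sched) findRunnable() string {
	if g, ok := pop(&s.local); ok {
		return g
	}
	if g, ok := pop(&s.global); ok {
		return g
	}
	if g, ok := pop(&s.remote); ok {
		return g
	}
	return ""
}

// gosched puts the current goroutine on the global queue and picks the next one.
func (s *sched) gosched(cur string) string {
	s.global = append(s.global, cur)
	return s.findRunnable()
}

// simulate lets "main" yield n times and counts who gets picked each time.
func simulate(n int) map[string]int {
	s := &sched{remote: []string{"worker"}}
	picks := map[string]int{}
	cur := "main"
	for i := 0; i < n; i++ {
		cur = s.gosched(cur)
		picks[cur]++
	}
	return picks
}

func main() {
	picks := simulate(1000)
	// prints "main picked: 1000 worker picked: 0"
	fmt.Println("main picked:", picks["main"], "worker picked:", picks["worker"])
}
```

In this model the global queue is never empty at the moment it is checked, so the remote queue is never reached, no matter how many times "main" yields.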

This is probably not super critical, as it can happen only if there are goroutines in tight non-preemptible loops. Still, we could check local queues ahead of the global queue once in a while in findrunnable. We do the opposite hack in schedule: check the global queue ahead of the local queue once in a while. On the other hand, this would destroy locality, which is bad for performance...
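
A rough sketch of what such a periodic check could look like, again in a toy model rather than the real scheduler. It assumes the "local queues" in question are other Ps' queues (this P's own queue is empty in the scenario and is left out). The names sched, next, yieldsUntilWorker, and the stealEvery parameter are hypothetical, and the interval of 61 is only an illustrative value.

```go
package main

import "fmt"

// queue is a FIFO of goroutine names (toy model, hypothetical).
type queue []string

// pop removes and returns the head of the queue, if any.
func (q *queue) pop() (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// sched holds a tick counter plus the global queue and one remote P's queue.
type sched struct {
	tick   int
	global queue
	remote queue
}

// next picks the next goroutine. Every stealEvery-th call it looks at the
// remote queue before the global queue; stealEvery <= 0 disables that check.
func (s *sched) next(stealEvery int) string {
	s.tick++
	if stealEvery > 0 && s.tick%stealEvery == 0 {
		if g, ok := s.remote.pop(); ok {
			return g
		}
	}
	if g, ok := s.global.pop(); ok {
		return g
	}
	if g, ok := s.remote.pop(); ok {
		return g
	}
	return ""
}

// yieldsUntilWorker returns how many yields by "main" it takes before
// "worker" is picked, or -1 if that does not happen within limit yields.
func yieldsUntilWorker(stealEvery, limit int) int {
	s := &sched{remote: queue{"worker"}}
	for i := 1; i <= limit; i++ {
		s.global = append(s.global, "main") // main yields onto the global queue
		if s.next(stealEvery) == "worker" {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(yieldsUntilWorker(0, 1000))  // prints -1: worker is starved
	fmt.Println(yieldsUntilWorker(61, 1000)) // prints 61: worker runs on the 61st pick
}
```

The sketch only shows that an occasional out-of-order check bounds the starvation; it says nothing about the locality cost mentioned above, which a real change would have to measure.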

Activity

aclements (Member) commented on Dec 9, 2015

Interesting, though I agree with your assessment that this seems relatively low priority. I think if we fix the problem with non-preemptible loops (#10958) it will also fix this, and I'd much rather fix non-preemptible loops than try to put a hack in the scheduler (unless it can be more generally justified).

Hopefully SSA will make it easier to fix non-preemptible loops (because SSA will make everything easier, right? :)
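
Until loops are preemptible, user code can sidestep the hang by making the spin loops yield explicitly. The following is a minimal sketch of that workaround applied to the reproducer, not a proposed runtime fix; the function name spinUntilAll is made up here.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// spinUntilAll starts p goroutines that each increment a shared counter and
// then wait until all p increments are visible. Unlike the reproducer, the
// wait loops call runtime.Gosched, so no goroutine monopolizes a P.
func spinUntilAll(p int) uint32 {
	runtime.GOMAXPROCS(p)
	var x uint32
	for i := 0; i < p; i++ {
		go func() {
			atomic.AddUint32(&x, 1)
			for atomic.LoadUint32(&x) != uint32(p) {
				runtime.Gosched() // yield instead of spinning non-preemptibly
			}
		}()
	}
	for atomic.LoadUint32(&x) != uint32(p) {
		runtime.Gosched()
	}
	return atomic.LoadUint32(&x)
}

func main() {
	fmt.Println(spinUntilAll(4)) // prints 4
}
```

With every loop yielding, each P periodically returns to the scheduler, so the goroutine sitting in a remote per-P queue eventually gets to run.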

dvyukov (Member, Author) commented on Dec 9, 2015

Yes, fixing non-preemptible loops is definitely better.

aclements (Member) commented on Dec 10, 2015

Closing as a dup of #10958, though we can of course reopen if we want to take a more specific approach to this problem.

locked and limited conversation to collaborators on Dec 14, 2016