Skip to content

runtime: Stack() calls lock against each other unnecessarily, even when single-goroutine #56400

Open
@seebs

Description

@seebs

What version of Go are you using (go version)?

$ go version
1.19.2

Does this issue reproduce with the latest release?

y

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
(encountered on Linux but I assume it shows up everywhere)

What did you do?

I saw a profile that showed 85% of CPU time in calls to runtime.Stack(), and 81% of total CPU time in calls to printlock and printunlock.

What did you expect to see?

I expected calls that were independent of each other, writing into buffers rather than to shared streams, on different CPUs, to proceed without lock contention.

What did you see instead?

Every single write in any part of the traceback process goes through print (and friends) ultimately, which means that it uses printlock and printunlock. This is true even when g.writebuf is non-nil and the different gs will be writing into different buffers. As a result, the stack tracebacks block each other, and since they're both making large numbers of individual calls to print, that results in heavy lock contention. (To complicate things, there's cases where a non-nil g.writebuf still doesn't cause the writes to go into the buffer, as when gp.m is dying.)

This looks like a mess to fix. It's "easy" to fix by changing the code to thread a write target through it, except that you absolutely don't want any of this code dealing with stuff like io.Writer or other higher-level abstractions. I also don't know whether it's actually completely safe to have multiple copies of print running, against different write buffers, on different gs. In the case where you really are writing to a stream, you probably don't want a buffer. You don't want to have two copies of all of this code, identical except for their output. But having two calls to runtime.Stack() generate hundreds of contended locks seems pretty bad too...

Activity

heschi

heschi commented on Oct 25, 2022

@heschi
Contributor

cc @golang/runtime

added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Oct 25, 2022
added this to the Backlog milestone on Oct 25, 2022
mknyszek

mknyszek commented on Oct 26, 2022

@mknyszek
Contributor

In triage this doesn't seem like high priority to look at. Stack could probably avoid the printlock but that might be tricky today. This might be more doable if we clean up gentraceback in general (CC @aclements).

mknyszek

mknyszek commented on Oct 26, 2022

@mknyszek
Contributor

(In other words, effectively blocking this on #54466.)

prattmic

prattmic commented on Oct 26, 2022

@prattmic
Member

Note that using runtime.Callers should avoid this lock contention.

josharian

josharian commented on Oct 26, 2022

@josharian
Contributor

There was also some discussion in the Gophers Slack about one possible hack fix:

 func printlock() {
-       mp := getg().m
+       gp := getg()
+       if gp.writebuf != nil {
+               return
+       }
+       mp := gp.m

(and the same for printunlock)

ianlancetaylor

ianlancetaylor commented on Oct 26, 2022

@ianlancetaylor
Contributor

That's not the only reason printlock exists. It also ensures that if two goroutines call print in parallel with multiple arguments then the output will not be interleaved.

josharian

josharian commented on Oct 26, 2022

@josharian
Contributor

Right. But if I read the code correctly (big if), when gp.writebuf is non-nil, we are always non-concurrent.

ianlancetaylor

ianlancetaylor commented on Oct 26, 2022

@ianlancetaylor
Contributor

Perhaps you are not aware that the compiler generates calls to printlock. When the compiler sees print(a, b, c), it generates code like

    printlock()
    printint(a)
    printint(b)
    printint(c)
    printunlock()
josharian

josharian commented on Oct 26, 2022

@josharian
Contributor

I am indeed quite aware of that. :)

In those cases, however, gp.writebuf is nil (the output goes straight to stderr). gp.writebuf is only non-nil for runtime.Stack and a runtime tests.

ianlancetaylor

ianlancetaylor commented on Oct 26, 2022

@ianlancetaylor
Contributor

Sure, but my point was that if two different goroutines call print in parallel with multiple arguments, their output will not be interleaved. And that still seems true to me. Am I missing something?

(It's true that the output from print can be interleaved with other output that the program might be writing to standard error. That seems harder to avoid, though as you imply it could be done by setting writebuf.)

josharian

josharian commented on Oct 27, 2022

@josharian
Contributor

Am I missing something?

I hope so? :P

The idea of the patch above was that we effectively disable printlock and printunlock only on the relevant G when its writebuf is non-nil. writebuf is non-nil only when runtime.Stack is actively using its (systemstack's) G to print a stack. All other goroutines that are printing will have nil writebufs, and the locking will proceed as usual.

ianlancetaylor

ianlancetaylor commented on Oct 27, 2022

@ianlancetaylor
Contributor

Argh, sorry, I get it now. Completely misunderstood. Apologies.

seebs

seebs commented on Oct 27, 2022

@seebs
ContributorAuthor

I did note that runtime.Callers produces part of this data without locking, but there's no way to get the argument info, if you want it, and also there's no easy way to just get the pretty-printed output. So, it solves the performance problem but it loses information; so far as I can tell, there's nothing available to a user that can produce the call stack with args that isn't this.

So far as I can tell, writebuf can only be non-nil during a call to runtime.Stack(), and during one specific test that also assigns to it for some reason. There's no other uses in the code.

So I think in practice the "just dummy out locks when writebuf is non-nil" behavior would be correct, but I'd feel a bit nervous about it being brittle. I'm also not 100% sure I know what it would do in the case where you request tracebacks for all goroutines, but I assume it works. (I also note that if you request tracebacks for all goroutines, the world is stopped, so, good news, no lock contention, bad news, you already took every performance hit.)

That said, I'm certainly not about to volunteer to try to make the traceback stuff work differently, that looks like a really hard problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.Performancecompiler/runtimeIssues related to the Go compiler and/or runtime.

    Type

    No type

    Projects

    Status

    Todo

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @josharian@seebs@mknyszek@prattmic@ianlancetaylor

        Issue actions

          runtime: Stack() calls lock against each other unnecessarily, even when single-goroutine · Issue #56400 · golang/go