cmd/trace: visualize time taken by syscall #57307
Here are the notes I took on Go runtime tracing and syscall behavior. At the time I looked at Go 1.19's runtime, but I don't think anything of importance has changed in 1.20. I hope this helps.
|
cc @golang/runtime @mknyszek |
I'm not sure this has to be a proposal. Are there any API changes here, or cases where this might potentially break an existing program? Thanks. |
There aren't. But I didn't know which template would be appropriate. I hesitated between "proposal" and "bug". While this is actually more of a feature request, between the two I'd choose the former. Which one should I have chosen? |
The "Bugs" template, I guess. I considered changing it to "Bugs or feature requests", but that would lead too many people away from the "Proposal" template appropriate for API changes. So I'm not sure how to make this better. |
Thanks @dominikh, that's very useful. Your notes explain (I think) exactly what's going on, and why my sample code/trace was not representative of the problems I experienced in production nearly a year ago.
In the sample trace, I'm artificially issuing long-running read syscalls from /dev/urandom, because I wanted to make the "problem" clear on a human timescale. In the production trace where I couldn't get slices even with the cmd/trace patch above, the situation was different:
All of that likely prevented sysmon from finding those syscalls. So this issue is not the only one preventing good observability. Ideally Go would:
|
I modified the sample program to have an additional goroutine that makes many short syscalls, alongside the main goroutine which makes long syscalls. As @dominikh said, a lot of them (but not all) don't have a link (they're instants, not slices). Comparing non-patched vs patched cmd/trace shows some issues when this happens: EDIT: The image below is wrong, this wrongly patched version omits
The long syscalls (rendered as a slice in the patched version on the right) are displayed on the proc that started executing it (proc 0 on the screenshot). Trace events for the first long syscall:
Between timestamps 106527657 and 2326550570, there were many other syscalls from the shortsyscalls function. For example (most syscalls by g 7 only have a
It also started on proc 0. This indicates that the long-running syscall was moved to another proc close to when it started, or perhaps immediately. I suppose that is proc 1000003 mentioned in GoSysExit, which may be some virtual ID for the proc attached (?) to the kernel thread servicing the event poller.

Given that non-blocking syscalls (glad I picked those) don't run on a dedicated thread, it's questionable how to display this in the main view. Perhaps in that view we could generate fake procs for every multiplexed syscall, and also display procs for syscalls that can't be moved to the netpoller thread (since they need their own M). This needs more discussion IMHO. However, what shouldn't be contentious is presenting slices instead of instants in the goroutine-centric view. The patched cmd/trace does well here:

On a related note, the goroutine analysis page shows a breakdown of goroutine time spent in either: Execution, Network wait, Sync block, Blocking syscall, Scheduler wait, GC sweeping, GC pause. The longsyscalls loop spends nearly all of its time in blocking syscall, which matches what we know. I'd have expected at least 20% of the time to be in syscalls (I've thought about measuring this on the Go side, but calling time.Now() so often might disturb things). This further calls out that it would be nice if every syscall had a time measurement.

New program:

package main

import (
	"fmt"
	"math"
	"os"
	"runtime/trace"
	"time"
)

func main() {
	if err := rmain(); err != nil {
		fmt.Fprintf(os.Stderr, "error: %v", err)
		os.Exit(1)
	}
}

func rmain() error {
	if err := trace.Start(os.Stdout); err != nil {
		return err
	}
	defer trace.Stop()

	go shortsyscalls()
	return longsyscalls()
}

func longsyscalls() error {
	f, err := os.Open("/dev/urandom")
	if err != nil {
		return err
	}
	var buf [1024 * 1024 * 1024]byte
	for i, last := 0, time.Now(); i < 5; i++ {
		fmt.Fprintf(os.Stderr, "%d: read(%q) start\n", i, f.Name())
		n, err := f.Read(buf[:])
		if err != nil || n == 0 {
			panic("PANIC!")
		}
		now := time.Now()
		fmt.Fprintf(os.Stderr, "%d: read(%q) end (%v)\n", i, f.Name(), now.Sub(last))
		last = now
	}
	return nil
}

// Never stops.
func shortsyscalls() {
	var buf [1024 * 1024 * 10]byte
	for i, last := 0, time.Now(); i < math.MaxInt; i++ {
		f, err := os.Open("./longsyscall.go") // After the first read, this should be very fast (page cache).
		if err != nil {
			panic(err)
		}
		if i%1000 == 0 {
			fmt.Fprintf(os.Stderr, "%d: read(%q) start\n", i, f.Name())
		}
		n, err := f.Read(buf[:])
		if err != nil || n == 0 {
			panic("PANIC!")
		}
		now := time.Now()
		if i%1000 == 0 {
			fmt.Fprintf(os.Stderr, "%d: read(%q) end (%v)\n", i, f.Name(), now.Sub(last))
		}
		last = now
		f.Close()
	}
} |
@aktau can you explain the format of your debug output? That is, what exactly does
represent? With regard to the multiple lines and their order. |
There's no moving between Ps here. The goroutine was simply running on p=0 all along, until it eventually invoked a syscall. (The corresponding event indicating this would be the last GoStart before the GoSysCall)
1000003 is a synthetic ID that gets used for "returns from syscalls". There's no reason like a kernel thread or event poller for this, just the fact that when the syscall returns, there is no P associated with that event, but we wanted some number. Other such IDs are
|
For every GoSysCall event in the trace, print metadata about it (first line, without the
I think I've just been misusing terminology here. The main trace view is a proc-centric view, and I wanted to express that the syscall was (apart from the very beginning perhaps) not executing on the proc (0) anymore. When I read "proc" in that view my mind just thinks "thread". Stated differently: some other goroutine (like g7) could occupy proc 0 when the main goroutine (g1) was in the read syscall.
I was wondering why
Debug log code:

+ var sysCalls, sysCallsNoLink int
for _, ev := range ctx.parsed.Events {
// Handle state transitions before we filter out events.
switch ev.Type {
@@ -817,7 +819,29 @@ func generateTrace(params *traceParams, consumer traceConsumer) error {
case trace.EvGoUnblock:
ctx.emitArrow(ev, "unblock")
case trace.EvGoSysCall:
- ctx.emitInstant(ev, "syscall", "")
+ sysCalls++
+ if ev.Link != nil {
+ ctx.emitSlice(ev, "syscall")
+ var sb strings.Builder
+ fmt.Fprintf(&sb, "syscall event with link (%d/%d):\n %+v", sysCallsNoLink, sysCalls, ev)
+ for prev, itev := ev, ev.Link; itev != nil; prev, itev = itev, itev.Link {
+ sb.WriteString("\n -> ")
+ fmt.Fprintf(&sb, " %2.3fs ", time.Duration(itev.Ts-prev.Ts).Seconds())
+ sb.WriteString(itev.String())
+ }
+ log.Println(sb.String())
+ } else {
+ sysCallsNoLink++
+ if (sysCallsNoLink % 100000) == 0 {
+ log.Printf("sampled syscall event without link (%d/%d):\n %+v", sysCallsNoLink, sysCalls, ev)
+ }
+ ctx.emitInstant(ev, "syscall", "")
+ }
case trace.EvGoSysExit:
ctx.emitArrow(ev, "sysexit")
case trace.EvUserLog:
diff --git a/internal/trace/parser.go b/internal/trace/parser.go
index b091a85..a17db4b 100644
--- a/internal/trace/parser.go
+++ b/internal/trace/parser.go
@@ -974,13 +974,29 @@ func PrintEvent(ev *Event) {
func (ev *Event) String() string {
desc := EventDescriptions[ev.Type]
w := new(strings.Builder)
- fmt.Fprintf(w, "%v %v p=%v g=%v off=%v", ev.Ts, desc.Name, ev.P, ev.G, ev.Off)
+ fmt.Fprintf(w, "%14d %-14s p=%-7d g=%-14d stkID=%-10d seq=%-3d", ev.Ts, desc.Name, ev.P, ev.G, ev.StkID, ev.seq)
+ if len(desc.Args) > 0 {
+ w.WriteString(" {")
+ }
for i, a := range desc.Args {
fmt.Fprintf(w, " %v=%v", a, ev.Args[i])
}
+ if len(desc.Args) > 0 {
+ w.WriteString(" }")
+ }
+ if len(desc.SArgs) > 0 {
+ w.WriteString(" {")
+ }
for i, a := range desc.SArgs {
fmt.Fprintf(w, " %v=%v", a, ev.SArgs[i])
}
+ if len(desc.SArgs) > 0 {
+ w.WriteString(" }")
+ }
return w.String()
} |
This issue is fixed in the new tracer. All system calls in Go 1.22 and later traces have a duration in the trace. |
Currently (Go 1.20-dev), syscalls as processed by
go tool trace
from a runtime/trace profile visualize as points in time, being about 1 pixel wide no matter how long they took (see reproducer below). It would be nicer if they were slices, to visualize how much time is spent in (conceptually blocking) syscalls for a given goroutine. Somewhere in the last release cycle (1.19), I tried making the obvious patch to cmd/trace:
And I remembered it didn't work: most (all?)
EvGoSysCall
events don't have a link to the finish event. But before filing this proposal I tried patching up cmd/trace from 1.20, and to my surprise it did work. I assume something changed in Go 1.20 and now more (most?) EvGoSysCall events have a link, which is great. Debugging output of my patched go tool trace:
Resulting trace view comparison:
Reproducer:
The program that generated this trace was:
Running it:
$ go run longsyscall.go > longsyscall.trace
start...took 2.203378744s
start...took 2.207477581s
start...took 2.211924562s
start...took 2.202174787s
start...took 2.219617454s