Description
In rolling forward the SetMaxHeap patches, I discovered that, as a result of the fix to #19112, if ReadMemStats is called in a tight loop, then it might keep stomping on other goroutines trying to stop the world by reacquiring worldsema before others get the chance. The worst case is if it stomps on a goroutine trying to stop the world for either starting or stopping a GC, which messes with pacing significantly.
The good news is, ReadMemStats is known to be a heavyweight operation, and so isn't called often. This is really only a problem if it's called in a tight loop. In the reproduction, the "tight loop" consisted of a single goroutine performing a large allocation, calling ReadMemStats, then doing some arithmetic and predictable branches. Adding in even a 1 ms sleep is enough to resolve the issue. This case is more likely to appear in tests and examples, where the behavior will be unexpected.
Activity
mknyszek commented on Jul 28, 2020
CC @aclements @dr2chase
Fix is already uploaded, just need to associate it with this issue.
mknyszek commented on Jul 28, 2020
FTR: the fix is to make starting the world hand off worldsema directly to one of the waiting goroutines. This makes the behavior around the semaphore more fair, and helps prevent the situation described in the original post.
gopherbot commented on Jul 28, 2020
Change https://golang.org/cl/243977 mentions this issue: runtime: release worldsema with a direct G handoff
odeke-em commented on Jul 29, 2020
Thank you for reporting this @mknyszek and thank you for having the fix steady and ready! Could we perhaps include a footnote BUG in the runtime package about this, or perhaps also in the release notes as an acknowledgement of the bug?