Open
Labels: NeedsFix (the path to resolution is known, but the work has not been done), Performance, compiler/runtime (issues related to the Go compiler and/or runtime)
Description
On a machine with many cores, the performance of sync.RWMutex.R{Lock,Unlock} degrades dramatically as GOMAXPROCS increases.
This test program:
package benchmarks_test

import (
	"fmt"
	"sync"
	"testing"
)

func BenchmarkRWMutex(b *testing.B) {
	for ng := 1; ng <= 256; ng <<= 2 {
		b.Run(fmt.Sprint(ng), func(b *testing.B) {
			var mu sync.RWMutex
			// Hold the write lock so every reader blocks until the timed section starts.
			mu.Lock()

			var wg sync.WaitGroup
			wg.Add(ng)

			// Split b.N iterations across ng goroutines; the last one takes the remainder.
			n := b.N
			quota := n / ng

			for g := ng; g > 0; g-- {
				if g == 1 {
					quota = n
				}

				go func(quota int) {
					for i := 0; i < quota; i++ {
						mu.RLock()
						mu.RUnlock()
					}
					wg.Done()
				}(quota)

				n -= quota
			}

			if n != 0 {
				b.Fatalf("Incorrect quota assignments: %v remaining", n)
			}

			b.StartTimer()
			mu.Unlock() // release all readers at once
			wg.Wait()
			b.StopTimer()
		})
	}
}
degrades by roughly a factor of 8 (from about 72 ns/op to about 600 ns/op) as it saturates threads and cores, presumably due to cache contention on &rw.readerCount:
# ./benchmarks.test -test.bench . -test.cpu 1,4,16,64
testing: warning: no tests to run
BenchmarkRWMutex/1         20000000    72.6 ns/op
BenchmarkRWMutex/1-4       20000000    72.4 ns/op
BenchmarkRWMutex/1-16      20000000    72.8 ns/op
BenchmarkRWMutex/1-64      20000000    72.5 ns/op
BenchmarkRWMutex/4         20000000    72.6 ns/op
BenchmarkRWMutex/4-4       20000000   105   ns/op
BenchmarkRWMutex/4-16      10000000   130   ns/op
BenchmarkRWMutex/4-64      20000000   160   ns/op
BenchmarkRWMutex/16        20000000    72.4 ns/op
BenchmarkRWMutex/16-4      10000000   125   ns/op
BenchmarkRWMutex/16-16     10000000   263   ns/op
BenchmarkRWMutex/16-64      5000000   287   ns/op
BenchmarkRWMutex/64        20000000    72.6 ns/op
BenchmarkRWMutex/64-4      10000000   137   ns/op
BenchmarkRWMutex/64-16      5000000   306   ns/op
BenchmarkRWMutex/64-64      3000000   517   ns/op
BenchmarkRWMutex/256       20000000    72.4 ns/op
BenchmarkRWMutex/256-4     20000000   137   ns/op
BenchmarkRWMutex/256-16     5000000   280   ns/op
BenchmarkRWMutex/256-64     3000000   602   ns/op
PASS
A "control" test, calling a no-op function instead of RWMutex methods, displays no such degradation: the problem does not appear to be due to runtime scheduling overhead.