Description
Go version
1.23.6, 1.24.0, 1.24.1
Output of go env in your module/workspace:
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/user/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build456387207=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/user/somerepo/go.mod'
GOMODCACHE='/home/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/lwcode/go/go1.24.0'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/user/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/go/go1.24.0/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'
What did you do?
I am running a performance test on z1d.metal AWS instances. These are 48-core Xeons: https://instances.vantage.sh/aws/ec2/z1d.metal
With 12 goroutines performing CPU-intensive tasks on these 48-core processors, I see a very large performance decrease moving from 1.23.6 to 1.24.0, but 1.24.1 restores performance. Building the test with Go 1.24.0 and GOEXPERIMENT=nospinbitmutex almost completely returns performance to 1.23.6 levels.
I do not see a performance difference using 4 or fewer goroutines, which is what led me to test using nospinbitmutex.
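When comparing builds like this, it can help to have the binary report which toolchain and experiments it was built with. The following is a minimal sketch, assuming the version string carries experiments in the "X:" suffix form that appears in the results table (for example "go1.24.0 X:nospinbitmutex"). The helper name hasExperiment and the comma-separated handling of multiple experiments are assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// hasExperiment reports whether a Go version string such as
// "go1.24.0 X:nospinbitmutex" lists the named experiment.
// Assumption: experiments follow " X:" and are comma-separated.
func hasExperiment(version, name string) bool {
	_, exps, found := strings.Cut(version, " X:")
	if !found {
		return false
	}
	for _, e := range strings.Split(exps, ",") {
		if strings.TrimSpace(e) == name {
			return true
		}
	}
	return false
}

func main() {
	v := runtime.Version()
	fmt.Printf("version=%q nospinbitmutex=%v NumCPU=%d GOMAXPROCS=%d\n",
		v, hasExperiment(v, "nospinbitmutex"), runtime.NumCPU(), runtime.GOMAXPROCS(0))
}
```

Logging this line at the start of each run makes it harder to mix up results from differently built binaries.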
Here is the naive implementation of the test. Each goroutine acts on a batch of roughly 100,000 messages in order to reduce channel overhead.
numComputeRoutines := runtime.NumCPU() / 4
semaphoreComputeC := make(chan struct{}, numComputeRoutines)
var processWg sync.WaitGroup
processWg.Add(len(messageBatches))
start := time.Now()
var numBatchesProcessed atomic.Int64
for _, batch := range messageBatches { // batches are large
	go func(batch []Message) {
		semaphoreComputeC <- struct{}{}
		defer func() { <-semaphoreComputeC }()
		startBatchTime := time.Now()
		// Expensive but should not involve any intercore communication
		processBatch(batch)
		batchDuration := time.Since(startBatchTime)
		batchDuration = time.Duration(int64(batchDuration) / int64(len(batch)))
		roundedBatchDuration := batchDuration.Round(time.Microsecond)
		numBatchesProcessed.Add(1)
		fmt.Printf("Processed %d/%d batches (this was %d events) %v per event\n", numBatchesProcessed.Load(), len(messageBatches), len(batch), roundedBatchDuration)
		processWg.Done()
	}(batch)
}
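For anyone who wants to try the same pattern without the original workload, here is a self-contained sketch of the harness. Message, processBatch, and runBatches are hypothetical stand-ins, since the report does not show the real types or the real per-message work; the arithmetic loop is only a placeholder for "expensive, no intercore communication". The batch counts and sizes in main are also made up, so this is not expected to reproduce the reported numbers.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"time"
)

// Message is a hypothetical stand-in for the real message type.
type Message struct{ Payload uint64 }

// processBatch is a placeholder for the real CPU-bound work.
// It touches no shared state.
func processBatch(batch []Message) uint64 {
	var sum uint64
	for _, m := range batch {
		x := m.Payload
		for i := 0; i < 1000; i++ {
			x = x*6364136223846793005 + 1442695040888963407
		}
		sum += x
	}
	return sum
}

// runBatches starts one goroutine per batch, lets at most `workers`
// of them compute at once via a channel semaphore, and returns the
// number of batches processed.
func runBatches(batches [][]Message, workers int) int64 {
	sem := make(chan struct{}, workers)
	var wg sync.WaitGroup
	var done atomic.Int64
	var sink atomic.Uint64 // keeps the computed results observable
	for _, batch := range batches {
		wg.Add(1)
		go func(batch []Message) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()
			sink.Add(processBatch(batch))
			done.Add(1)
		}(batch)
	}
	wg.Wait()
	return done.Load()
}

func main() {
	workers := max(1, runtime.NumCPU()/4)
	batches := make([][]Message, 16)
	items := 0
	for i := range batches {
		batches[i] = make([]Message, 10000)
		items += len(batches[i])
	}
	start := time.Now()
	n := runBatches(batches, workers)
	elapsed := time.Since(start)
	fmt.Printf("%s: %d batches, %d workers, %v total, %v per item\n",
		runtime.Version(), n, workers, elapsed, elapsed/time.Duration(items))
}
```

The sketch waits on the WaitGroup before reading the clock, so the total covers all batches; the fragment above does not show that step.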
What did you see happen?
+-------------------------------------------+-------------+-----------+
| ALL ITEMS                                 | 16m54.958s  |           |
+-------------------------------------------+-------------+-----------+
| PER ITEM                                  | 73.859µs    | 13741824  |
+-------------------------------------------+-------------+-----------+
| Num Compute Goroutines                    | 12          |           |
| Go Version                                | go1.23.6    |           |
+-------------------------------------------+-------------+-----------+

+-------------------------------------------+-------------+-----------+
| ALL ITEMS                                 | 16m41.381s  |           |
+-------------------------------------------+-------------+-----------+
| PER ITEM                                  | 72.871µs    | 13741824  |
+-------------------------------------------+-------------+-----------+
| Num Compute Goroutines                    | 12          |           |
| Go Version                                | go1.24.0 X:nospinbitmutex |
+-------------------------------------------+-------------+-----------+

+-------------------------------------------+-------------+-----------+
| ALL ITEMS                                 | 20m0.336s   |           |
+-------------------------------------------+-------------+-----------+
| PER ITEM                                  | 87.349µs    | 13741824  |
+-------------------------------------------+-------------+-----------+
| Num Compute Goroutines                    | 12          |           |
| Go Version                                | go1.24.0    |           |
+-------------------------------------------+-------------+-----------+

+-------------------------------------------+-------------+-----------+
| ALL ITEMS                                 | 16m41.746s  |           |
+-------------------------------------------+-------------+-----------+
| PER ITEM                                  | 72.897µs    | 13741824  |
+-------------------------------------------+-------------+-----------+
| Num Compute Goroutines                    | 12          |           |
| Go Version                                | go1.24.1    |           |
+-------------------------------------------+-------------+-----------+
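The PER ITEM figures above are consistent with dividing ALL ITEMS by the count in the third column (13741824). The sketch below redoes that arithmetic and derives the relative slowdown of 1.24.0 against 1.23.6, which works out to roughly 18%. The helper names perItem and slowdownPercent are made up for illustration, and reading the third column as an item count is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// perItem divides a total duration by an item count (integer
// nanosecond division). It returns 0 for a non-positive count.
func perItem(total time.Duration, items int64) time.Duration {
	if items <= 0 {
		return 0
	}
	return total / time.Duration(items)
}

// slowdownPercent returns how much slower `other` is than `base`,
// as a percentage of `base`.
func slowdownPercent(base, other time.Duration) float64 {
	return (float64(other)/float64(base) - 1) * 100
}

func main() {
	const items = 13741824 // assumed to be the item count from the tables

	// Totals copied from the ALL ITEMS rows.
	go1236, err := time.ParseDuration("16m54.958s")
	if err != nil {
		panic(err)
	}
	go1240, err := time.ParseDuration("20m0.336s")
	if err != nil {
		panic(err)
	}

	fmt.Println("go1.23.6 per item:", perItem(go1236, items))
	fmt.Println("go1.24.0 per item:", perItem(go1240, items))
	fmt.Printf("go1.24.0 vs go1.23.6: %.1f%% slower\n", slowdownPercent(go1236, go1240))
}
```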
What did you expect to see?
I expected 1.24.1 to have the same performance regression as 1.24.0.
Activity
ianlancetaylor commented on Mar 5, 2025
If I understand you correctly, things got better. So a bug was fixed. Is there a bug report here, or are you just asking a question? For questions, you will get better and faster answers using a forum rather than the issue tracker: see https://go.dev/wiki/Questions. Thanks.
catz-lw commented on Mar 5, 2025
Things did indeed get better, but I would expect a fix of this magnitude to be publicized in the release notes. I'll go try a forum.