runtime: spinbitmutex performance differs between 1.24.0 and 1.24.1, what changed? #72117

Closed
@catz-lw

Go version

1.23.6, 1.24.0, 1.24.1

Output of go env in your module/workspace:

AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/user/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build456387207=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/user/somerepo/go.mod'
GOMODCACHE='/home/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/lwcode/go/go1.24.0'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/user/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/go/go1.24.0/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I am running a performance test on AWS z1d.metal instances, which are 48-core Xeons: https://instances.vantage.sh/aws/ec2/z1d.metal

With 12 goroutines performing CPU-intensive tasks on these 48-core processors, I see a large performance decrease moving from 1.23.6 to 1.24.0 (about 18% more time per item), but 1.24.1 restores performance. Building the test with Go 1.24.0 and GOEXPERIMENT=nospinbitmutex also returns performance to 1.23.6 levels.

I do not see a performance difference using 4 or fewer goroutines, which is what led me to test using nospinbitmutex.
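To confirm which toolchain and experiment flags a given test binary was actually built with, the binary can report this itself. The sketch below assumes the "Go Version" row in the results comes from runtime.Version(), which appends an " X:..." suffix when non-default GOEXPERIMENT values are set. splitVersion is a hypothetical helper name and is not part of the original test.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// splitVersion (hypothetical helper) separates a runtime.Version() string
// such as "go1.24.0 X:nospinbitmutex" into the release and the list of
// experiments that follow the " X:" marker.
func splitVersion(v string) (string, []string) {
	release, exps, found := strings.Cut(v, " X:")
	if !found || exps == "" {
		return release, nil
	}
	return release, strings.Split(exps, ",")
}

func main() {
	release, exps := splitVersion(runtime.Version())
	fmt.Printf("release=%s experiments=%v\n", release, exps)
}
```

Logging this once at startup makes it harder to mix up results from binaries built with different toolchains.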

Here is the naive implementation of the test. Each goroutine processes a batch of roughly 100,000 messages, in order to reduce channel overhead.

	numComputeRoutines := runtime.NumCPU() / 4
	semaphoreComputeC := make(chan struct{}, numComputeRoutines)
	var processWg sync.WaitGroup
	processWg.Add(len(messageBatches))
	start := time.Now()
	var numBatchesProcessed atomic.Int64
	for _, batch := range messageBatches { // batches are large
		go func(batch []Message) {
			semaphoreComputeC <- struct{}{}
			defer func() { <-semaphoreComputeC }()
			startBatchTime := time.Now()
			// Expensive but should not involve any intercore communication
			processBatch(batch)
			batchDuration := time.Since(startBatchTime)
			batchDuration = time.Duration(int64(batchDuration) / int64(len(batch)))
			roundedBatchDuration := batchDuration.Round(time.Microsecond)
			numBatchesProcessed.Add(1)
			fmt.Printf("Processed %d/%d batches (this was %d events) %v per event\n", numBatchesProcessed.Load(), len(messageBatches), len(batch), roundedBatchDuration)
			processWg.Done()
		}(batch)
	}
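For anyone trying to reproduce this without the original workload, here is a minimal self-contained sketch of the same pattern: one goroutine per batch, with a buffered channel used as a semaphore. Message, processBatch, and runBatches are hypothetical stand-ins; the real processBatch is not shown in this report, so the placeholder below only does arbitrary arithmetic and may not trigger the same runtime behavior.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"time"
)

// Message is a hypothetical stand-in for the real message type.
type Message struct{ Payload uint64 }

// processBatch is a hypothetical CPU-bound placeholder for the real work.
// It only reads its own batch, so workers share no data.
func processBatch(batch []Message) uint64 {
	var sum uint64
	for _, m := range batch {
		x := m.Payload
		for i := 0; i < 64; i++ {
			x = x*6364136223846793005 + 1442695040888963407
		}
		sum += x
	}
	return sum
}

// runBatches starts one goroutine per batch, lets at most limit of them
// compute at the same time, and returns the number of messages processed
// together with the elapsed wall-clock time.
func runBatches(batches [][]Message, limit int) (int64, time.Duration) {
	if limit < 1 {
		limit = 1 // an unbuffered semaphore channel would block forever
	}
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var processed atomic.Int64
	start := time.Now()
	for _, batch := range batches {
		wg.Add(1)
		go func(batch []Message) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a compute slot
			defer func() { <-sem }() // release it when done
			_ = processBatch(batch)
			processed.Add(int64(len(batch)))
		}(batch)
	}
	wg.Wait()
	return processed.Load(), time.Since(start)
}

func main() {
	batches := make([][]Message, 8)
	for i := range batches {
		batches[i] = make([]Message, 100_000)
	}
	n, elapsed := runBatches(batches, runtime.NumCPU()/4)
	fmt.Printf("%s: %d messages in %v (%v per message)\n",
		runtime.Version(), n, elapsed, elapsed/time.Duration(n))
}
```

Running the same binary source under each toolchain (and with GOEXPERIMENT=nospinbitmutex on 1.24.0) gives directly comparable per-message timings.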

What did you see happen?

+------------------------+---------------------------+----------+
| ALL ITEMS              | 16m54.958s                |          |
+------------------------+---------------------------+----------+
| PER ITEM               | 73.859µs                  | 13741824 |
+------------------------+---------------------------+----------+
| Num Compute Goroutines | 12                        |          |
| Go Version             | go1.23.6                  |          |
+------------------------+---------------------------+----------+

+------------------------+---------------------------+----------+
| ALL ITEMS              | 16m41.381s                |          |
+------------------------+---------------------------+----------+
| PER ITEM               | 72.871µs                  | 13741824 |
+------------------------+---------------------------+----------+
| Num Compute Goroutines | 12                        |          |
| Go Version             | go1.24.0 X:nospinbitmutex |          |
+------------------------+---------------------------+----------+

+------------------------+---------------------------+----------+
| ALL ITEMS              | 20m0.336s                 |          |
+------------------------+---------------------------+----------+
| PER ITEM               | 87.349µs                  | 13741824 |
+------------------------+---------------------------+----------+
| Num Compute Goroutines | 12                        |          |
| Go Version             | go1.24.0                  |          |
+------------------------+---------------------------+----------+

+------------------------+---------------------------+----------+
| ALL ITEMS              | 16m41.746s                |          |
+------------------------+---------------------------+----------+
| PER ITEM               | 72.897µs                  | 13741824 |
+------------------------+---------------------------+----------+
| Num Compute Goroutines | 12                        |          |
| Go Version             | go1.24.1                  |          |
+------------------------+---------------------------+----------+


What did you expect to see?

I expected 1.24.1 to have the same performance regression as 1.24.0.

Metadata

    Labels

    BugReport: Issues describing a possible bug in the Go implementation.
    WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.
    compiler/runtime: Issues related to the Go compiler and/or runtime.
