Closed
Description
What version of Go are you using (go version
)?
$ go version 1.15.5
Does this issue reproduce with the latest release?
Yes, also happens on tip.
What did you do?
The inline cost calculation for these seem inconsistent https://go.godbolt.org/z/YMKWjc
package ex
import (
"math"
"encoding/binary"
)
type Point struct {
X, Y float32
}
type Quad struct {
From, Ctrl, To Point
}
func EncodeQuad(d [24]byte, q Quad) {
binary.LittleEndian.PutUint32(d[0:], math.Float32bits(q.From.X))
binary.LittleEndian.PutUint32(d[4:], math.Float32bits(q.From.Y))
binary.LittleEndian.PutUint32(d[8:], math.Float32bits(q.Ctrl.X))
binary.LittleEndian.PutUint32(d[12:], math.Float32bits(q.Ctrl.Y))
binary.LittleEndian.PutUint32(d[16:], math.Float32bits(q.To.X))
binary.LittleEndian.PutUint32(d[20:], math.Float32bits(q.To.Y))
}
func EncodeQuad2(d [6]float32, q Quad) {
d[0] = q.From.X
d[1] = q.From.Y
d[2] = q.Ctrl.X
d[3] = q.Ctrl.Y
d[4] = q.To.X
d[5] = q.To.Y
}
func EncodeQuad3(d [6]uint32, q Quad) {
d[0] = math.Float32bits(q.From.X)
d[1] = math.Float32bits(q.From.Y)
d[2] = math.Float32bits(q.Ctrl.X)
d[3] = math.Float32bits(q.Ctrl.Y)
d[4] = math.Float32bits(q.To.X)
d[5] = math.Float32bits(q.To.Y)
}
The inline cost for those:
$ go tool compile -m -m ex.go | grep cost
ex.go:16:6: cannot inline EncodeQuad: function too complex: cost 318 exceeds budget 80
ex.go:25:6: can inline EncodeQuad2 with cost 42 as: func([6]float32, Quad) { d[0] = q.From.X; d[1] = q.From.Y; d[2] = q.Ctrl.X; d[3] = q.Ctrl.Y; d[4] = q.To.X; d[5] = q.To.Y }
ex.go:34:6: cannot inline EncodeQuad3: function too complex: cost 90 exceeds budget 80
There seems to be a really large difference between EncodeQuad2, EncodedQuad3 and EncodeQuad. I would expect all of them to be inlinable and similar in cost.
Metadata
Metadata
Assignees
Labels
Type
Projects
Milestone
Relationships
Development
No branches or pull requests
Activity
cespare commentedon Nov 23, 2020
As far as the Float32bits part goes, it could be similar to #42739.
randall77 commentedon Nov 23, 2020
The big difference in
EncodeQuad
is thatPutUint32
has to handle unaligned writes, whose expensiveness depends on the system. For architectures with unaligned writes it is cheap, but for those without unaligned writes it is still 24 byte writes. The otherEncodeQuad
s can assume aligned writes.At inline cost calculation time, we have to assume the worst for
PutUint32
because it all depends on whether the unaligned write combining optimization happens.BTW,
EncodeQuad*
doesn't do anything. It encodes into an argument that is thrown away upon return. You want to use a slice or pointer to an array, not a bare array.egonelbre commentedon Nov 23, 2020
Thanks, I'm aware. The code is a simplification of #28941 (comment) to ensure BCE works.
egonelbre commentedon Jan 2, 2021
I was thinking about the conversion overhead in math.Float32bits. Maybe
(*X)(unsafe.Pointer(x))
should be considered as zero cost?I currently added a new case in
inl.go
as:I'm not entirely sure, whether the
if
handles all the conditions properly, however it did cut 2 off the cost ofmath.Float32bits
.It didn't make it equivalent to just returning, however it seems like a worthwhile improvement nevertheless.
egonelbre commentedon Jan 2, 2021
I guess the whole
*(*X)(unsafe.Pointer(&x))
would be low-cost as long as the type x fits into a register.This is more hacky case than the previous:
randall77 commentedon Jan 2, 2021
Perhaps all
OCONVNOP
s should be zero cost.egonelbre commentedon Jan 2, 2021
Yeah, that is simpler. I wasn't sure whether string to byte (or any other similar) conversions are represented by
OCONVNOP
. After reading the documentation, they obviously aren't.gopherbot commentedon Jan 3, 2021
Change https://golang.org/cl/281232 mentions this issue:
cmd/compile: reduce inline cost of OCONVOP
1 remaining item