cmd/compile: optimize away some MOVQconverts #21572

Open
@josharian

We have to keep uintptrs and unsafe.Pointers separate so that the compiler can emit accurate stackmaps for the garbage collector. However, in some cases this generates unnecessary register moves.

Here's the example from the runtime I'm looking at. mapaccess1_fast32 currently ends:

	for {
		for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 4) {
			if *(*uint32)(k) == key && b.tophash[i] != empty {
				return add(unsafe.Pointer(b), dataOffset+bucketCnt*4+i*uintptr(t.valuesize))
			}
		}
		b = b.overflow(t)
		if b == nil {
			return unsafe.Pointer(&zeroVal[0])
		}
	}

This has an unnecessary nil check of b in the inner loop when evaluating b.tophash, so I'd like to change the outer loop structure to remove it:

	for ; b != nil; b = b.overflow(t) {
		for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 4) {
			if *(*uint32)(k) == key && b.tophash[i] != empty {
				return add(unsafe.Pointer(b), dataOffset+bucketCnt*4+i*uintptr(t.valuesize))
			}
		}
	}
	return unsafe.Pointer(&zeroVal[0])
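
For readers without the runtime source at hand, here is a minimal, self-contained sketch of the same restructuring on a toy bucket chain. The `bucket` type and both function names are hypothetical stand-ins, not runtime code. Note that the two forms agree only when the first bucket is non-nil, which the surrounding runtime code already guarantees at this point; the sketch assumes the same.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for the runtime's bucket type:
// a few fixed slots plus a link to an overflow bucket.
type bucket struct {
	used     [4]bool
	keys     [4]uint32
	vals     [4]int
	overflow *bucket
}

// lookupInfinite mirrors the current shape: an unconditional outer loop
// with the nil test at the bottom. It assumes b is non-nil on entry.
func lookupInfinite(b *bucket, key uint32) (int, bool) {
	for {
		for i := range b.keys {
			if b.keys[i] == key && b.used[i] {
				return b.vals[i], true
			}
		}
		b = b.overflow
		if b == nil {
			return 0, false
		}
	}
}

// lookupCond mirrors the proposed shape: the nil test moves into the loop
// condition, so b is known to be non-nil throughout the loop body.
func lookupCond(b *bucket, key uint32) (int, bool) {
	for ; b != nil; b = b.overflow {
		for i := range b.keys {
			if b.keys[i] == key && b.used[i] {
				return b.vals[i], true
			}
		}
	}
	return 0, false
}

func main() {
	second := &bucket{}
	second.keys[1], second.vals[1], second.used[1] = 7, 70, true
	first := &bucket{overflow: second}
	first.keys[0], first.vals[0], first.used[0] = 3, 30, true

	// Both forms should report the same result for every key.
	for _, k := range []uint32{3, 7, 11} {
		v1, ok1 := lookupInfinite(first, k)
		v2, ok2 := lookupCond(first, k)
		fmt.Println(k, v1, ok1, v2, ok2)
	}
}
```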

With this new structure, the nil check is gone, but we now have an extra register-register move, the instruction at 0x009f:

	0x0096 00150 (hashmap_fast.go:42)	MOVQ	"".t+40(SP), CX
	0x009b 00155 (hashmap_fast.go:42)	MOVWLZX	84(CX), DX
	0x009f 00159 (hashmap_fast.go:42)	MOVQ	AX, BX
	0x00a2 00162 (hashmap_fast.go:42)	LEAQ	-8(BX)(DX*1), DX
	0x00a7 00167 (hashmap_fast.go:42)	TESTB	AL, (CX)
	0x00a9 00169 (hashmap_fast.go:42)	MOVQ	(DX), AX
	0x00ac 00172 (hashmap_fast.go:42)	TESTQ	AX, AX
	0x00af 00175 (hashmap_fast.go:42)	JEQ	185

The register-register move is there because calculating b.overflow involves a uintptr/unsafe.Pointer conversion, which gets translated into a MOVQconvert; regalloc allocates a register for the converted value. However, the register move is pointless; the destination register (BX) is used in an LEAQ instruction and is dead thereafter.

In general, it seems that we should be able to rewrite away some MOVQconverts when they are used once, immediately, as part of some pointer math, which is the typical usage. The hard part is making sure that the rewrite rules are safe.

This should help codegen for the runtime, which does lots of pointer arithmetic.

cc @randall77
