
ARM/AArch64 backend aggressively pessimizes code with broadcasted constants #102195


Open

dsharlet opened this issue Aug 6, 2024 · 20 comments

@dsharlet

dsharlet commented Aug 6, 2024

I'm having a lot of trouble with the ARM (32- and 64-bit) backends de-optimizing code related to broadcasted constants. There are several issues:

  • LLVM attempts to observe constants through memory, and propagate them.
  • LLVM moves broadcasts into loops.
  • LLVM spills broadcasts by redoing the broadcast, rather than spilling and reloading a vector.

Here's an example that demonstrates several issues: https://godbolt.org/z/chjx4d4vh

If the compiler compiled the code as written, there would be no register spills, because the constants would occupy half as many registers. I included a commented-out call to make_opaque as one attempted workaround, to trick the compiler into not treating these values as constants (at the expense of a function call...). It does accomplish that, but the compiler still moves the broadcasts (dup instructions) out of the loop and spills some of the registers.

I run into this issue very frequently. Any suggested workarounds, e.g. some annotation to force the compiler to keep a broadcast outside of the loop, or possible fixes to LLVM, would be very welcome. As it stands, I find vmla_lane_X intrinsics to be almost useless because of this issue.
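For reference, here is a stripped-down sketch of the kind of loop I mean (not the actual godbolt code; the constants, names, and shapes are made up, but the structure is the same: constants packed into vectors and indexed by lane):

#include <arm_neon.h>

// Sketch only: the four constants live in two d registers and are indexed by
// lane, so as written they should not need one broadcast register each.
void mla_lanes(float* out, const float* in, int n) {
  const float k_vals[4] = {0.25f, 0.5f, 0.75f, 1.0f};  // hypothetical constants
  const float32x2_t k_lo = vld1_f32(k_vals);
  const float32x2_t k_hi = vld1_f32(k_vals + 2);
  for (int i = 0; i + 4 <= n; i += 4) {
    float32x4_t x = vld1q_f32(in + i);
    float32x4_t acc = vld1q_f32(out + i);
    acc = vmlaq_lane_f32(acc, x, k_lo, 0);
    acc = vmlaq_lane_f32(acc, x, k_lo, 1);
    acc = vmlaq_lane_f32(acc, x, k_hi, 0);
    acc = vmlaq_lane_f32(acc, x, k_hi, 1);
    vst1q_f32(out + i, acc);
  }
}

What LLVM does to code shaped like this is re-materialize each used lane as a full-width dup, and the extra registers are what spill.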

@llvmbot
Member

llvmbot commented Aug 6, 2024

@llvm/issue-subscribers-backend-aarch64

Author: Dillon (dsharlet)


@pinskia

pinskia commented Aug 6, 2024

Note vmlaq_lane_f32 should not be using the fused multiply-add instruction either ...
I noticed GCC fixed that for GCC 12, while LLVM still has not been changed.
Note: if you want the fused multiply-add, you should use the vfmaq_lane_f32 intrinsic instead.
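Roughly, the swap is (sketch; as far as I know the operand order is the same for both intrinsics):

#include <arm_neon.h>

// Non-fused semantics: separate multiply and add, which is what vmlaq_lane_f32 means.
float32x4_t mla_lane(float32x4_t acc, float32x4_t x, float32x2_t k) {
  return vmlaq_lane_f32(acc, x, k, 0);
}

// Fused multiply-add by element (fmla); needs an FMA-capable target.
float32x4_t fma_lane(float32x4_t acc, float32x4_t x, float32x2_t k) {
  return vfmaq_lane_f32(acc, x, k, 0);
}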

@fbarchard

There are about 4 bugs here (see below).
I've reported (2), (3) and (4), but (1) is new and mainly what dsharlet is encountering.
When an immediate constant (e.g. 1) is duplicated and put into a vector at the top of a function, and there is register pressure, clang puts a broadcast inside the loop.
I've tried replacing the constant with a vector load of an array outside the loop, and it copies it to the stack: movdqa+movdqu to put it on the stack (now unaligned), and then movdqu inside the loop instead of a broadcast.

  1. Vectors are broadcast/loaded inside loops. I see this on x86 as well (clang 16).
  2. Lanes are replaced with pre-duplicated values, which causes register spills: the vectors are loaded, dup'ed to 4 vectors, stored to the stack, then loaded inside the loop.
  3. Constants, and code sequences that generate constants, are replaced with constants in memory plus a load, which is bigger and slower, especially on x86-32 with -fpic, where a call/pop/lea is needed to get the address, with a ret branch mispredict.
    This article http://0x80.pl/notesen/2023-01-19-avx512-consts.html shows ways to generate simple constants on x86, typically with 2 or 3 vector instructions. But clang replaces the instructions with a static constant in the text segment and a code sequence to load it, which is slow and what I'm trying to avoid (an intrinsics sketch of this is at the end of this comment).
    My preferred solution is for the clang x86 assembler to handle a movi pseudo-op that generates a code sequence, similar to arm. For example, to move the immediate value 1 into the bytes of a register in avx512/avx10:
    VPTERNLOGD $0xff, Z1, Z1, Z1
    VPABSB Z1, Z4 // 0x01010101
  4. Broadcast is slow on newer cpus, typically 5 cycles. I tried using a shuffle, which is faster, but clang replaced it with an extract + broadcast (inside the main loop). Embedded broadcasting has equally slow latency.

P.S. There was a bug in vld21_dup_f32, but it's fixed in head.
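For (3), a concrete intrinsics version of that sequence would be roughly the following (sketch; this is exactly the kind of thing clang currently folds back into a constant-pool load):

#include <immintrin.h>

// ternlog with imm 0xff produces all-ones; byte abs of -1 then gives 1 in every byte.
__m512i splat_one_epi8(void) {
  __m512i ones = _mm512_ternarylogic_epi32(_mm512_undefined_epi32(),
                                           _mm512_undefined_epi32(),
                                           _mm512_undefined_epi32(), 0xff);
  return _mm512_abs_epi8(ones);  // requires AVX-512BW
}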

@dsharlet
Author

dsharlet commented Aug 6, 2024

Note vmlaq_lane_f32 should not be using the fused multiply-add instruction either ...
I noticed GCC fixed that for GCC 12, while LLVM still has not been changed.
Note: if you want the fused multiply-add, you should use the vfmaq_lane_f32 intrinsic instead.

Thanks for pointing that out, I was unaware of this, especially because it generated the instruction I expected!

That said, I corrected the example (and added -ffast-math for good measure), and it still has the issue I reported: https://godbolt.org/z/6EM9drsc6

edit: I forgot to check; my workaround does work in this case now! However, it has the cost of a function call, so I would still really appreciate a fix for this bug, and also any workarounds that don't add overhead, if you can think of any.

@davemgreen davemgreen changed the title ARM backend aggressively pessimizes code with broadcasted constants ARM/AArch64 backend aggressively pessimizes code with broadcasted constants Aug 7, 2024
@davemgreen
Collaborator

Note vmlaq_lane_f32 should not be using the fused multiply-add instruction either ...
I noticed GCC fixed that for GCC 12, while LLVM still has not been changed.
Note: if you want the fused multiply-add, you should use the vfmaq_lane_f32 intrinsic instead.

It is apparently controlled by -ffp-contract, which defaults to on. The fmuladd intrinsics don't have the same optimizations for sinking splats into the loop BB as fma - I can add a quick fix for that.

For an actual fix, I agree it would be nice if the compiler understood and performed this optimization. It is not very obvious where that would happen considering the way llvm canonicalizes constants. In the meantime adding volatile to the array manages to address it somewhat, but leaves some extra stores in the preheader: https://godbolt.org/z/c87ejfo9T. There might be an alternative where the value is passed into a nop inline-asm block which the compiler cannot see through.
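Something along these lines is what I mean by the inline-asm alternative (untested sketch; "w" should be the NEON/FP register constraint on both AArch64 and 32-bit ARM):

#include <arm_neon.h>

// An empty asm that claims to read and modify v in a SIMD register. The compiler
// can no longer prove v is a splatted constant, so it should stop folding or
// re-materializing it, without forcing the value through memory.
static inline float32x4_t opaque(float32x4_t v) {
  __asm__("" : "+w"(v));
  return v;
}

The splat would then be wrapped once outside the loop, e.g. float32x4_t k = opaque(vdupq_n_f32(0.5f));.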

davemgreen added a commit to davemgreen/llvm-project that referenced this issue Aug 7, 2024
A fmuladd can be treated as a fma when sinking operands to the intrinsic,
similar to D126234.

Addresses a part of llvm#102195
@dsharlet
Author

dsharlet commented Aug 8, 2024

Thanks for the suggestion. I've been experimenting with volatile to work around this, and I've run into a few issues.

First off, ARM is not the only target affected by this general class of issues; it's just the one I was looking at when I worked up the motivation to file a bug.

I'm trying a pattern like this:

volatile float16x8_t constant = vreinterpretq_f16_u16(vmovq_n_u16(0x1234));
...
// Use in loop

This causes the compiler to reload the constant from memory every time it's used inside the loop, which I think is what I should expect.

The interesting thing is that if I use this same attempted workaround on x86, it causes the compiler to use the stack-spilled broadcast, just how I want!

What I'm confused about is why the x86 and ARM backends treat this so very differently. And I have to admit, I'm frustrated trying to chase down a workaround for this class of problems... it's such a clean, simple solution that works well on x86, but fails completely on ARM.

There might be an alternative where the value is passed into a nop inline-asm block which the compiler cannot see through.

I tried this, and the compiler generates pretty messy code that is noticeably slower, for example:

        @APP
        @NO_APP
        vmov.32 d17[0], r4
        cmp     r0, #16
        @APP
        @NO_APP
        vmov.32 d19[0], r6
        vmov.32 d18[0], r4
        @APP
        @NO_APP
        str     r7, [sp, #12]                   @ 4-byte Spill
        vmov.32 d21[0], r4
        vmov.32 d20[0], r6
        @APP
        @NO_APP
        str     r5, [sp, #4]                    @ 4-byte Spill
        str     r7, [sp, #20]                   @ 4-byte Spill
        vmov.32 d23[0], r4
        vmov.32 d22[0], r6
        @APP
        @NO_APP
        str     r5, [sp, #8]                    @ 4-byte Spill
        str     r7, [sp, #24]                   @ 4-byte Spill
        vmov.32 d25[0], r6
        vmov.32 d16[0], r12
        vmov.32 d24[0], r4
        @APP
        @NO_APP
        str     r7, [sp, #16]                   @ 4-byte Spill
        str     r4, [sp, #28]                   @ 4-byte Spill
        add     r7, r3, #4
        vmov.32 d26[0], r6
        ldr     r6, [sp, #12]                   @ 4-byte Reload
        vld1.16 {d0[], d1[]}, [r3:16]!
        vld1.16 {d28[], d29[]}, [r7:16]
        vld1.16 {d30[], d31[]}, [r3:16]
        vmov.32 d21[1], r11
        vmov.32 d20[1], r6
        ldr     r6, [sp, #4]                    @ 4-byte Reload
        vmov.32 d27[0], r12
        vmov.32 d23[1], r6
        ldr     r6, [sp, #20]                   @ 4-byte Reload
        vmov.32 d17[1], lr
        vmov.32 d22[1], r6
        ldr     r6, [sp, #8]                    @ 4-byte Reload
        vmov.32 d19[1], r8
        vmov.32 d25[1], r6
        ldr     r6, [sp, #24]                   @ 4-byte Reload
        vmov.32 d16[1], r9
        vmov.32 d24[1], r6
        ldr     r6, [sp, #16]                   @ 4-byte Reload
        vmov.32 d18[1], r10
        vmov.32 d27[1], r6
        ldr     r6, [sp, #28]                   @ 4-byte Reload
        vmov.32 d26[1], r6
        blo     .LBB0_2

I think that adding the inline asm to force storing the vectors also causes the compiler to spill and reload all the scalars that are live at the time too...?

@dsharlet
Author

dsharlet commented Aug 8, 2024

What I'm confused about is why the x86 and ARM backends treat this so very differently. And I have to admit, I'm frustrated trying to chase down a workaround for this class of problems... it's such a clean, simple solution that works well on x86, but fails completely on ARM.

To expand on this, the thing that works for x86 is:

volatile const __m128 vx = _mm_set1_ps(1.0f);

But the thing that works on ARM is:

volatile const float x = 1.0f;
const float32x4_t vx = vmovq_n_f32(x);

AFAICT, they really are achieving the same thing: forcing the compiler to broadcast, store the broadcast to the stack, and then reload that broadcasted value/keep it in a register. The confusing thing is that I don't actually expect either one to work: the x86 one seems like it would force the compiler to reload it every time it's used, rather than keep it in a register. And the ARM one seems like it shouldn't matter at all, but it does.

@dsharlet
Author

dsharlet commented Aug 8, 2024

Actually, I'm wrong: on x86, it does just reload the vector every time, as expected when it is volatile. It was tricky to notice because:

  • x86 can do many ops with memory operands, so the loads are free in such cases (good)
  • There are so few registers on x86 that the difference between reloading every time explicitly vs. spills was actually not that big :) (bad)

davemgreen added a commit that referenced this issue Aug 9, 2024
A fmuladd can be treated as a fma when sinking operands to the
intrinsic, similar to D126234.

Addresses a small part of #102195
@dsharlet
Author

I found a less destructive workaround. The inline assembly I was using in this comment above was __asm volatile("" : "+r"(x));.

__asm volatile("" : : "m"(x)) generates much cleaner code and is the solution I'm using now. I'd still love to see a proper fix for LLVM to treat broadcasts with a more reasonable cost, but this workaround is serviceable, despite resulting in an unnecessary store-load sequence in cases when no stack spill would have been necessary.
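In context, the pattern is roughly this (sketch; the kernel and constant are made up, the point is just where the asm goes):

#include <arm_neon.h>

void kernel(float* out, const float* in, int n) {
  float32x4_t k = vdupq_n_f32(0.5f);  // hypothetical broadcast constant
  // The "m" constraint forces k into a stack slot (the unnecessary store-load
  // mentioned above), and in practice that keeps the broadcast from being
  // redone or propagated inside the loop.
  __asm volatile("" : : "m"(k));
  for (int i = 0; i + 4 <= n; i += 4)
    vst1q_f32(out + i, vmulq_f32(vld1q_f32(in + i), k));
}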

@alexander-shaposhnikov
Collaborator

alexander-shaposhnikov commented Aug 17, 2024

cc @nikic , @efriedma-quic , @arsenm

@alexander-shaposhnikov
Collaborator

alexander-shaposhnikov commented Aug 27, 2024

I've looked into this issue a bit and created a small prototype for a MIR pass that collects broadcasts (whose users can be switched to indexed forms, e.g., FMLAv4i32_indexed) and attempts to perform this replacement & "combine" broadcasts. I don't see a good existing place for such a transformation, but I might be missing something. Any suggestions or advice would be greatly appreciated, maybe there is a simpler/better approach. @MatzeB @davemgreen @efriedma-quic @RKSimon

@fbarchard

We're not expecting the compiler to combine multiple dup's.
When there is an opportunity to 'combine broadcast' values, we do it on our end, combining several scalars into a structure loaded into a vector. For the scaling values, we use vmulq_lane_f32 to isolate the constant. For the others, we can vdupq_n_s32 the field before using it, etc.

We're not expecting dup'ed vectors to be made into scalars
If a single value is vld1q_dup_f32'ed or vdupq_n_f32'ed and then used with a mul or fma that supports lanes, there may be an opportunity to simplify the load/dup to only fill in a single element, e.g. use ldr instead of ld1_dup, and then use a lane.
But the initial set1 is usually outside the loop, while the mul-by-lane costs a micro-op and a vector unit to do the dup inside the loop, which is likely not a performance win on modern cpus.

Lanes are currently being converted to dups, which is problematic.
When constants are loaded outside the main loop and used with fma by lane inside the loop, clang sometimes vdups the values into multiple registers, causing register spills.
Outside the loop, each register gets dup'ed 4 times (4 floats) and 4 vectors are saved to the stack, and inside the loop the 4 fma lanes become 4 loads from the stack + 4 fmas.

On x86 I've tried replacing broadcast/set1 with a full vector and using shuffle instructions (faster than broadcast) to isolate the lanes I want, and clang replaced the shuffle with extract+broadcast.
On x86 clang underestimates the cost of broadcast/embedded broadcast: broadcast is 3 to 5 cycles and requires a register, and embedded broadcast has the same latency.

On x86 we'd expect memory arguments and embedded broadcasts to be used, to avoid register spills.
In one of our kernels there is an aligned vector of constants (the value 8 as a byte), pre-duplicated into a 32-byte array aligned in memory, and then loaded. The constants are loaded with alignment, but due to a spill they are saved to the stack unaligned, then loaded unaligned into a register inside the loop. The value is used for vsubb. If I use set1_epi8(8) it puts a broadcastb instead... both use a register, causing the main loop to save the register before the broadcast and then restore it, for a total of 4 instructions: save/broadcast/subb/restore.

On x86, I'd like to see set1() generate a code sequence to create vectors from immediates, instead of loading from memory. There are well-known techniques for many immediates; a simple one is to mov the immediate constant into a GPR and then broadcast it to a vector. Many constants are simple masks or powers of 2 that can be generated with 2 or 3 instructions.
See also http://0x80.pl/notesen/2023-01-19-avx512-consts.html
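For example, for set1_epi8(8) I'd hope for something roughly like this (sketch, AVX2 forms; in my experience clang constant-folds even this back into a load from memory, which is the problem):

#include <immintrin.h>

// immediate -> GPR -> xmm -> broadcast: no constant pool, no memory traffic.
__m256i splat8(void) {
  __m128i x = _mm_cvtsi32_si128(8);   // mov eax, 8 ; vmovd xmm0, eax
  return _mm256_broadcastb_epi8(x);   // vpbroadcastb ymm0, xmm0
}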

@dzaima

dzaima commented Aug 28, 2024

For the constant array, the important thing is to make it static; otherwise it'll be copied to the stack, which is where the stack stores/loads come from in the workaround-ful versions. (Compilers could conceivably improve on this if they can determine that there are no stores to the buffer, but for whatever reason neither gcc nor clang does; that can't apply with the asm optimization-fence though.) So here's a workaround version with ideal codegen.
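Roughly, the difference is just this (illustrative sketch, not the exact code from the link; constants are made up):

#include <arm_neon.h>

void kernel(float* out, const float* in, int n) {
  // 'static' keeps the array in .rodata; without it, the local array is copied
  // into this frame's stack and the loop reloads it from there.
  static const float k[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  const float32x4_t vk = vld1q_f32(k);
  for (int i = 0; i + 4 <= n; i += 4)
    vst1q_f32(out + i, vmulq_laneq_f32(vld1q_f32(in + i), vk, 0));
}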

@davemgreen
Collaborator

I've looked into this issue a bit and created a small prototype for a MIR pass that collects broadcasts (whose users can be switched to indexed forms, e.g., FMLAv4i32_indexed) and attempts to perform this replacement & "combine" broadcasts. I don't see a good existing place for such a transformation, but I might be missing something. Any suggestions or advice would be greatly appreciated, maybe there is a simpler/better approach. @MatzeB @davemgreen @efriedma-quic @RKSimon

Does it need to combine the various ways we generate constants into a constant pool? It sounds like a sensible approach, hopefully. We sometimes do similar things pre-ISel by hoisting the constants and hiding them behind a bitcast to make sure they stay in another block. Doing it after ISel would have the advantage that any optimizations based on the values can happen first, though. And it sounds like it is more general than just constants? If you have a prototype, let's see how it does in the backend.

For the constant array, the important thing is to make it static; otherwise it'll be copied to the stack, which is where the stack stores/loads come from in the workaround-ful versions. (Compilers could conceivably improve on this if they can determine that there are no stores to the buffer, but for whatever reason neither gcc nor clang does; that can't apply with the asm optimization-fence though.) So here's a workaround version with ideal codegen.

It's good to hear you found a workaround. I'm not sure what your real case looks like, but you might be able to use vmlaq_laneq_f32 to index more lanes and use fewer registers. This is unrelated and you might already be aware of it, but depending on what you need to calculate, it might be beneficial to reassociate the operations into multiple chains that operate in parallel. Some CPUs have multiple vector units that can perform multiple operations per cycle if there is enough instruction-level parallelism in the code. One big long chain will make it more difficult to get the best performance.
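For the reassociation point, the difference is something like this (illustrative sketch; whether it helps depends on the CPU and on being allowed to reassociate the floating-point math):

#include <arm_neon.h>

// One long chain: every fma waits on the previous one.
float32x4_t dot_one_chain(const float* a, const float* b, int n) {
  float32x4_t acc = vdupq_n_f32(0.0f);
  for (int i = 0; i + 4 <= n; i += 4)
    acc = vfmaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
  return acc;
}

// Two independent chains: the fmas can issue on separate vector pipes.
float32x4_t dot_two_chains(const float* a, const float* b, int n) {
  float32x4_t acc0 = vdupq_n_f32(0.0f);
  float32x4_t acc1 = vdupq_n_f32(0.0f);
  for (int i = 0; i + 8 <= n; i += 8) {
    acc0 = vfmaq_f32(acc0, vld1q_f32(a + i),     vld1q_f32(b + i));
    acc1 = vfmaq_f32(acc1, vld1q_f32(a + i + 4), vld1q_f32(b + i + 4));
  }
  return vaddq_f32(acc0, acc1);
}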

@qcolombet
Collaborator

I had a quick look after @alexander-shaposhnikov pinged me offline and I am wondering if I am looking at the right thing.
I do not see any spill within the loop from the compiler explorer links.

Alternatively could you share a .ll and the llc command line to reproduce the issue?

One thing that we won't get around, though, is the fact that instcombine propagates constants aggressively. Like @davemgreen said, we have a pass that mitigates some of this after the fact, pre-ISel (llvm/lib/Transforms/Scalar/ConstantHoisting.cpp).

@efriedma-quic
Collaborator

We could potentially add an "aarch64_fma_lane" intrinsic to LLVM, and make clang call it instead of using the generic fma intrinsic. That wouldn't really solve anything for generic code, but it would block the constant propagation optimization that's causing trouble here.

The general problem of packing arbitrary values into vectors registers to reduce register pressure is potentially interesting, but hard to solve well.

@qcolombet
Collaborator

Talked to @alexander-shaposhnikov offline and I understand what's left to fix now.
Anyhow, I agree with @efriedma-quic!

@davemgreen
Collaborator

I know we have added lane-wise intrinsics in the past, but I don't love it when we have had to do that, especially for something like fma, which is so widely used. The loss of performance from not constant folding / other optimizations would worry me.

There are always two types of users for the intrinsics (or a spectrum of people between those two extremes). There are expert users who know exactly which instructions they want where, and really just want the compiler to do register allocation and maybe a bit of scheduling for them. On the other end there are users who know much less about the architecture, let alone the micro-architecture. They often use higher-level SIMD libraries that are built up out of lower-level intrinsics and expect the compiler to do a lot of optimization to get them into the best shape possible. We need to consider both.

My vote would be to try and optimize this case in the backend if we have a patch to do it. It might not be perfect but we can make it better as we find more cases where it doesn't work and improve it over time.

@RKSimon
Collaborator

RKSimon commented Aug 30, 2024

On X86 we've added the X86FixupVectorConstants pass that detects constant vector loads / folded instructions that can be converted to broadcasts/extload/avx512-folded-broadcasts etc.

The next step is to remove the DAG folds of vector constants to VBROADCAST_LOAD/SUBV_BROADCAST_LOAD nodes and let the pass handle it entirely: https://github.com/RKSimon/llvm-project/tree/perf/broadcast-avx512 - but untangling the regressions isn't fun and I've gotten distracted with other things recently.

I've also been considering an unfold pass (#86669) - a bit like MachineLICM, but it could be used to help x86 cases where we might be able to save constant pool space, pack scalar constants into a single vector register, create constants without memory access, etc., depending on register pressure.

@alexander-shaposhnikov
Collaborator

Thanks everyone for the feedback.
I'm going to spend a few more days doing experiments and then will send a PR (~early next week).
