Incorrect codegen for unaligned load/store (ARM NEON) #59081

@Nugine

https://clang.godbolt.org/z/nEfsT961d

#include <arm_neon.h>
#include <stdint.h>

void double32(uint8_t* data) {
    uint8x16x2_t x = vld1q_u8_x2(data);
    uint8x16_t y0 = vaddq_u8(x.val[0], x.val[0]);
    uint8x16_t y1 = vaddq_u8(x.val[1], x.val[1]);
    uint8x16x2_t y = {y0, y1};
    vst1q_u8_x2(data, y);
}

double32:
        vld1.8  {d16, d17, d18, d19}, [r0:256] ; <- should be [r0]
        vshl.i8 q11, q9, #1
        vshl.i8 q10, q8, #1
        vst1.8  {d20, d21, d22, d23}, [r0:256] ; <- should be [r0]
        bx      lr

https://developer.arm.com/documentation/den0018/a/NEON-Instruction-Set-Architecture/Alignment

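The :256 qualifier tells the hardware that the address in r0 is 256-bit (32-byte) aligned, but data is a plain uint8_t* with no such guarantee, so the access can fault when the pointer is not 32-byte aligned. It seems that there is no way to generate unaligned load/store instructions here.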
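A possible workaround, sketched under assumptions rather than verified against the affected Clang version: split each _x2 access into two single-vector vld1q_u8 / vst1q_u8 calls, which are only documented to need element (1-byte) alignment and so are expected to be emitted without an alignment qualifier. The names double32_split and double32_split_selfcheck are hypothetical and not part of the original report. The scalar branch is there only so the sketch builds on targets without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical variant of double32: doubles 32 bytes in place
   without using the _x2 intrinsics. */
void double32_split(uint8_t *data) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Two 16-byte loads instead of one 32-byte load. Assumption:
       the compiler emits these without an alignment qualifier. */
    uint8x16_t x0 = vld1q_u8(data);
    uint8x16_t x1 = vld1q_u8(data + 16);
    vst1q_u8(data, vaddq_u8(x0, x0));
    vst1q_u8(data + 16, vaddq_u8(x1, x1));
#else
    /* Scalar fallback for non-NEON targets; same wrap-around result. */
    for (size_t i = 0; i < 32; i++) {
        data[i] = (uint8_t)(data[i] + data[i]);
    }
#endif
}

/* Runs double32_split on a deliberately misaligned pointer.
   Returns 1 if every byte was doubled modulo 256, 0 otherwise. */
int double32_split_selfcheck(void) {
    _Alignas(32) uint8_t buf[64];
    uint8_t *p = buf + 1; /* buf is 32-byte aligned, so p is not */

    for (int i = 0; i < 32; i++) {
        p[i] = (uint8_t)(i * 9);
    }
    double32_split(p);
    for (int i = 0; i < 32; i++) {
        if (p[i] != (uint8_t)(i * 18)) {
            return 0;
        }
    }
    return 1;
}
```

Calling double32_split_selfcheck() on a 32-bit ARM target exercises exactly the misaligned case from the report. Whether the split form costs an extra instruction compared with the _x2 form depends on the compiler and is not measured here.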

Status: Done
