[AArch64] Combine store (trunc X to <3 x i8>) to sequence of ST1.b #8052

fhahn · 2024-01-25T19:06:37Z

Improve codegen for stores of (trunc X to <3 x i8>) by lowering them to a
sequence of three ST1.b stores: first convert the truncate operand to either
v8i8 or v16i8, then extract the lanes holding the truncated results and store
each one.
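
The lane selection described above can be modelled outside the compiler. The following is only an illustrative Python sketch, not code from the patch: the function name `trunc_store_v3i8` is hypothetical, the fourth (padding) lane is modelled as zero although its value is irrelevant to the result, and it assumes a little-endian lane layout and source elements of 16 or 32 bits, so that the widened 4-element vector is 64 or 128 bits wide (matching v8i8 or v16i8).

```python
import struct


def trunc_store_v3i8(values, elem_bits, memory, addr):
    """Model store (trunc <3 x iN> to <3 x i8>) as three single-byte lane stores."""
    assert len(values) == 3 and elem_bits in (16, 32)
    elem_bytes = elem_bits // 8
    mask = (1 << elem_bits) - 1

    # Widen to 4 elements; the extra lane is a don't-care (modelled as 0 here).
    wide = [v & mask for v in values] + [0]

    # Reinterpret the 4-element vector as bytes in little-endian order:
    # 4 x i16 -> 8 bytes (like v8i8), 4 x i32 -> 16 bytes (like v16i8).
    fmt = "<4H" if elem_bits == 16 else "<4I"
    as_bytes = struct.pack(fmt, *wide)

    # The low byte of source element i sits in byte lane i * elem_bytes.
    # Each iteration stands in for one single-byte lane store.
    for i in range(3):
        memory[addr + i] = as_bytes[i * elem_bytes]
```

For 16-bit source elements the extracted byte lanes are 0, 2 and 4; for 32-bit elements they are 0, 4 and 8. Only three bytes are written, so the byte following the destination is left untouched.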

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: llvm#77790

PR: llvm#78637

(cherry-picked from eb678d8)

(cherry-picked from 8336515)

Extra tests for llvm#78637 llvm#78632 (cherry-picked from ff1cde5)

Extra tests for llvm#78637 llvm#78632 (cherry-picked from e7b4ff8)

Add extra tests with different load/store alignments for llvm#78637. (cherry-picked from 98509c7)


fhahn · 2024-01-25T19:06:47Z

@swift-ci please test

fhahn · 2024-01-25T19:06:53Z

@swift-ci please test llvm

fhahn · 2024-01-26T09:55:52Z

@swift-ci please test macos

fhahn added 5 commits January 25, 2024 18:58

[AArch64] Add vec3 tests with add between load and store.


44f385a

Extra tests for llvm#78637 llvm#78632 (cherry-picked from e7b4ff8)

fhahn merged commit b0f6f4f into swiftlang:stable/20230725 Jan 26, 2024

fhahn deleted the vec3-trunc-store branch January 26, 2024 20:12
