[AArch64] Combine store (trunc X to <3 x i8>) to sequence of ST1.b #8052

Merged: 5 commits, Jan 26, 2024

fhahn commented Jan 25, 2024

Improve codegen for store (trunc X to <3 x i8>) by lowering it to a
sequence of 3 ST1.b stores: first convert the truncate operand to either
v8i8 or v16i8, then extract the lanes holding the truncate results and
store each one.
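
The steps above can be modelled in plain scalar C++. This is only a sketch of the idea, not the code from the patch: the function name `storeTruncV3I32ToV3I8` is hypothetical, the source is assumed to be a <3 x i32> already widened to four lanes, and the register is assumed to use a little-endian lane layout, so the truncated byte of source lane i sits at byte lane 4 * i of the v16i8 view.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the lowering; not the LLVM implementation.
// Models store (trunc <3 x i32> X to <3 x i8>), assuming X was widened to
// four 32-bit lanes and the register is laid out little-endian.
inline void storeTruncV3I32ToV3I8(const std::array<std::uint32_t, 4> &x,
                                  std::uint8_t *dst) {
  // Step 1: view the 4 x i32 register as 16 x i8 (the v16i8 view).
  std::array<std::uint8_t, 16> bytes{};
  for (std::size_t lane = 0; lane < x.size(); ++lane)
    for (std::size_t b = 0; b < 4; ++b)
      bytes[lane * 4 + b] = static_cast<std::uint8_t>(x[lane] >> (8 * b));

  // Step 2: the low byte of each source lane is the truncated value, so
  // the results live at byte lanes 0, 4 and 8.
  constexpr std::size_t stride = sizeof(std::uint32_t);

  // Step 3: write each result with its own single-byte store, the role
  // ST1.b plays in the generated code. dst[3] is never written.
  for (std::size_t i = 0; i < 3; ++i)
    dst[i] = bytes[i * stride];
}
```

Because each result is written by a separate one-byte store, exactly three bytes of memory are modified, which matches the size of a <3 x i8> value.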

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: llvm#77790

PR: llvm#78637

(cherry-picked from eb678d8)

fhahn added 5 commits January 25, 2024 18:58

fhahn Florian Hahn
(cherry-picked from 8336515)

fhahn Florian Hahn
Extra tests for
  llvm#78637
  llvm#78632

(cherry-picked from ff1cde5)

fhahn Florian Hahn
Extra tests for
  llvm#78637
  llvm#78632

(cherry-picked from e7b4ff8)

fhahn Florian Hahn
Add extra tests with different load/store alignments for
llvm#78637.

(cherry-picked from 98509c7)

fhahn Florian Hahn
…lvm#78637)

(cherry-picked from eb678d8)
fhahn commented Jan 25, 2024

@swift-ci please test

fhahn commented Jan 25, 2024

@swift-ci please test llvm

fhahn commented Jan 26, 2024

@swift-ci please test macos

fhahn merged commit b0f6f4f into swiftlang:stable/20230725 on Jan 26, 2024
fhahn deleted the vec3-trunc-store branch on January 26, 2024 20:12