[GlobalISel] Import extract/insert subvector #110287

Merged: 9 commits into llvm:main, Sep 30, 2024

Conversation


tschuett commented on Sep 27, 2024

llvmbot (Member) commented on Sep 27, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

Tests are limited to fixed-length vectors.

Test: AArch64/GlobalISel/irtranslator-subvector.ll

Reference:

https://llvm.org/docs/LangRef.html#llvm-vector-extract-intrinsic
https://llvm.org/docs/LangRef.html#llvm-vector-insert-intrinsic
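
For illustration, a minimal fixed-length example of the two intrinsics (the declarations follow the LangRef signatures linked above; the function and value names are hypothetical and not part of this patch):

declare <2 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32> %vec, i64 %idx)
declare <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> %vec, <2 x i32> %subvec, i64 %idx)

define <4 x i32> @subvector_example(<4 x i32> %a, <2 x i32> %b) {
  ; Extract the low <2 x i32> subvector of %a, starting at element 0.
  %lo = call <2 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32> %a, i64 0)
  ; Insert %b over elements 2..3 of %a; the index must be a constant multiple
  ; of the subvector length.
  %r = call <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> %a, <2 x i32> %b, i64 2)
  ret <4 x i32> %r
}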


Full diff: https://github.com/llvm/llvm-project/pull/110287.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+14)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-subvector.ll (+78)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 7ff8d2446eec5d..a0649f712bd642 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2588,6 +2588,20 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
                          getOrCreateVReg(*CI.getOperand(0)),
                          getOrCreateVReg(*CI.getOperand(1)));
     return true;
+  case Intrinsic::vector_extract: {
+    ConstantInt *Index = cast<ConstantInt>(CI.getOperand(1));
+    MIRBuilder.buildExtractSubvector(getOrCreateVReg(CI),
+                                     getOrCreateVReg(*CI.getOperand(0)),
+                                     Index->getZExtValue());
+    return true;
+  }
+  case Intrinsic::vector_insert: {
+    ConstantInt *Index = cast<ConstantInt>(CI.getOperand(2));
+    MIRBuilder.buildInsertSubvector(
+        getOrCreateVReg(CI), getOrCreateVReg(*CI.getOperand(0)),
+        getOrCreateVReg(*CI.getOperand(1)), Index->getZExtValue());
+    return true;
+  }
   case Intrinsic::prefetch: {
     Value *Addr = CI.getOperand(0);
     unsigned RW = cast<ConstantInt>(CI.getOperand(1))->getZExtValue();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-subvector.ll b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-subvector.ll
new file mode 100644
index 00000000000000..bdcd8e3d99af87
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-subvector.ll
@@ -0,0 +1,78 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+; RUN: llc -O0 -mtriple=aarch64-linux-gnu -global-isel -stop-after=irtranslator %s -o - | FileCheck %s
+
+define i32 @extract_v4i32_vector_insert_const(<4 x i32> %a, <2 x i32> %b, i32 %c) {
+  ; CHECK-LABEL: name: extract_v4i32_vector_insert_const
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   liveins: $d1, $q0, $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY $d1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+  ; CHECK-NEXT:   [[INSERT_SUBVECTOR:%[0-9]+]]:_(<4 x s32>) = G_INSERT_SUBVECTOR [[COPY]], [[COPY1]](<2 x s32>), 0
+  ; CHECK-NEXT:   [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[INSERT_SUBVECTOR]](<4 x s32>), [[C]](s64)
+  ; CHECK-NEXT:   $w0 = COPY [[EVEC]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+entry:
+  %vector = call <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> %a, <2 x i32> %b, i64 0)
+  %d = extractelement <4 x i32> %vector, i32 1
+  ret i32 %d
+}
+
+define i32 @extract_v4i32_vector_insert(<4 x i32> %a, <2 x i32> %b, i32 %c) {
+  ; CHECK-LABEL: name: extract_v4i32_vector_insert
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   liveins: $d1, $q0, $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY $d1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[INSERT_SUBVECTOR:%[0-9]+]]:_(<4 x s32>) = G_INSERT_SUBVECTOR [[COPY]], [[COPY1]](<2 x s32>), 0
+  ; CHECK-NEXT:   [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[COPY2]](s32)
+  ; CHECK-NEXT:   [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[INSERT_SUBVECTOR]](<4 x s32>), [[ZEXT]](s64)
+  ; CHECK-NEXT:   $w0 = COPY [[EVEC]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+entry:
+  %vector = call <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> %a, <2 x i32> %b, i64 0)
+  %d = extractelement <4 x i32> %vector, i32 %c
+  ret i32 %d
+}
+
+define i32 @extract_v4i32_vector_extract(<4 x i32> %a, <2 x i32> %b, i32 %c) {
+  ; CHECK-LABEL: name: extract_v4i32_vector_extract
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   liveins: $d1, $q0, $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY $d1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[EXTRACT_SUBVECTOR:%[0-9]+]]:_(<4 x s32>) = G_EXTRACT_SUBVECTOR [[COPY]](<4 x s32>), 0
+  ; CHECK-NEXT:   [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[COPY2]](s32)
+  ; CHECK-NEXT:   [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[EXTRACT_SUBVECTOR]](<4 x s32>), [[ZEXT]](s64)
+  ; CHECK-NEXT:   $w0 = COPY [[EVEC]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+entry:
+  %vector = call <4 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32> %a, i64 0)
+  %d = extractelement <4 x i32> %vector, i32 %c
+  ret i32 %d
+}
+
+define i32 @extract_v4i32_vector_extract_const(<4 x i32> %a, <2 x i32> %b, i32 %c) {
+  ; CHECK-LABEL: name: extract_v4i32_vector_extract_const
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   liveins: $d1, $q0, $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY $d1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT:   [[EXTRACT_SUBVECTOR:%[0-9]+]]:_(<4 x s32>) = G_EXTRACT_SUBVECTOR [[COPY]](<4 x s32>), 0
+  ; CHECK-NEXT:   [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[EXTRACT_SUBVECTOR]](<4 x s32>), [[C]](s64)
+  ; CHECK-NEXT:   $w0 = COPY [[EVEC]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+entry:
+  %vector = call <4 x i32> @llvm.vector.extract.v2i32.v4i32(<4 x i32> %a, i64 0)
+  %d = extractelement <4 x i32> %vector, i32 0
+  ret i32 %d
+}

llvmbot (Member) commented on Sep 27, 2024, relaying the same summary and full diff as above for:

@llvm/pr-subscribers-llvm-globalisel

tschuett force-pushed the gisel-import-subvector branch 3 times, most recently from f2202d0 to f5b6a7c, on September 28, 2024 at 14:25
tschuett force-pushed the gisel-import-subvector branch from e5e78c1 to 68a0152 on September 29, 2024 at 11:30
In a review thread on the updated translator code, this context was quoted:

  }
  if (auto *InputType =
          dyn_cast<FixedVectorType>(U.getOperand(0)->getType())) {
    // We are inserting an illegal fixed vector into a fixed vector, use the
Contributor:

InputType is not used; is this missing the InputType && InputType->getNumElements() == 1 check, as above?

Author (tschuett):

Nope. It is meant to say InputType && InputType->getNumElements() != 1.

tschuett (Author):

The unused InputTypes are probably just isa tests.
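
For context, the distinction being discussed is between inserting a single-element fixed vector and a multi-element one; a minimal, hypothetical IR pair (not taken from this patch):

define <4 x i32> @single_vs_multi(<4 x i32> %a, <1 x i32> %b, <2 x i32> %c) {
  ; Single-element subvector: the case a getNumElements() != 1 guard would exclude.
  %s = call <4 x i32> @llvm.vector.insert.v4i32.v1i32(<4 x i32> %a, <1 x i32> %b, i64 0)
  ; Multi-element subvector: the case this patch lowers to G_INSERT_SUBVECTOR.
  %m = call <4 x i32> @llvm.vector.insert.v4i32.v2i32(<4 x i32> %s, <2 x i32> %c, i64 0)
  ret <4 x i32> %m
}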

arsenm (Contributor) left a comment:

LGTM, but could use some test coverage for different element classes

%vector = call <1 x i32> @llvm.vector.insert.v1i32.v4i32(<1 x i32> %a, <1 x i32> %b, i64 0)
store <1 x i32> %vector, ptr %p, align 16
ret i32 1
}
Contributor:

Can you add some tests with FP types, and vectors of pointers?

Author (tschuett):

Sure. There are some limitations with scalable vectors.
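
For reference, a rough sketch of what such additional coverage might look like (hypothetical tests, not part of this PR's diff; names and types chosen only for illustration):

define <4 x double> @insert_v4f64(<4 x double> %a, <2 x double> %b) {
  ; Floating-point element type.
  %v = call <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %a, <2 x double> %b, i64 0)
  ret <4 x double> %v
}

define <2 x ptr> @extract_v2ptr(<4 x ptr> %a) {
  ; Vector-of-pointers element type.
  %v = call <2 x ptr> @llvm.vector.extract.v2p0.v4p0(<4 x ptr> %a, i64 0)
  ret <2 x ptr> %v
}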

arsenm (Contributor) left a comment:

lgtm with nit

; CHECK-NEXT: $d0 = COPY [[EVEC]](s64)
; CHECK-NEXT: RET_ReallyLR implicit $d0
entry:
%vector = call <4 x double> @llvm.vector.insert.v4double.v2double(<4 x double> %a, <2 x double> %b, i64 0)
Contributor:

Use canonical mangling, f64 not double
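
That is, with the canonically mangled suffix the call above would read roughly:

  %vector = call <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %a, <2 x double> %b, i64 0)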

tschuett (Author):

Thanks for the review.

tschuett merged commit 53943de into llvm:main on Sep 30, 2024
8 checks passed
tschuett deleted the gisel-import-subvector branch on September 30, 2024 at 20:12
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024