Skip to content

release/18.x: [AArch64] Backport Ampere1B support (#81297 , #81341, and #81744) #81857

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 27, 2024

Conversation

llvmbot
Copy link
Member

@llvmbot llvmbot commented Feb 15, 2024

Backport a52eea6 fbba818 dd1897c 3369e34

Requested by: @ptomsich

@llvmbot llvmbot added this to the LLVM 18.X Release milestone Feb 15, 2024
@llvmbot
Copy link
Member Author

llvmbot commented Feb 15, 2024

@DavidSpickett @jthackray @davemgreen @davemgreen What do you think about merging this PR to the release branch?

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Feb 15, 2024
@llvmbot llvmbot requested a review from davemgreen February 15, 2024 13:20
@llvmbot llvmbot added clang:frontend Language frontend issues, e.g. anything involving "Sema" mc Machine (object) code labels Feb 15, 2024
@llvmbot
Copy link
Member Author

llvmbot commented Feb 15, 2024

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)

Changes

Backport a52eea6 fbba818 dd1897c 3369e34

Requested by: @ptomsich


Patch is 631.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/81857.diff

40 Files Affected:

  • (modified) clang/lib/Basic/Targets/AArch64.cpp (-1)
  • (modified) clang/test/CodeGen/aarch64-targetattr.c (+5-5)
  • (modified) clang/test/Driver/aarch64-cssc.c (+1)
  • (modified) clang/test/Misc/target-invalid-cpu-note.c (+2-2)
  • (modified) clang/test/Preprocessor/aarch64-target-features.c (+21-14)
  • (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+7-1)
  • (modified) llvm/lib/Target/AArch64/AArch64.td (+27)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedA53.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedA57.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedA64FX.td (+2-1)
  • (added) llvm/lib/Target/AArch64/AArch64SchedAmpere1B.td (+1149)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedCyclone.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedExynosM3.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedExynosM4.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedExynosM5.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedFalkor.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedKryo.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedNeoverseN1.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedNeoverseV1.td (+2-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td (+2-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedTSV110.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedThunderX.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedThunderX2T99.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SchedThunderX3T110.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+1)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.h (+1)
  • (modified) llvm/lib/TargetParser/Host.cpp (+1)
  • (modified) llvm/test/CodeGen/AArch64/cpus.ll (+1)
  • (modified) llvm/test/CodeGen/AArch64/neon-dot-product.ll (+1)
  • (modified) llvm/test/CodeGen/AArch64/remat.ll (+1)
  • (modified) llvm/test/MC/AArch64/armv8.2a-dotprod.s (+3)
  • (modified) llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt (+1)
  • (added) llvm/test/tools/llvm-mca/AArch64/Ampere/Ampere1B/basic-instructions.s (+3724)
  • (added) llvm/test/tools/llvm-mca/AArch64/Ampere/Ampere1B/cssc-instructions.s (+76)
  • (added) llvm/test/tools/llvm-mca/AArch64/Ampere/Ampere1B/mte-instructions.s (+349)
  • (added) llvm/test/tools/llvm-mca/AArch64/Ampere/Ampere1B/neon-instructions.s (+3235)
  • (added) llvm/test/tools/llvm-mca/AArch64/Ampere/Ampere1B/shifted-register.s (+31)
  • (modified) llvm/unittests/TargetParser/Host.cpp (+3)
  • (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+46-22)
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 336b7a5e3d727d..72270167118390 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -258,7 +258,6 @@ void AArch64TargetInfo::getTargetDefinesARMV83A(const LangOptions &Opts,
                                                 MacroBuilder &Builder) const {
   Builder.defineMacro("__ARM_FEATURE_COMPLEX", "1");
   Builder.defineMacro("__ARM_FEATURE_JCVT", "1");
-  Builder.defineMacro("__ARM_FEATURE_PAUTH", "1");
   // Also include the Armv8.2 defines
   getTargetDefinesARMV82A(Opts, Builder);
 }
diff --git a/clang/test/CodeGen/aarch64-targetattr.c b/clang/test/CodeGen/aarch64-targetattr.c
index 02da18264da0a3..1a3a84a73dbad1 100644
--- a/clang/test/CodeGen/aarch64-targetattr.c
+++ b/clang/test/CodeGen/aarch64-targetattr.c
@@ -97,19 +97,19 @@ void minusarch() {}
 // CHECK: attributes #0 = { {{.*}} "target-features"="+crc,+fp-armv8,+lse,+neon,+ras,+rdm,+v8.1a,+v8.2a,+v8a" }
 // CHECK: attributes #1 = { {{.*}} "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
 // CHECK: attributes #2 = { {{.*}} "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
-// CHECK: attributes #3 = { {{.*}} "target-features"="+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+ras,+rcpc,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
+// CHECK: attributes #3 = { {{.*}} "target-features"="+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+ras,+rcpc,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
 // CHECK: attributes #4 = { {{.*}} "target-cpu"="cortex-a710" "target-features"="+bf16,+complxnum,+crc,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+ras,+rcpc,+rdm,+sb,+sve,+sve2,+sve2-bitperm" }
 // CHECK: attributes #5 = { {{.*}} "tune-cpu"="cortex-a710" }
 // CHECK: attributes #6 = { {{.*}} "target-cpu"="generic" }
 // CHECK: attributes #7 = { {{.*}} "tune-cpu"="generic" }
 // CHECK: attributes #8 = { {{.*}} "target-cpu"="neoverse-n1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs" "tune-cpu"="cortex-a710" }
 // CHECK: attributes #9 = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+sve" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #10 = { {{.*}} "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2" }
-// CHECK: attributes #11 = { {{.*}} "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,-sve" }
+// CHECK: attributes #10 = { {{.*}} "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2" }
+// CHECK: attributes #11 = { {{.*}} "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,-sve" }
 // CHECK: attributes #12 = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+sve" }
 // CHECK: attributes #13 = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+sve,-sve2" }
 // CHECK: attributes #14 = { {{.*}} "target-features"="+fullfp16" }
-// CHECK: attributes #15 = { {{.*}} "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #16 = { {{.*}} "branch-target-enforcement"="true" "guarded-control-stack"="true" {{.*}} "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #15 = { {{.*}} "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #16 = { {{.*}} "branch-target-enforcement"="true" "guarded-control-stack"="true" {{.*}} "target-features"="+aes,+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
 // CHECK: attributes #17 = { {{.*}} "target-features"="-neon" }
 // CHECK: attributes #18 = { {{.*}} "target-features"="-v9.3a" }
diff --git a/clang/test/Driver/aarch64-cssc.c b/clang/test/Driver/aarch64-cssc.c
index a3e18663279bbd..5df0ea79d7c850 100644
--- a/clang/test/Driver/aarch64-cssc.c
+++ b/clang/test/Driver/aarch64-cssc.c
@@ -9,6 +9,7 @@
 // RUN: %clang -S -o - -emit-llvm --target=aarch64-none-elf -march=armv9.4-a        %s 2>&1 | FileCheck %s
 // RUN: %clang -S -o - -emit-llvm --target=aarch64-none-elf -march=armv9.4-a+cssc   %s 2>&1 | FileCheck %s
 // RUN: %clang -S -o - -emit-llvm --target=aarch64-none-elf -march=armv9.4-a+nocssc %s 2>&1 | FileCheck %s --check-prefix=NO_CSSC
+// RUN: %clang -S -o - -emit-llvm --target=aarch64-none-elf -mcpu=ampere1b          %s 2>&1 | FileCheck %s
 
 // CHECK: "target-features"="{{.*}},+cssc
 // NO_CSSC: "target-features"="{{.*}},-cssc
diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c
index 2f10bfb1fd82fe..39ed02f50950dd 100644
--- a/clang/test/Misc/target-invalid-cpu-note.c
+++ b/clang/test/Misc/target-invalid-cpu-note.c
@@ -5,11 +5,11 @@
 
 // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64
 // AARCH64: error: unknown target CPU 'not-a-cpu'
-// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}}
+// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, ampere1b, cobalt-100, grace{{$}}
 
 // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64
 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu'
-// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}}
+// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, ampere1b, cobalt-100, grace{{$}}
 
 // RUN: not %clang_cc1 -triple i386--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix X86
 // X86: error: unknown target CPU 'not-a-cpu'
diff --git a/clang/test/Preprocessor/aarch64-target-features.c b/clang/test/Preprocessor/aarch64-target-features.c
index 9914775097e579..1e9aec3fdf2373 100644
--- a/clang/test/Preprocessor/aarch64-target-features.c
+++ b/clang/test/Preprocessor/aarch64-target-features.c
@@ -318,15 +318,15 @@
 // CHECK-MCPU-APPLE-A7: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8a" "-target-feature" "+aes"{{.*}} "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-APPLE-A10: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8a" "-target-feature" "+aes"{{.*}} "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-APPLE-A11: "-cc1"{{.*}} "-triple" "aarch64{{.*}}"{{.*}}"-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.2a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
-// CHECK-MCPU-APPLE-A12: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.3a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
+// CHECK-MCPU-APPLE-A12: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.3a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+pauth" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-A34: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
-// CHECK-MCPU-APPLE-A13: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "apple-a13" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.4a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+sha3" "-target-feature" "+neon"
+// CHECK-MCPU-APPLE-A13: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "apple-a13" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.4a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+pauth" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+sha3" "-target-feature" "+neon"
 // CHECK-MCPU-A35: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-A53: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-A57: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-A72: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-CORTEX-A73: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
-// CHECK-MCPU-CORTEX-R82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8r" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sb" "-target-feature" "+neon" "-target-feature" "+ssbs"
+// CHECK-MCPU-CORTEX-R82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8r" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+pauth" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sb" "-target-feature" "+neon" "-target-feature" "+ssbs"
 // CHECK-MCPU-M3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-M4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
 // CHECK-MCPU-KRYO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+sha2" "-target-feature" "+neon"
@@ -335,10 +335,10 @@
 // CHECK-MCPU-CARMEL: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
 
 // RUN: %clang -target x86_64-apple-macosx -arch arm64 -### -c %s 2>&1 | FileCheck --check-prefix=CHECK-ARCH-ARM64 %s
-// CHECK-ARCH-ARM64: "-target-cpu" "apple-m1" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.5a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+sha3" "-target-feature" "+neon"
+// CHECK-ARCH-ARM64: "-target-cpu" "apple-m1" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.5a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+dotprod" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+fp16fml" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+pauth" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+sha3" "-target-feature" "+neon"
 
 // RUN: %clang -target x86_64-apple-macosx -arch arm64_32 -### -c %s 2>&1 | FileCheck --check-prefix=CHECK-ARCH-ARM64_32 %s
-// CHECK-ARCH-ARM64_32: "-target-cpu" "apple-s4" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.3a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
+// CHECK-ARCH-ARM64_32: "-target-cpu" "apple-s4" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8.3a" "-target-feature" "+aes" "-target-feature" "+crc" "-target-feature" "+complxnum" "-target-feature" "+fp-armv8" "-target-feature" "+fullfp16" "-target-feature" "+jsconv" "-target-feature" "+lse" "-target-feature" "+pauth" "-target-feature" "+ras" "-target-feature" "+rcpc" "-target-feature" "+rdm" "-target-feature" "+sha2" "-target-feature" "+neon"
 
 // RUN: %clang -target aarch64 -march=armv8-a+fp+simd+crc+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MARCH-1 %s
 // RUN: %clang -target aarch64 -march=armv8-a+nofp+nosimd+nocrc+nocrypto+fp+simd+crc+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MARCH-1 %s
@@ -501,9 +501,10 @@
 // CHECK-MEMTAG: __ARM_FEATURE_MEMORY_TAGGING 1
 
 // ================== Check Pointer Authentication Extension (PAuth).
-// RUN: %clang -target arm64-none-linux-gnu -march=armv8-a -x c -E -dM %s -o - | FileCheck -check-prefix=CHECK-PAUTH-OFF %s
-// RUN: %clang -target arm64-none-linux-gnu -march=armv8.5-a -x c -E -dM %s -o - | FileCheck -check-prefix=CHECK-PAUTH-OFF %s
-// RUN: %clang -target arm64-none-linux-gnu -march=armv8-a+pauth -mbranch-protection=none -x c -E -dM %s -o - | FileCheck -check-prefix=CHECK-PAUTH-ON %s
+// RUN: %clang -target arm64-none-linux-gnu -march=armv8-a -x c -E -dM %s -o - | FileCheck -check-prefixes=CHECK-PAUTH-OFF,CHECK-CPU-NOPAUTH %s
+// RUN: %clang -target arm64-none-linux-gnu -march=armv8.5-a+nopauth -x c -E -dM %s -o - | FileCheck -check-prefixes=CHECK-PAUTH-OFF,CHECK-CPU-NOPAUTH %s
+// RUN: %clang -target arm64-none-linux-gnu -march=armv8.5-a -x c -E -dM %s -o - | FileCheck -check-prefixes=CHECK-PAUTH-OFF,CHECK-CPU-PAUTH %s
+// RUN: %clang -target arm64-none-linux-gnu -march=armv8-a+pauth -mbranch-protection=none -x c -E -dM %s -o - | FileCheck -check-prefixes=CHECK-PAUTH-OFF,CHECK-CPU-PAUTH %s
 // RUN: %clang -target arm64-none-linux-gnu -march=armv8-a -mbranch-protection=none -x c -E -dM %s -o - | FileCheck -check-prefix=CHECK-PAUTH-...
[truncated]

@ptomsich ptomsich changed the title release/18.x: [NFC][AArch64] fix whitespace in AArch64SchedNeoverseV1 (#81744) release/18.x: Backport Ampere1B support (#81744) Feb 15, 2024
@ptomsich ptomsich changed the title release/18.x: Backport Ampere1B support (#81744) release/18.x: Backport Ampere1B support (#81297 , #81341, and #81744) Feb 15, 2024
@ptomsich ptomsich changed the title release/18.x: Backport Ampere1B support (#81297 , #81341, and #81744) release/18.x: [AArch64] Backport Ampere1B support (#81297 , #81341, and #81744) Feb 16, 2024
@davemgreen
Copy link
Collaborator

This is a fairly big patch to backport. The ampere1b changes should be safe enough considering as it just adds support for an extra CPU. There is also the change from #78027 added for changing how PAUTH is enabled.

@atrosinenko @DavidSpickett do you think that is OK to backport to LLVM 18?

@ptomsich
Copy link
Contributor

This is a fairly big patch to backport. The ampere1b changes should be safe enough considering as it just adds support for an extra CPU. There is also the change from #78027 added for changing how PAUTH is enabled.

We can drop the dependency on #78027, if we modify the ampere1b enablement to explicitly add the PAUTH as part of the change in llvm/include/llvm/TargetParser/AArch64TargetParser.h; however this would make the change on the release/18.x branch have the extra PAUTH and not be a straight cherry-pick. Our rationale (so far) has been to include the dependency to make sure that this become a pure cherry-pick.

@tstellar
Copy link
Collaborator

Is this ready to merge?

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe considering what this changes it should be OK. LGTM

@AaronBallman
Copy link
Collaborator

Is this fixing a regression introduced in Clang 18? I'm wondering why the backport is needed in the first place, as this seems to be making potentially significant changes during the RC ("Make +pauth enabled in Armv8.3-a by default").

@davemgreen
Copy link
Collaborator

Is this fixing a regression introduced in Clang 18? I'm wondering why the backport is needed in the first place, as this seems to be making potentially significant changes during the RC ("Make +pauth enabled in Armv8.3-a by default").

It is adding new CPU support to clang 18, specifically the ampere-1b. I'm not sure what is considered acceptable at this stage. It is up to the release maintainers whether they want to accept it. On a technical level I believe it should be OK, but if only regressions are being fixed at this stage then it might be better to wait for clang-19. I'm not sure how much @ptomsich wanted to get this into clang-18?

@ptomsich
Copy link
Contributor

Is this fixing a regression introduced in Clang 18? I'm wondering why the backport is needed in the first place, as this seems to be making potentially significant changes during the RC ("Make +pauth enabled in Armv8.3-a by default").

It is adding new CPU support to clang 18, specifically the ampere-1b. I'm not sure what is considered acceptable at this stage. It is up to the release maintainers whether they want to accept it. On a technical level I believe it should be OK, but if only regressions are being fixed at this stage then it might be better to wait for clang-19. I'm not sure how much @ptomsich wanted to get this into clang-18?

Getting this into clang-18 is highly desirable to ensure it recognizes "-mcpu=ampere1b" and emits the CSSC instructions for the core. I would prefer not to wait for clang-19, even though we ended up late in the cycle.

@ptomsich
Copy link
Contributor

On a different note: the "+pauth" fix would be helpful in clang-18 as well, given that it brings the definition of "-march=armv8.3-a" in sync with the specification.

@tstellar
Copy link
Collaborator

On a different note: the "+pauth" fix would be helpful in clang-18 as well, given that it brings the definition of "-march=armv8.3-a" in sync with the specification.

OK, so sounds like the +pauth change is separate but included in this PR for convenience.

@ptomsich
Copy link
Contributor

On a different note: the "+pauth" fix would be helpful in clang-18 as well, given that it brings the definition of "-march=armv8.3-a" in sync with the specification.

OK, so sounds like the +pauth change is separate but included in this PR for convenience.

@tstellar The +pauth change was included to allow a straight cherry-pick of the Ampere1B changes. Without, we need to modify one patch (or'ing in the PAUTH bit) from what was committed onto main.

@tstellar
Copy link
Collaborator

I'm OK with this kind of change if the AArch64 maintainers are on board. @AaronBallman Do you have a strong objection to this PR?

@AaronBallman
Copy link
Collaborator

I'm OK with this kind of change if the AArch64 maintainers are on board. @AaronBallman Do you have a strong objection to this PR?

We don't typically backport features unless there's some strongly compelling case. This sounds like a nice-to-have but I'm not certain why it can't wait for Clang 19 (and have more in-tree time to bake), especially since rc3 is already out the door: https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules

@AaronBallman
Copy link
Collaborator

I'm OK with this kind of change if the AArch64 maintainers are on board. @AaronBallman Do you have a strong objection to this PR?

We don't typically backport features unless there's some strongly compelling case. This sounds like a nice-to-have but I'm not certain why it can't wait for Clang 19 (and have more in-tree time to bake), especially since rc3 is already out the door: https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules

To be clear, I'm not strongly opposed, I'm just trying to understand whether this meets our usual criteria and if it doesn't, what about this situation warrants an exception.

@tstellar
Copy link
Collaborator

I've decided to merge this. A feature like this that is self-contained in a single backend is inline with the kind of changes I've merged before during the RC phase. If this were some other feature, with a wider impact, it probably wouldn't be accepted.

atrosinenko and others added 4 commits February 26, 2024 17:38
Add AEK_PAUTH to ARMV8_3A in TargetParser and let it propagate to
ARMV8R, as it aligns with GCC defaults.

After adding AEK_PAUTH, several tests from TargetParserTest.cpp crashed
when trying to format an error message, thus update a format string in
AssertSameExtensionFlags to account for bitmask being pre-formatted as
std::string.

The CHECK-PAUTH* lines in aarch64-target-features.c are updated to
account for the fact that FEAT_PAUTH support and pac-ret can be enabled
independently and all four combinations are possible.

(cherry picked from commit a52eea6)
The Ampere1B is Ampere's third-generation core implementing a
superscalar, out-of-order microarchitecture with nested virtualization,
speculative side-channel mitigation and architectural support for
defense against ROP/JOP style software attacks.

Ampere1B is an ARMv8.7+ implementation, adding support for the FEAT
WFxT, FEAT CSSC, FEAT PAN3 and FEAT AFP extensions. It also includes all
features of the second-generation Ampere1A, such as the Memory Tagging
Extension and SM3/SM4 cryptography instructions.

(cherry picked from commit fbba818)
The Ampere1B core is enabled with a new scheduling/pipeline model, as it
provides significant updates over the Ampere1 core; it reduces latencies
on many instructions, has some micro-ops reassigned between the XY and X
units, and provides modelling for the instructions added since Ampere1
and Ampere1A.

As this is the first model implementing the CSSC instructions, we update
the UnsupportedFeatures on all other models (that have CompleteModel
set).

Testcases are added under llvm-mca: these showed the FullFP16 feature
missing, so we are adding it in as part of this commit.

This *adds tests and additional fixes* compared to the reverted llvm#81338.

(cherry picked from commit dd1897c)
One of the whitespace fixes didn't get added to the commit introducing
the Ampere1B model.
Clean it up.

(cherry picked from commit 3369e34)
@tstellar tstellar merged commit 6d8f929 into llvm:release/18.x Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category mc Machine (object) code
Projects
Development

Successfully merging this pull request may close these issues.

6 participants