[M68k] Add remaining addressing modes for Atomic operations #115523

Merged
merged 2 commits into from
Dec 12, 2024

knickish
Contributor

@knickish knickish commented Nov 8, 2024

I had been doing this piece by piece, but it makes more sense to do it in a single PR. This adds support for the ARID, PCI, PCD, AL, and ARD addressing modes for atomic operations, along with a variety of tests.

The CodeModel tests have been rearranged, as some of the new addressing modes are only exercised under certain combinations of code model and relocation model.

github-actions bot commented Nov 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from b71aa9f to e5a800c Compare November 8, 2024 18:13
@knickish knickish changed the title M68k atomic addr modes for merge [M68k] Add remaining addressing modes for Atomic operations Nov 8, 2024
@knickish knickish marked this pull request as ready for review November 8, 2024 18:19
@llvmbot
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-m68k

Author: None (knickish)

Patch is 134.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115523.diff

23 Files Affected:

  • (modified) llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp (+7-1)
  • (modified) llvm/lib/Target/M68k/M68kISelDAGToDAG.cpp (+16-1)
  • (modified) llvm/lib/Target/M68k/M68kInstrAtomics.td (+139-7)
  • (modified) llvm/test/CodeGen/M68k/Atomics/load-store.ll (+48)
  • (modified) llvm/test/CodeGen/M68k/Atomics/rmw.ll (+141)
  • (added) llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/cmpxchg.ll (+317)
  • (added) llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/fence.ll (+41)
  • (added) llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/load-store.ll (+1161)
  • (added) llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/rmw.ll (+1390)
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Large/large-pic.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Large/large-pie-global-access.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Large/large-pie.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Large/large-static.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Medium/medium-pic.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Medium/medium-pie-global-access.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Medium/medium-pie.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Medium/medium-static.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Small/small-pic.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Small/small-pie-global-access.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Small/small-pie.ll ()
  • (renamed) llvm/test/CodeGen/M68k/CodeModel/Small/small-static.ll ()
  • (added) llvm/test/CodeGen/M68k/Control/non-cmov-switch.ll (+126)
  • (added) llvm/test/CodeGen/M68k/TLS/tls-arid.ll (+19)
diff --git a/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp b/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
index 7f0f737faccd0d..ce069ced66579a 100644
--- a/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
+++ b/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
@@ -19,8 +19,8 @@
 
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCContext.h"
-#include "llvm/MC/MCDisassembler/MCDisassembler.h"
 #include "llvm/MC/MCDecoderOps.h"
+#include "llvm/MC/MCDisassembler/MCDisassembler.h"
 #include "llvm/MC/MCInst.h"
 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Support/Endian.h"
@@ -83,6 +83,12 @@ static DecodeStatus DecodeXR32RegisterClass(MCInst &Inst, uint64_t RegNo,
   return DecodeRegisterClass(Inst, RegNo, Address, Decoder);
 }
 
+static DecodeStatus DecodeXR32RegisterClass(MCInst &Inst, APInt RegNo,
+                                            uint64_t Address,
+                                            const void *Decoder) {
+  return DecodeRegisterClass(Inst, RegNo.getZExtValue(), Address, Decoder);
+}
+
 static DecodeStatus DecodeXR16RegisterClass(MCInst &Inst, uint64_t RegNo,
                                             uint64_t Address,
                                             const void *Decoder) {
diff --git a/llvm/lib/Target/M68k/M68kISelDAGToDAG.cpp b/llvm/lib/Target/M68k/M68kISelDAGToDAG.cpp
index f496085c88356a..2bb674cf4bacf4 100644
--- a/llvm/lib/Target/M68k/M68kISelDAGToDAG.cpp
+++ b/llvm/lib/Target/M68k/M68kISelDAGToDAG.cpp
@@ -708,6 +708,20 @@ bool M68kDAGToDAGISel::SelectARIPD(SDNode *Parent, SDValue N, SDValue &Base) {
   return false;
 }
 
+static bool allowARIDWithDisp(SDNode *Parent) {
+  if (!Parent)
+    return false;
+  switch (Parent->getOpcode()) {
+  case ISD::LOAD:
+  case ISD::STORE:
+  case ISD::ATOMIC_LOAD:
+  case ISD::ATOMIC_STORE:
+    return true;
+  default:
+    return false;
+  }
+}
+
 bool M68kDAGToDAGISel::SelectARID(SDNode *Parent, SDValue N, SDValue &Disp,
                                   SDValue &Base) {
   LLVM_DEBUG(dbgs() << "Selecting AddrType::ARID: ");
@@ -740,7 +754,8 @@ bool M68kDAGToDAGISel::SelectARID(SDNode *Parent, SDValue N, SDValue &Disp,
   Base = AM.BaseReg;
 
   if (getSymbolicDisplacement(AM, SDLoc(N), Disp)) {
-    assert(!AM.Disp && "Should not be any displacement");
+    assert((!AM.Disp || allowARIDWithDisp(Parent)) &&
+           "Should not be any displacement");
     LLVM_DEBUG(dbgs() << "SUCCESS, matched Symbol\n");
     return true;
   }
diff --git a/llvm/lib/Target/M68k/M68kInstrAtomics.td b/llvm/lib/Target/M68k/M68kInstrAtomics.td
index 9203a3ef4ed093..a2ccd88573f4bd 100644
--- a/llvm/lib/Target/M68k/M68kInstrAtomics.td
+++ b/llvm/lib/Target/M68k/M68kInstrAtomics.td
@@ -13,6 +13,15 @@ foreach size = [8, 16, 32] in {
   def : Pat<(!cast<SDPatternOperator>("atomic_load_"#size) MxCP_ARII:$ptr),
             (!cast<MxInst>("MOV"#size#"df") !cast<MxMemOp>("MxARII"#size):$ptr)>;
 
+  def : Pat<(!cast<SDPatternOperator>("atomic_load_"#size) MxCP_ARID:$ptr),
+            (!cast<MxInst>("MOV"#size#"dp") !cast<MxMemOp>("MxARID"#size):$ptr)>;
+
+  def : Pat<(!cast<SDPatternOperator>("atomic_load_"#size) MxCP_PCD:$ptr),
+            (!cast<MxInst>("MOV"#size#"dq") !cast<MxMemOp>("MxPCD"#size):$ptr)>;
+
+  def : Pat<(!cast<SDPatternOperator>("atomic_load_"#size) MxCP_PCI:$ptr),
+            (!cast<MxInst>("MOV"#size#"dk") !cast<MxMemOp>("MxPCI"#size):$ptr)>;
+
   def : Pat<(!cast<SDPatternOperator>("atomic_store_"#size) !cast<MxRegOp>("MxDRD"#size):$val, MxCP_ARI:$ptr),
             (!cast<MxInst>("MOV"#size#"jd") !cast<MxMemOp>("MxARI"#size):$ptr,
                                             !cast<MxRegOp>("MxDRD"#size):$val)>;
@@ -20,10 +29,22 @@ foreach size = [8, 16, 32] in {
   def : Pat<(!cast<SDPatternOperator>("atomic_store_"#size) !cast<MxRegOp>("MxDRD"#size):$val, MxCP_ARII:$ptr),
             (!cast<MxInst>("MOV"#size#"fd") !cast<MxMemOp>("MxARII"#size):$ptr,
                                             !cast<MxRegOp>("MxDRD"#size):$val)>;
+
+  def : Pat<(!cast<SDPatternOperator>("atomic_store_"#size) !cast<MxRegOp>("MxDRD"#size):$val, MxCP_ARID:$ptr),
+            (!cast<MxInst>("MOV"#size#"pd") !cast<MxMemOp>("MxARID"#size):$ptr,
+                                            !cast<MxRegOp>("MxDRD"#size):$val)>;
+
+  def : Pat<(!cast<SDPatternOperator>("atomic_store_"#size) !cast<MxRegOp>("MxDRD"#size):$val, MxCP_PCD:$ptr),
+            (!cast<MxInst>("MOV"#size#"qd") !cast<MxMemOp>("MxPCD"#size):$ptr,
+                                            !cast<MxRegOp>("MxDRD"#size):$val)>;                                   
+
+  def : Pat<(!cast<SDPatternOperator>("atomic_store_"#size) !cast<MxRegOp>("MxDRD"#size):$val, MxCP_PCI:$ptr),
+            (!cast<MxInst>("MOV"#size#"kd") !cast<MxMemOp>("MxPCI"#size):$ptr,
+                                            !cast<MxRegOp>("MxDRD"#size):$val)>;                               
 }
 
 let Predicates = [AtLeastM68020] in {
-class MxCASOp<bits<2> size_encoding, MxType type>
+class MxCASARIOp<bits<2> size_encoding, MxType type>
     : MxInst<(outs type.ROp:$out),
              (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxARI"#type.Size):$mem),
              "cas."#type.Prefix#" $dc, $du, $mem"> {
@@ -36,17 +57,128 @@ class MxCASOp<bits<2> size_encoding, MxType type>
   let mayStore = 1;
 }
 
-def CAS8  : MxCASOp<0x1, MxType8d>;
-def CAS16 : MxCASOp<0x2, MxType16d>;
-def CAS32 : MxCASOp<0x3, MxType32d>;
+def CASARI8  : MxCASARIOp<0x1, MxType8d>;
+def CASARI16 : MxCASARIOp<0x2, MxType16d>;
+def CASARI32 : MxCASARIOp<0x3, MxType32d>;
+
+class MxCASARIDOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxARID"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_p<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
+}
+
+def CASARID8  : MxCASARIDOp<0x1, MxType8d>;
+def CASARID16 : MxCASARIDOp<0x2, MxType16d>;
+def CASARID32 : MxCASARIDOp<0x3, MxType32d>;
+
+class MxCASARIIOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxARII"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_f<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
+}
+
+def CASARII8  : MxCASARIIOp<0x1, MxType8d>;
+def CASARII16 : MxCASARIIOp<0x2, MxType16d>;
+def CASARII32 : MxCASARIIOp<0x3, MxType32d>;
 
+class MxCASPCIOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxPCI"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_k<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
+}
 
+def CASPCI8  : MxCASPCIOp<0x1, MxType8d>;
+def CASPCI16 : MxCASPCIOp<0x2, MxType16d>;
+def CASPCI32 : MxCASPCIOp<0x3, MxType32d>;
+
+class MxCASPCDOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxPCD"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_q<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
+}
+
+def CASPCD8  : MxCASPCDOp<0x1, MxType8d>;
+def CASPCD16 : MxCASPCDOp<0x2, MxType16d>;
+def CASPCD32 : MxCASPCDOp<0x3, MxType32d>;
+
+class MxCASALOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxMemOp>("MxAL"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_abs<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
+}
+
+def CASAL8  : MxCASALOp<0x1, MxType8d>;
+def CASAL16 : MxCASALOp<0x2, MxType16d>;
+def CASAL32 : MxCASALOp<0x3, MxType32d>;
+
+foreach mode = ["ARI", "ARII", "ARID", "PCI", "PCD", "AL"] in {
 foreach size = [8, 16, 32] in {
-  def : Pat<(!cast<SDPatternOperator>("atomic_cmp_swap_i"#size) MxCP_ARI:$ptr,
+  def : Pat<(!cast<SDPatternOperator>("atomic_cmp_swap_i"#size) !cast<ComplexPattern>("MxCP_"#mode):$ptr,
                                                                 !cast<MxRegOp>("MxDRD"#size):$cmp,
                                                                 !cast<MxRegOp>("MxDRD"#size):$new),
-            (!cast<MxInst>("CAS"#size) !cast<MxRegOp>("MxDRD"#size):$cmp,
+            (!cast<MxInst>("CAS"#mode#size) !cast<MxRegOp>("MxDRD"#size):$cmp,
                                        !cast<MxRegOp>("MxDRD"#size):$new,
-                                       !cast<MxMemOp>("MxARI"#size):$ptr)>;
+                                       !cast<MxMemOp>("Mx"#mode#size):$ptr)>;
+} // size
+} // addr mode
+
+class MxCASARDOp<bits<2> size_encoding, MxType type>
+    : MxInst<(outs type.ROp:$out),
+             (ins type.ROp:$dc, type.ROp:$du, !cast<MxRegOp>("MxARD"#type.Size):$mem),
+             "cas."#type.Prefix#" $dc, $du, $mem"> {
+  let Inst = (ascend
+                (descend 0b00001, size_encoding, 0b011, MxEncAddrMode_a<"mem">.EA),
+                (descend 0b0000000, (operand "$du", 3), 0b000, (operand "$dc", 3))
+              );
+  let Constraints = "$out = $dc";
+  let mayLoad = 1;
+  let mayStore = 1;
 }
+
+def CASARD16 : MxCASARDOp<0x2, MxType16a>;
+def CASARD32 : MxCASARDOp<0x3, MxType32a>;
+
+foreach size = [16, 32] in {
+  def : Pat<(!cast<SDPatternOperator>("atomic_cmp_swap_i"#size) !cast<MxRegOp>("MxARD"#size):$ptr,
+                                                                !cast<MxRegOp>("MxDRD"#size):$cmp,
+                                                                !cast<MxRegOp>("MxDRD"#size):$new),
+            (!cast<MxInst>("CASARD"#size) !cast<MxRegOp>("MxDRD"#size):$cmp,
+                                       !cast<MxRegOp>("MxDRD"#size):$new,
+                                       !cast<MxRegOp>("MxARD"#size):$ptr)>;
+} // size
 } // let Predicates = [AtLeastM68020]
diff --git a/llvm/test/CodeGen/M68k/Atomics/load-store.ll b/llvm/test/CodeGen/M68k/Atomics/load-store.ll
index 23fdfad05cab5d..c00a1faf2634b4 100644
--- a/llvm/test/CodeGen/M68k/Atomics/load-store.ll
+++ b/llvm/test/CodeGen/M68k/Atomics/load-store.ll
@@ -604,3 +604,51 @@ define void @atomic_store_i64_seq_cst(ptr %a, i64 %val) nounwind {
   store atomic i64 %val, ptr %a seq_cst, align 8
   ret void
 }
+
+define void @store_arid(ptr nonnull align 4 %a) {
+; NO-ATOMIC-LABEL: store_arid:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    moveq #1, %d0
+; NO-ATOMIC-NEXT:    move.l (4,%sp), %a0
+; NO-ATOMIC-NEXT:    move.l %d0, (32,%a0)
+; NO-ATOMIC-NEXT:    rts
+;
+; ATOMIC-LABEL: store_arid:
+; ATOMIC:         .cfi_startproc
+; ATOMIC-NEXT:  ; %bb.0: ; %start
+; ATOMIC-NEXT:    moveq #1, %d0
+; ATOMIC-NEXT:    move.l (4,%sp), %a0
+; ATOMIC-NEXT:    move.l %d0, (32,%a0)
+; ATOMIC-NEXT:    rts
+start:
+  %1 = getelementptr inbounds i32, ptr %a, i32 8
+  store atomic i32 1, ptr %1 seq_cst, align 4
+  br label %exit
+
+exit:                                              ; preds = %start
+  ret void
+}
+
+define i32 @load_arid(ptr nonnull align 4 %a) {
+; NO-ATOMIC-LABEL: load_arid:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    move.l (4,%sp), %a0
+; NO-ATOMIC-NEXT:    move.l (32,%a0), %d0
+; NO-ATOMIC-NEXT:    rts
+;
+; ATOMIC-LABEL: load_arid:
+; ATOMIC:         .cfi_startproc
+; ATOMIC-NEXT:  ; %bb.0: ; %start
+; ATOMIC-NEXT:    move.l (4,%sp), %a0
+; ATOMIC-NEXT:    move.l (32,%a0), %d0
+; ATOMIC-NEXT:    rts
+start:
+  %1 = getelementptr inbounds i32, ptr %a, i32 8
+  %2 = load atomic i32, ptr %1 seq_cst, align 4
+  br label %exit
+
+exit:                                              ; preds = %start
+  ret i32 %2
+}
diff --git a/llvm/test/CodeGen/M68k/Atomics/rmw.ll b/llvm/test/CodeGen/M68k/Atomics/rmw.ll
index ce456f0960eec1..a277b8fe72ae47 100644
--- a/llvm/test/CodeGen/M68k/Atomics/rmw.ll
+++ b/llvm/test/CodeGen/M68k/Atomics/rmw.ll
@@ -588,3 +588,144 @@ entry:
   %old = atomicrmw xchg ptr %ptr, i32 %val monotonic
   ret i32 %old
 }
+
+define i8 @atomicrmw_sub_i8_arid(ptr align 2 %self) {
+; NO-ATOMIC-LABEL: atomicrmw_sub_i8_arid:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    suba.l #12, %sp
+; NO-ATOMIC-NEXT:    .cfi_def_cfa_offset -16
+; NO-ATOMIC-NEXT:    move.l (16,%sp), %a0
+; NO-ATOMIC-NEXT:    move.l (%a0), %d0
+; NO-ATOMIC-NEXT:    add.l #4, %d0
+; NO-ATOMIC-NEXT:    move.l %d0, (%sp)
+; NO-ATOMIC-NEXT:    move.l #1, (4,%sp)
+; NO-ATOMIC-NEXT:    jsr __sync_fetch_and_sub_1
+; NO-ATOMIC-NEXT:    adda.l #12, %sp
+; NO-ATOMIC-NEXT:    rts
+;
+; ATOMIC-LABEL: atomicrmw_sub_i8_arid:
+; ATOMIC:         .cfi_startproc
+; ATOMIC-NEXT:  ; %bb.0: ; %start
+; ATOMIC-NEXT:    suba.l #4, %sp
+; ATOMIC-NEXT:    .cfi_def_cfa_offset -8
+; ATOMIC-NEXT:    movem.l %d2, (0,%sp) ; 8-byte Folded Spill
+; ATOMIC-NEXT:    move.l (8,%sp), %a0
+; ATOMIC-NEXT:    move.l (%a0), %a0
+; ATOMIC-NEXT:    move.b (4,%a0), %d1
+; ATOMIC-NEXT:    move.b %d1, %d0
+; ATOMIC-NEXT:  .LBB12_1: ; %atomicrmw.start
+; ATOMIC-NEXT:    ; =>This Inner Loop Header: Depth=1
+; ATOMIC-NEXT:    move.b %d1, %d2
+; ATOMIC-NEXT:    add.b #-1, %d2
+; ATOMIC-NEXT:    cas.b %d0, %d2, (4,%a0)
+; ATOMIC-NEXT:    move.b %d0, %d2
+; ATOMIC-NEXT:    sub.b %d1, %d2
+; ATOMIC-NEXT:    seq %d1
+; ATOMIC-NEXT:    sub.b #1, %d1
+; ATOMIC-NEXT:    move.b %d0, %d1
+; ATOMIC-NEXT:    bne .LBB12_1
+; ATOMIC-NEXT:  ; %bb.2: ; %atomicrmw.end
+; ATOMIC-NEXT:    movem.l (0,%sp), %d2 ; 8-byte Folded Reload
+; ATOMIC-NEXT:    adda.l #4, %sp
+; ATOMIC-NEXT:    rts
+start:
+  %self1 = load ptr, ptr %self, align 2
+  %_18.i.i = getelementptr inbounds i8, ptr %self1, i32 4
+  %6 = atomicrmw sub ptr %_18.i.i, i8 1 release, align 4
+  ret i8 %6
+}
+
+define i16 @atomicrmw_sub_i16_arid(ptr align 2 %self) {
+; NO-ATOMIC-LABEL: atomicrmw_sub_i16_arid:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    suba.l #12, %sp
+; NO-ATOMIC-NEXT:    .cfi_def_cfa_offset -16
+; NO-ATOMIC-NEXT:    move.l (16,%sp), %a0
+; NO-ATOMIC-NEXT:    move.l (%a0), %d0
+; NO-ATOMIC-NEXT:    add.l #4, %d0
+; NO-ATOMIC-NEXT:    move.l %d0, (%sp)
+; NO-ATOMIC-NEXT:    move.l #1, (4,%sp)
+; NO-ATOMIC-NEXT:    jsr __sync_fetch_and_sub_2
+; NO-ATOMIC-NEXT:    adda.l #12, %sp
+; NO-ATOMIC-NEXT:    rts
+;
+; ATOMIC-LABEL: atomicrmw_sub_i16_arid:
+; ATOMIC:         .cfi_startproc
+; ATOMIC-NEXT:  ; %bb.0: ; %start
+; ATOMIC-NEXT:    suba.l #4, %sp
+; ATOMIC-NEXT:    .cfi_def_cfa_offset -8
+; ATOMIC-NEXT:    movem.l %d2, (0,%sp) ; 8-byte Folded Spill
+; ATOMIC-NEXT:    move.l (8,%sp), %a0
+; ATOMIC-NEXT:    move.l (%a0), %a0
+; ATOMIC-NEXT:    move.w (4,%a0), %d1
+; ATOMIC-NEXT:    move.w %d1, %d0
+; ATOMIC-NEXT:  .LBB13_1: ; %atomicrmw.start
+; ATOMIC-NEXT:    ; =>This Inner Loop Header: Depth=1
+; ATOMIC-NEXT:    move.w %d1, %d2
+; ATOMIC-NEXT:    add.w #-1, %d2
+; ATOMIC-NEXT:    cas.w %d0, %d2, (4,%a0)
+; ATOMIC-NEXT:    move.w %d0, %d2
+; ATOMIC-NEXT:    sub.w %d1, %d2
+; ATOMIC-NEXT:    seq %d1
+; ATOMIC-NEXT:    sub.b #1, %d1
+; ATOMIC-NEXT:    move.w %d0, %d1
+; ATOMIC-NEXT:    bne .LBB13_1
+; ATOMIC-NEXT:  ; %bb.2: ; %atomicrmw.end
+; ATOMIC-NEXT:    movem.l (0,%sp), %d2 ; 8-byte Folded Reload
+; ATOMIC-NEXT:    adda.l #4, %sp
+; ATOMIC-NEXT:    rts
+start:
+  %self1 = load ptr, ptr %self, align 2
+  %_18.i.i = getelementptr inbounds i8, ptr %self1, i32 4
+  %6 = atomicrmw sub ptr %_18.i.i, i16 1 release, align 4
+  ret i16 %6
+}
+
+define i32 @atomicrmw_sub_i32_arid(ptr align 2 %self) {
+; NO-ATOMIC-LABEL: atomicrmw_sub_i32_arid:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    suba.l #12, %sp
+; NO-ATOMIC-NEXT:    .cfi_def_cfa_offset -16
+; NO-ATOMIC-NEXT:    move.l (16,%sp), %a0
+; NO-ATOMIC-NEXT:    move.l (%a0), %d0
+; NO-ATOMIC-NEXT:    add.l #4, %d0
+; NO-ATOMIC-NEXT:    move.l %d0, (%sp)
+; NO-ATOMIC-NEXT:    move.l #1, (4,%sp)
+; NO-ATOMIC-NEXT:    jsr __sync_fetch_and_sub_4
+; NO-ATOMIC-NEXT:    adda.l #12, %sp
+; NO-ATOMIC-NEXT:    rts
+;
+; ATOMIC-LABEL: atomicrmw_sub_i32_arid:
+; ATOMIC:         .cfi_startproc
+; ATOMIC-NEXT:  ; %bb.0: ; %start
+; ATOMIC-NEXT:    suba.l #4, %sp
+; ATOMIC-NEXT:    .cfi_def_cfa_offset -8
+; ATOMIC-NEXT:    movem.l %d2, (0,%sp) ; 8-byte Folded Spill
+; ATOMIC-NEXT:    move.l (8,%sp), %a0
+; ATOMIC-NEXT:    move.l (%a0), %a0
+; ATOMIC-NEXT:    move.l (4,%a0), %d1
+; ATOMIC-NEXT:    move.l %d1, %d0
+; ATOMIC-NEXT:  .LBB14_1: ; %atomicrmw.start
+; ATOMIC-NEXT:    ; =>This Inner Loop Header: Depth=1
+; ATOMIC-NEXT:    move.l %d1, %d2
+; ATOMIC-NEXT:    add.l #-1, %d2
+; ATOMIC-NEXT:    cas.l %d0, %d2, (4,%a0)
+; ATOMIC-NEXT:    move.l %d0, %d2
+; ATOMIC-NEXT:    sub.l %d1, %d2
+; ATOMIC-NEXT:    seq %d1
+; ATOMIC-NEXT:    sub.b #1, %d1
+; ATOMIC-NEXT:    move.l %d0, %d1
+; ATOMIC-NEXT:    bne .LBB14_1
+; ATOMIC-NEXT:  ; %bb.2: ; %atomicrmw.end
+; ATOMIC-NEXT:    movem.l (0,%sp), %d2 ; 8-byte Folded Reload
+; ATOMIC-NEXT:    adda.l #4, %sp
+; ATOMIC-NEXT:    rts
+start:
+  %self1 = load ptr, ptr %self, align 2
+  %_18.i.i = getelementptr inbounds i8, ptr %self1, i32 4
+  %6 = atomicrmw sub ptr %_18.i.i, i32 1 release, align 4
+  ret i32 %6
+}
diff --git a/llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/cmpxchg.ll b/llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/cmpxchg.ll
new file mode 100644
index 00000000000000..e21364a8d71186
--- /dev/null
+++ b/llvm/test/CodeGen/M68k/CodeModel/Large/Atomics/cmpxchg.ll
@@ -0,0 +1,317 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68000 --code-model=large | FileCheck %s --check-prefix=NO-ATOMIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68010 --code-model=large | FileCheck %s --check-prefix=NO-ATOMIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68000 --code-model=large --relocation-model=pic | FileCheck %s --check-prefix=NO-ATOMIC-PIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68010 --code-model=large --relocation-model=pic | FileCheck %s --check-prefix=NO-ATOMIC-PIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68020 --code-model=large | FileCheck %s --check-prefix=ATOMIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68030 --code-model=large | FileCheck %s --check-prefix=ATOMIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68040 --code-model=large | FileCheck %s --check-prefix=ATOMIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68020 --code-model=large --relocation-model=pic | FileCheck %s --check-prefix=ATOMIC-PIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68030 --code-model=large --relocation-model=pic | FileCheck %s --check-prefix=ATOMIC-PIC
+; RUN: llc %s -o - -mtriple=m68k -mcpu=M68040 --code-model=large --relocation-model=pic | FileCheck %s --check-prefix=ATOMIC-PIC
+
+@thread_id = internal global <{ [5 x i8] }> <{ [5 x i8] zeroinitializer}>, align 4
+
+define { i32, i1 } @std_thread_new() {
+; NO-ATOMIC-LABEL: std_thread_new:
+; NO-ATOMIC:         .cfi_startproc
+; NO-ATOMIC-NEXT:  ; %bb.0: ; %start
+; NO-ATOMIC-NEXT:    suba.l #12, %sp
+; NO-ATOMIC-NEXT:    .cfi_def_cfa_offset -16
+; NO-ATOMIC-NEXT:    move.l #1, (8,%sp)
+; NO-ATOMIC-NEXT:    move.l #0, (4,%sp)
+; NO-ATOMIC-NEXT:    move.l #thread_id, (%sp)
+; NO-ATOMIC-NEXT:    jsr __sync_val_compare_and_swap_4
+; NO-ATOMIC-NEXT:    cmpi.l #0, %d0
+; NO-ATOMIC-NEXT:    seq %d1
+; NO-ATOM...
[truncated]
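
The ATOMIC check lines in the `atomicrmw_sub_*_arid` tests above show `atomicrmw sub` being expanded into a retry loop around `cas`. As a rough source-level analogy only (the function name `fetchSubViaCas` is hypothetical and is not part of the patch), the same shape can be sketched with `std::atomic`:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, for illustration only: fetch-and-subtract built from a
// compare-and-swap retry loop, mirroring the shape of the .LBB*_1 loops above.
std::uint32_t fetchSubViaCas(std::atomic<std::uint32_t> &cell,
                             std::uint32_t amount) {
  // Initial plain load of the current value (the move before the loop).
  std::uint32_t expected = cell.load(std::memory_order_relaxed);
  // Try to install expected - amount. On failure, compare_exchange_weak
  // writes the value it observed back into `expected`, so the next
  // iteration recomputes the new value from fresh data.
  while (!cell.compare_exchange_weak(expected, expected - amount,
                                     std::memory_order_release,
                                     std::memory_order_relaxed)) {
  }
  // Like atomicrmw, return the value that was in memory before the update.
  return expected;
}
```

Calling `fetchSubViaCas(cell, 1)` on a cell holding 10 returns 10 and leaves 9 in the cell. Unsigned arithmetic is used here so that wraparound is well defined.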

@glaubitz
Contributor

glaubitz commented Nov 8, 2024

Thanks. I'm going to test this and report back. Sorry for not being able to test your previous patch; I was busy with work.

@knickish
Contributor Author

knickish commented Nov 8, 2024

Thanks, I'm curious to hear whether it works as expected.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

The backend still seems to have problems selecting certain atomic instructions:

rustc-LLVM ERROR: Cannot select: t14: ch = AtomicStore<(store unordered (s8) into %ir.25)> t13:1, t13, t9
  t13: i8,ch = AtomicLoad<(load unordered (s8) from %ir.26)> t0, t12
    t12: i32 = add t11, t2
      t11: i32,ch = CopyFromReg t0, Register:i32 %0
        t10: i32 = Register %0
      t2: i32,ch = CopyFromReg t0, Register:i32 %8
        t1: i32 = Register %8
  t9: i32 = add t8, t2
    t8: i32,ch = CopyFromReg t0, Register:i32 %1
      t7: i32 = Register %1
    t2: i32,ch = CopyFromReg t0, Register:i32 %8
      t1: i32 = Register %8
In function: __llvm_memmove_element_unordered_atomic_1
error: rustc interrupted by SIGSEGV, printing backtrace

FWIW, I tested this with LLVM 19 as LLVM 20 is not compatible with the Rust compiler at the moment since LLVM upstream changed their interfaces again. So, I might be missing some important fixes for M68k which are not part of LLVM 19.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

I tried building rustc from git master, but that fails because now rust will always try to build the embedded copy of LLVM.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

OK, I managed to get further and noticed that I also needed #114714. After that, the build already got to the linker stage, so that's huge progress.

At that point, I needed to add M68k support to the object crate only to find out you had just done that. ;-)

I'll keep on testing.

Let's hope that @mshockwave and @0x59616e can quickly review both this PR and #114714 because it would be the first time we'd be able to build libstd.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

@knickish Can you tell me what changes are necessary for Rust to generate EM_68K binaries?

I tried your patch on top of the object crate plus the following change:

diff --git a/compiler/rustc_codegen_ssa/src/back/metadata.rs b/compiler/rustc_codegen_ssa/src/back/metadata.rs
index 3f3cb8b4073..c50a5eb5d64 100644
--- a/compiler/rustc_codegen_ssa/src/back/metadata.rs
+++ b/compiler/rustc_codegen_ssa/src/back/metadata.rs
@@ -197,6 +197,7 @@ pub(crate) fn create_object_file(sess: &Session) -> Option<write::Object<'static
         ),
         "x86" => (Architecture::I386, None),
         "s390x" => (Architecture::S390x, None),
+        "m68k" => (Architecture::M68k, None),
         "mips" | "mips32r6" => (Architecture::Mips, None),
         "mips64" | "mips64r6" => (Architecture::Mips64, None),
         "x86_64" => (

but Rust still uses the generic ELF target for m68k.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

Ah, I think I need to build a bootstrap compiler first which includes the change to rustc_codegen_ssa as otherwise it won't be possible to build any artifacts for m68k.

@glaubitz
Contributor

glaubitz commented Nov 9, 2024

Hmm, building a bootstrap compiler with the patch didn't help.

I had this problem in the past when adding support for 32-bit SPARC to the object crate, but I absolutely cannot remember how I fixed it.

@knickish
Contributor Author

knickish commented Nov 9, 2024

@knickish Can you tell me what changes are necessary for Rust to generate EM_68K binaries?

@glaubitz I cannot; I am stuck on this too, at least for large enough binaries :( I have been experimenting with the relocation model linker flags and such, but can't seem to get it working. I will ping you on the Rust Zulip to continue discussing, if that works for you.

@mshockwave mshockwave self-requested a review November 10, 2024 23:26
Member

@mshockwave mshockwave left a comment

Thanks for the patch! I only have some comments regarding MC tests

@@ -708,6 +708,20 @@ bool M68kDAGToDAGISel::SelectARIPD(SDNode *Parent, SDValue N, SDValue &Base) {
return false;
}

static bool allowARIDWithDisp(SDNode *Parent) {
Member

Will the compiler emit an unused-function warning when building LLVM without assertions?

Contributor Author

The style guide was unclear about what to do in this case, so I added [[maybe_unused]] here.
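
A minimal, self-contained sketch of the situation discussed here: a file-local helper that is referenced only from inside `assert`, which compiles away when `NDEBUG` is defined. The `Opcode` enum and both function names are hypothetical stand-ins rather than the real SelectionDAG types; only the use of `[[maybe_unused]]` reflects what the comment above describes.

```cpp
#include <cassert>

// Hypothetical stand-in for the ISD opcodes named in the patch.
enum class Opcode { Load, Store, AtomicLoad, AtomicStore, AtomicCmpSwap };

// Referenced only from the assert below. In a build with NDEBUG defined the
// assert expands to nothing, so without [[maybe_unused]] this static function
// would have no users and could trigger -Wunused-function.
[[maybe_unused]] static bool allowsDisplacement(Opcode Op) {
  switch (Op) {
  case Opcode::Load:
  case Opcode::Store:
  case Opcode::AtomicLoad:
  case Opcode::AtomicStore:
    return true;
  default:
    return false;
  }
}

// Hypothetical caller: a non-zero displacement is accepted only for the
// opcodes listed above; the check exists only in assertion-enabled builds.
int checkedDisplacement(Opcode Op, int Disp) {
  assert((Disp == 0 || allowsDisplacement(Op)) &&
         "Should not be any displacement");
  return Disp;
}
```

The attribute is standard C++17, so it needs no project-specific macro; whether a project prefers it over other ways of silencing the warning is a style decision.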

def CASARID16 : MxCASARIDOp<0x2, MxType16d>;
def CASARID32 : MxCASARIDOp<0x3, MxType32d>;

class MxCASARIIOp<bits<2> size_encoding, MxType type>
Member

Do we have MC tests for these new instructions (I assume you already have their codegen tests)? If not, could you add them?

Ditto for other CAS variants below.

Contributor Author

Added a new test at llvm/test/MC/M68k/Atomics/cas.s which shows all of them except ARD, which I couldn't get to parse for some reason. I was getting an illegal operand error when I tried to add cas.w %d4, %d5, %a3 as a test case. Am I incorrectly expecting this to work, or is it likely an issue with the AsmParser?

Contributor Author

On further investigation, it looks like the PC-relative and ARD addressing modes are not legal for CAS, so these will have to be lowered to different instruction sequences.
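
For context, the M68000 family manual restricts the CAS operand to memory-alterable effective addresses, which excludes register direct, PC-relative, and immediate operands. A minimal sketch of that check over the 3-bit mode and register fields of an effective address follows; the struct and function names are hypothetical, and the mapping of the backend's mode names (ARI, ARID, ARII, AL, PCD, PCI, ARD) in the comments is my reading of them, not something taken from the patch.

```cpp
#include <cstdint>

// Hypothetical representation of an M68k effective address:
// `mode` is EA bits 5-3 and `reg` is EA bits 2-0.
struct EffectiveAddress {
  std::uint8_t mode;
  std::uint8_t reg;
};

// Returns true for memory-alterable modes, the class of operands CAS accepts.
bool isMemoryAlterable(EffectiveAddress ea) {
  switch (ea.mode) {
  case 0b010: // (An)          ARI
  case 0b011: // (An)+
  case 0b100: // -(An)
  case 0b101: // (d16,An)      ARID
  case 0b110: // (d8,An,Xn)    ARII
    return true;
  case 0b111:
    // reg 000 = (xxx).W and 001 = (xxx).L (AL) are alterable;
    // 010 = (d16,PC) PCD, 011 = (d8,PC,Xn) PCI, 100 = #imm are not.
    return ea.reg == 0b000 || ea.reg == 0b001;
  default:
    // 000 = Dn and 001 = An (ARD) are register direct, not memory.
    return false;
  }
}
```

Under this sketch, `{0b001, 3}` (the `%a3` operand from the failing test case) and both PC-relative forms are rejected, while `{0b101, 0}` for `(d16,%a0)` is accepted.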

Contributor Author

Removed the offending modes for now, as it looks like the code changes will be in different files. All of the addressing modes added by the patch are now tested in the new MC test
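
As a sketch of what the `Inst` layout in the `MxCAS*Op` classes describes, the two base instruction words can be computed by hand. The function and enum names below are hypothetical, and any extension words that a particular addressing mode appends (for example a 16-bit displacement) are not modeled; this is not taken from the MC test itself.

```cpp
#include <cstdint>

// Size field values as passed to the MxCAS*Op classes (0x1, 0x2, 0x3).
enum class CasSize : unsigned { Byte = 0b01, Word = 0b10, Long = 0b11 };

// First word: 00001 | size (2 bits) | 011 | EA mode (3 bits) | EA reg (3 bits).
std::uint16_t casOpWord(CasSize size, unsigned eaMode, unsigned eaReg) {
  return static_cast<std::uint16_t>((0b00001u << 11) |
                                    (static_cast<unsigned>(size) << 9) |
                                    (0b011u << 6) |
                                    ((eaMode & 7u) << 3) |
                                    (eaReg & 7u));
}

// Second word: 0000000 | Du (3 bits) | 000 | Dc (3 bits).
std::uint16_t casExtWord(unsigned du, unsigned dc) {
  return static_cast<std::uint16_t>(((du & 7u) << 6) | (dc & 7u));
}
```

For `cas.w %d4, %d5, (%a3)`, with Dc = 4, Du = 5, and EA mode 010 with register 3, this gives `casOpWord(CasSize::Word, 0b010, 3) == 0x0CD3` and `casExtWord(5, 4) == 0x0144`.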

@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 853f5fe to 9f89ee4 Compare November 11, 2024 05:43
@glaubitz
Contributor

@knickish Could you begin the commit summary with a capital letter and maybe replace "remaining" with "missing"? And I think "pic" should be all-caps, i.e. "PIC", as it's an acronym.

As far as testing goes, I can confirm that this change fixes the remaining LLVM backend errors we have seen so far when building Rust code on M68k, which is amazing ;-).

@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 9f89ee4 to 19325b8 Compare November 11, 2024 13:27
@knickish
Contributor Author

@glaubitz Glad you're seeing the same thing. I've updated the commit descriptions as you asked; I just still need to get the MC tests added.

@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 19325b8 to 8b57f50 Compare November 11, 2024 15:23
@llvmbot llvmbot added the mc Machine (object) code label Nov 11, 2024
@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 8b57f50 to 2a6ff8d Compare November 11, 2024 15:55
@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 2a6ff8d to 0cc7089 Compare November 20, 2024 16:34
@knickish knickish force-pushed the m68k_atomic_addr_modes_for_merge branch from 0cc7089 to 2827514 Compare November 20, 2024 17:42
@knickish
Contributor Author

Ping @mshockwave, I think everything you mentioned before has been addressed.

@knickish knickish requested a review from mshockwave November 30, 2024 21:08
Member

@mshockwave mshockwave left a comment

LGTM thanks!

@glaubitz
Contributor

@mshockwave Could you land this change as well?

@mshockwave mshockwave merged commit 4cce107 into llvm:main Dec 12, 2024
6 of 8 checks passed
Labels
backend:m68k mc Machine (object) code
4 participants