Commit 641d2e5

[X86] Clamp large constant shift amounts for MMX shift intrinsics to 8-bits.
The MMX intrinsics for shift by immediate take a 32-bit shift amount, but the hardware shift-by-immediate instructions only encode 8 bits. The frontend does not require the shift amount to fit in 8 bits because it does not check that the amount is an immediate. If it is not an immediate, we move it to an MMX register and use the shift by register. If it is an immediate, we use the shift-by-immediate instruction, but we need to reduce the shift amount to 8 bits. We were previously doing this accidentally by masking it in the encoder, but that can turn a large shift amount into a small in-bounds shift amount. Instead we should clamp larger shift amounts to 255 so that they don't become in bounds. Fixes PR43922.
1 parent 35cf9a1 commit 641d2e5
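
The difference between masking and clamping described in the commit message can be sketched in standalone C++. This is only an illustration under stated assumptions: `maskToImm8`, `clampToImm8`, and `checkShiftAmountExamples` are hypothetical names, not LLVM APIs, and the sketch models just the arithmetic on the immediate, not the encoder or the lowering code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): the bits an 8-bit immediate field keeps
// when a wider value is simply truncated.
constexpr uint32_t maskToImm8(uint32_t amt) { return amt & 0xFFu; }

// Hypothetical helper (not LLVM API): saturate at 255 instead of wrapping.
constexpr uint32_t clampToImm8(uint32_t amt) { return amt > 255u ? 255u : amt; }

// Walks through the shift amount used in the pr43922 test.
inline void checkShiftAmountExamples() {
  constexpr uint32_t big = 268435456u; // 1 << 28, so the low 8 bits are zero
  assert(maskToImm8(big) == 0u);       // masking: becomes an in-bounds shift by 0
  assert(clampToImm8(big) == 255u);    // clamping: stays out of bounds
  // Amounts that already fit in 8 bits are unchanged by either approach.
  assert(maskToImm8(7u) == 7u && clampToImm8(7u) == 7u);
}
```

With masking, the out-of-range amount 268435456 would be encoded as a shift by 0, which leaves the operand unchanged; with clamping it is encoded as 255, which the hardware still treats as out of range.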

2 files changed: +39 -2 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 5 additions & 2 deletions
@@ -23636,9 +23636,12 @@ SDValue X86TargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
     SDValue ShAmt = Op.getOperand(2);
     // If the argument is a constant, convert it to a target constant.
     if (auto *C = dyn_cast<ConstantSDNode>(ShAmt)) {
-      ShAmt = DAG.getTargetConstant(C->getZExtValue(), DL, MVT::i32);
+      // Clamp out of bounds shift amounts since they will otherwise be masked
+      // to 8-bits which may make it no longer out of bounds.
+      unsigned ShiftAmount = C->getAPIntValue().getLimitedValue(255);
       return DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DL, Op.getValueType(),
-                         Op.getOperand(0), Op.getOperand(1), ShAmt);
+                         Op.getOperand(0), Op.getOperand(1),
+                         DAG.getTargetConstant(ShiftAmount, DL, MVT::i32));
     }
 
     unsigned NewIntrinsic;

llvm/test/CodeGen/X86/mmx-arith.ll

Lines changed: 34 additions & 0 deletions
@@ -647,6 +647,40 @@ entry:
   ret void
 }
 
+; Make sure we clamp large shift amounts to 255
+define i64 @pr43922() {
+; X32-LABEL: pr43922:
+; X32:       # %bb.0: # %entry
+; X32-NEXT:    pushl %ebp
+; X32-NEXT:    .cfi_def_cfa_offset 8
+; X32-NEXT:    .cfi_offset %ebp, -8
+; X32-NEXT:    movl %esp, %ebp
+; X32-NEXT:    .cfi_def_cfa_register %ebp
+; X32-NEXT:    andl $-8, %esp
+; X32-NEXT:    subl $8, %esp
+; X32-NEXT:    movq {{\.LCPI.*}}, %mm0 # mm0 = 0x7AAAAAAA7AAAAAAA
+; X32-NEXT:    psrad $255, %mm0
+; X32-NEXT:    movq %mm0, (%esp)
+; X32-NEXT:    movl (%esp), %eax
+; X32-NEXT:    movl {{[0-9]+}}(%esp), %edx
+; X32-NEXT:    movl %ebp, %esp
+; X32-NEXT:    popl %ebp
+; X32-NEXT:    .cfi_def_cfa %esp, 4
+; X32-NEXT:    retl
+;
+; X64-LABEL: pr43922:
+; X64:       # %bb.0: # %entry
+; X64-NEXT:    movq {{.*}}(%rip), %mm0 # mm0 = 0x7AAAAAAA7AAAAAAA
+; X64-NEXT:    psrad $255, %mm0
+; X64-NEXT:    movq %mm0, %rax
+; X64-NEXT:    retq
+entry:
+  %0 = tail call x86_mmx @llvm.x86.mmx.psrai.d(x86_mmx bitcast (<2 x i32> <i32 2058005162, i32 2058005162> to x86_mmx), i32 268435456)
+  %1 = bitcast x86_mmx %0 to i64
+  ret i64 %1
+}
+declare x86_mmx @llvm.x86.mmx.psrai.d(x86_mmx, i32)
+
 declare x86_mmx @llvm.x86.mmx.padd.b(x86_mmx, x86_mmx)
 declare x86_mmx @llvm.x86.mmx.padd.w(x86_mmx, x86_mmx)
 declare x86_mmx @llvm.x86.mmx.padd.d(x86_mmx, x86_mmx)
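
To see why the test expects `psrad $255` rather than a shift by 0, it may help to model the instruction on scalar values. The following is a rough sketch, assuming the documented PSRAD behavior that a count above 31 fills each 32-bit lane with its sign bit; `psradModel` is a hypothetical name for illustration and is not part of LLVM or any intrinsic header.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of PSRAD on one MMX value (two 32-bit lanes).
// Assumption: counts above 31 fill each lane with copies of its sign bit.
inline std::array<int32_t, 2> psradModel(std::array<int32_t, 2> lanes,
                                         uint64_t count) {
  for (int32_t &lane : lanes) {
    if (count > 31) {
      lane = lane < 0 ? -1 : 0; // out-of-range count: sign fill
    } else {
      const int sh = static_cast<int>(count);
      // Arithmetic shift written without right-shifting a negative value,
      // so the result does not depend on implementation-defined behavior.
      lane = lane < 0 ? ~(~lane >> sh) : (lane >> sh);
    }
  }
  return lanes;
}
```

Under this model, the test input 0x7AAAAAAA (2058005162) shifted by the clamped amount 255 yields 0 in each lane, whereas the masked amount 0 would have returned the input unchanged.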
