[LV] Decompose WidenIntOrFPInduction into phi and update recipes #82021

nikolaypanchenko · 2024-02-16T18:33:20Z

The Loop Vectorizer still has two recipes, VPWidenIntOrFpInductionRecipe and VPWidenPointerInductionRecipe, that appear phi-like in a VPlan because they derive from VPHeaderPHIRecipe, yet their generate functions construct both the vector phi and the vector self-update in the vectorized loop.

This not only hurts the readability of a VPlan, but also requires more code to maintain this behavior. For instance, there is already ad-hoc code motion that moves the generated updates of these recipes closer to the loop latch.

The changeset:

Adds WidenVFxUF to represent the broadcast({1...UF} x VFxUF) value
Decomposes the existing VPWidenIntOrFpInductionRecipe into:

  WIDEN-INDUCTION vp<%iv> = phi ir<0>, vp<%be-value>
  ...
  EMIT vp<%widen-step> = mul ir<%step>, vp<WidenVFxUF>
  EMIT vp<%be-value> = add vp<%iv>,vp<%widen-step>

Moves the trunc optimization of the widened IV into a VPlan transform
Adds trivial cyclic dependency removal and marks some binops as non side-effecting
Adds an element type to VPValue so that it can be queried for artificially added VPValues that have no underlying instruction
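
To make the decomposition concrete, here is a minimal standalone sketch (plain C++, no LLVM dependencies) of what the phi/update pair computes for an integer induction. It assumes a fixed vector width and UF = 1, so the widened step is simply `step * VF` broadcast to every lane. `simulateWidenIV` and its parameters are hypothetical names used for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulates `Iters` vector iterations of
//   %iv         = phi [Start, Start+Step, ...], %be-value
//   %widen-step = mul Step, splat(VF)        // UF == 1 is an assumption here
//   %be-value   = add %iv, %widen-step
// and returns every lane value in the order it was produced.
std::vector<int64_t> simulateWidenIV(int64_t Start, int64_t Step, unsigned VF,
                                     unsigned Iters) {
  assert(VF > 0 && "vector width must be positive");
  // Initial phi value: lane L holds Start + L * Step.
  std::vector<int64_t> IV(VF);
  for (unsigned L = 0; L < VF; ++L)
    IV[L] = Start + static_cast<int64_t>(L) * Step;

  // Loop-invariant widened step: Step * VF in every lane.
  const int64_t WidenStep = Step * static_cast<int64_t>(VF);

  std::vector<int64_t> Seen;
  for (unsigned I = 0; I < Iters; ++I) {
    Seen.insert(Seen.end(), IV.begin(), IV.end());
    // Backedge update: a lane-wise add of the widened step.
    for (int64_t &Lane : IV)
      Lane += WidenStep;
  }
  return Seen;
}
```

With Start = 0, Step = 3, VF = 4 and two iterations, the lanes are 0, 3, 6, 9 followed by 12, 15, 18, 21, which is the sequence the scalar induction variable would take.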

llvmbot · 2024-02-16T18:33:56Z

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-risc-v

Author: Kolya Panchenko (nikolaypanchenko)

Changes

Patch is 3.06 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82021.diff

171 Files Affected:

(modified) llvm/include/llvm/Analysis/IVDescriptors.h (+5)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+88-36)
(modified) llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h (+13-1)
(modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+54-16)
(modified) llvm/lib/Transforms/Vectorize/VPlan.h (+57-32)
(modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (+13-1)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+26-60)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+81-2)
(modified) llvm/lib/Transforms/Vectorize/VPlanValue.h (+19-1)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/clamped-trip-count.ll (+42-44)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/epilog-vectorization-widen-inductions.ll (+120-120)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence-fold-tail.ll (+5-5)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/gather-do-not-vectorize-addressing.ll (+64-12)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll (+62-12)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/interleave-allocsize-not-equal-typesize.ll (+9-9)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/interleaved-store-of-first-order-recurrence.ll (+49-14)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll (+31-31)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_test1_no_explicit_vect_width.ll (+123-57)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/pr60831-sve-inv-store-crash.ll (+11-11)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-avoid-scalarization.ll (+19-20)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-reductions-tf.ll (+78-16)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll (+844-844)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/streaming-compatible-sve-no-maximize-bandwidth.ll (+36-36)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll (+2903-778)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-inloop-reductions.ll (+24-24)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-reductions.ll (+22-22)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect.ll (+18-18)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+14-15)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll (+149-54)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions.ll (+11-11)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+173-168)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+170-170)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-low-trip-count.ll (+63-19)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-runtime-check-size-based-threshold.ll (+43-43)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-forced.ll (+11-11)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-reductions.ll (+109-109)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-unroll.ll (+158-158)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding.ll (+149-149)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/tail-fold-uniform-memops.ll (+131-32)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/tail-folding-styles.ll (+56-51)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/vector-call-linear-args.ll (+56-69)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/wider-VF-for-callinst.ll (+9-9)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll (+81-83)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/lmul.ll (+35-35)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/mask-index-type.ll (+21-22)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/masked_gather_scatter.ll (+66-66)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/ordered-reduction.ll (+39-39)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/scalable-basics.ll (+106-106)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/select-cmp-reduction.ll (+580-214)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll (+123-125)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll (+238-243)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/zvl32b.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll (+202-41)
(modified) llvm/test/Transforms/LoopVectorize/X86/constant-fold.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/X86/conversion-cost.ll (+30-30)
(modified) llvm/test/Transforms/LoopVectorize/X86/cost-model.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll (+496-119)
(modified) llvm/test/Transforms/LoopVectorize/X86/epilog-vectorization-inductions.ll (+167-104)
(modified) llvm/test/Transforms/LoopVectorize/X86/fixed-order-recurrence.ll (+6-6)
(modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+31-40)
(modified) llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll (+54-54)
(modified) llvm/test/Transforms/LoopVectorize/X86/illegal-parallel-loop-uniform-write.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/X86/interleaved-accesses-sink-store-across-load.ll (+12-12)
(modified) llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll (+27-27)
(modified) llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll (+364-364)
(modified) llvm/test/Transforms/LoopVectorize/X86/optsize.ll (+42-46)
(modified) llvm/test/Transforms/LoopVectorize/X86/outer_loop_test1_no_explicit_vect_width.ll (+118-57)
(modified) llvm/test/Transforms/LoopVectorize/X86/pr34438.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/X86/pr36524.ll (+24-27)
(modified) llvm/test/Transforms/LoopVectorize/X86/pr51366-sunk-instruction-used-outside-of-loop.ll (+39-10)
(modified) llvm/test/Transforms/LoopVectorize/X86/pr54634.ll (+19-25)
(modified) llvm/test/Transforms/LoopVectorize/X86/scatter_crash.ll (+245-15)
(modified) llvm/test/Transforms/LoopVectorize/X86/small-size.ll (+60-61)
(modified) llvm/test/Transforms/LoopVectorize/X86/tail_loop_folding.ll (+29-32)
(modified) llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll (+47-58)
(modified) llvm/test/Transforms/LoopVectorize/X86/vect.omp.force.small-tc.ll (+8-9)
(modified) llvm/test/Transforms/LoopVectorize/X86/vectorize-interleaved-accesses-gap.ll (+6-7)
(modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll (+189-191)
(modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll (+23-24)
(modified) llvm/test/Transforms/LoopVectorize/X86/x86-predication.ll (+88-98)
(modified) llvm/test/Transforms/LoopVectorize/branch-weights.ll (+101-52)
(modified) llvm/test/Transforms/LoopVectorize/bsd_regex.ll (+6-7)
(modified) llvm/test/Transforms/LoopVectorize/cast-induction.ll (+363-58)
(modified) llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/create-induction-resume.ll (+2-2)
(modified) llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll (+12-12)
(modified) llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll (+5-5)
(modified) llvm/test/Transforms/LoopVectorize/epilog-vectorization-reductions.ll (+31-31)
(modified) llvm/test/Transforms/LoopVectorize/epilog-vectorization-trunc-induction-steps.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence-chains-vplan.ll (+3-72)
(modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence-chains.ll (+648-198)
(modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll (+3-352)
(modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll (+174-176)
(modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+138-149)
(modified) llvm/test/Transforms/LoopVectorize/float-minmax-instruction-flag.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/fpsat.ll (+3-3)
(modified) llvm/test/Transforms/LoopVectorize/i8-induction.ll (+98-4)
(modified) llvm/test/Transforms/LoopVectorize/icmp-uniforms.ll (+3-1)
(modified) llvm/test/Transforms/LoopVectorize/if-pred-non-void.ll (+104-112)
(modified) llvm/test/Transforms/LoopVectorize/induction-multiple-uses-in-same-instruction.ll (+8-7)
(modified) llvm/test/Transforms/LoopVectorize/induction-ptrcasts.ll (+83-17)
(modified) llvm/test/Transforms/LoopVectorize/induction-step.ll (+226-75)
(modified) llvm/test/Transforms/LoopVectorize/induction-unroll-novec.ll (+59-20)
(modified) llvm/test/Transforms/LoopVectorize/induction.ll (+839-880)
(modified) llvm/test/Transforms/LoopVectorize/instruction-only-used-outside-of-loop.ll (+15-17)
(modified) llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll (+10-13)
(modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+35-35)
(modified) llvm/test/Transforms/LoopVectorize/load-of-struct-deref-pred.ll (+8-8)
(modified) llvm/test/Transforms/LoopVectorize/loop-form.ll (+12-12)
(modified) llvm/test/Transforms/LoopVectorize/loop-scalars.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/memdep-fold-tail.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/multiple-strides-vectorization.ll (+8-8)
(modified) llvm/test/Transforms/LoopVectorize/no_outside_user.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization-liveout.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization.ll (+114-114)
(modified) llvm/test/Transforms/LoopVectorize/outer-loop-vec-phi-predecessor-order.ll (+5-5)
(modified) llvm/test/Transforms/LoopVectorize/outer_loop_hcfg_construction.ll (+27-18)
(modified) llvm/test/Transforms/LoopVectorize/outer_loop_scalable.ll (+32-31)
(modified) llvm/test/Transforms/LoopVectorize/outer_loop_test1.ll (+62-29)
(modified) llvm/test/Transforms/LoopVectorize/outer_loop_test2.ll (+94-40)
(modified) llvm/test/Transforms/LoopVectorize/pointer-induction-unroll.ll (+28-28)
(modified) llvm/test/Transforms/LoopVectorize/pointer-select-runtime-checks.ll (+99-99)
(modified) llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll (+33-33)
(modified) llvm/test/Transforms/LoopVectorize/pr35773.ll (+57-16)
(modified) llvm/test/Transforms/LoopVectorize/pr37248.ll (+10-10)
(modified) llvm/test/Transforms/LoopVectorize/pr44488-predication.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/pr45259.ll (+5-5)
(modified) llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll (+90-102)
(modified) llvm/test/Transforms/LoopVectorize/pr47343-expander-lcssa-after-cfg-update.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/pr50686.ll (+9-9)
(modified) llvm/test/Transforms/LoopVectorize/pr51614-fold-tail-by-masking.ll (+45-45)
(modified) llvm/test/Transforms/LoopVectorize/pr55100-expand-scev-predicate-used.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/pr55167-fold-tail-live-out.ll (+27-27)
(modified) llvm/test/Transforms/LoopVectorize/pr58811-scev-expansion.ll (+8-8)
(modified) llvm/test/Transforms/LoopVectorize/pr59319-loop-access-info-invalidation.ll (+2-2)
(modified) llvm/test/Transforms/LoopVectorize/reduction-align.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-pred.ll (+154-154)
(modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll (+192-198)
(modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+21-21)
(modified) llvm/test/Transforms/LoopVectorize/reduction-odd-interleave-counts.ll (+136-70)
(modified) llvm/test/Transforms/LoopVectorize/reduction-predselect.ll (+61-61)
(modified) llvm/test/Transforms/LoopVectorize/reduction-small-size.ll (+14-14)
(modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+87-87)
(modified) llvm/test/Transforms/LoopVectorize/runtime-check-needed-but-empty.ll (+16-17)
(modified) llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll (+11-11)
(modified) llvm/test/Transforms/LoopVectorize/scalable-first-order-recurrence.ll (+1026-88)
(modified) llvm/test/Transforms/LoopVectorize/scalable-inductions.ll (+66-65)
(modified) llvm/test/Transforms/LoopVectorize/scalable-reduction-inloop.ll (+69-26)
(modified) llvm/test/Transforms/LoopVectorize/scalable-trunc-min-bitwidth.ll (+15-15)
(modified) llvm/test/Transforms/LoopVectorize/scalarize-masked-call.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/scev-exit-phi-invalidation.ll (+8-8)
(modified) llvm/test/Transforms/LoopVectorize/scev-predicate-reasoning.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll (+86-87)
(modified) llvm/test/Transforms/LoopVectorize/skeleton-lcssa-crash.ll (+10-10)
(modified) llvm/test/Transforms/LoopVectorize/strict-fadd-interleave-only.ll (+33-35)
(modified) llvm/test/Transforms/LoopVectorize/trunc-shifts.ll (+14-30)
(modified) llvm/test/Transforms/LoopVectorize/uniform-blend.ll (+135-49)
(modified) llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1.ll (+64-63)
(modified) llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_and.ll (+32-32)
(modified) llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_div_urem.ll (+12-12)
(modified) llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_lshr.ll (+52-52)
(modified) llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll (+210-210)
(modified) llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/vector-geps.ll (+4-4)
(modified) llvm/test/Transforms/LoopVectorize/vplan-iv-transforms.ll (+3-1)
(modified) llvm/test/Transforms/LoopVectorize/vplan-printing.ll (+14-4)
(modified) llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll (+30-16)
(modified) llvm/test/Transforms/LoopVectorize/vplan-vectorize-inner-loop-reduction.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/vplan-widen-call-instruction.ll (+1-1)
(modified) llvm/unittests/Transforms/Vectorize/VPlanTest.cpp (-10)

diff --git a/llvm/include/llvm/Analysis/IVDescriptors.h b/llvm/include/llvm/Analysis/IVDescriptors.h
index 5c7b613ac48c40..7ca13adae87f6a 100644
--- a/llvm/include/llvm/Analysis/IVDescriptors.h
+++ b/llvm/include/llvm/Analysis/IVDescriptors.h
@@ -363,6 +363,11 @@ class InductionDescriptor {
     return nullptr;
   }
 
+  const Instruction *getExactFPMathInst() const {
+    return const_cast<const Instruction *>(
+        const_cast<InductionDescriptor *>(this)->getExactFPMathInst());
+  }
+
   /// Returns binary opcode of the induction operator.
   Instruction::BinaryOps getInductionOpcode() const {
     return InductionBinOp ? InductionBinOp->getOpcode()
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 98b177cf5d2d0e..92b783d3badeae 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -8114,34 +8114,6 @@ VPHeaderPHIRecipe *VPRecipeBuilder::tryToOptimizeInductionPHI(
   return nullptr;
 }
 
-VPWidenIntOrFpInductionRecipe *VPRecipeBuilder::tryToOptimizeInductionTruncate(
-    TruncInst *I, ArrayRef<VPValue *> Operands, VFRange &Range, VPlan &Plan) {
-  // Optimize the special case where the source is a constant integer
-  // induction variable. Notice that we can only optimize the 'trunc' case
-  // because (a) FP conversions lose precision, (b) sext/zext may wrap, and
-  // (c) other casts depend on pointer size.
-
-  // Determine whether \p K is a truncation based on an induction variable that
-  // can be optimized.
-  auto isOptimizableIVTruncate =
-      [&](Instruction *K) -> std::function<bool(ElementCount)> {
-    return [=](ElementCount VF) -> bool {
-      return CM.isOptimizableIVTruncate(K, VF);
-    };
-  };
-
-  if (LoopVectorizationPlanner::getDecisionAndClampRange(
-          isOptimizableIVTruncate(I), Range)) {
-
-    auto *Phi = cast<PHINode>(I->getOperand(0));
-    const InductionDescriptor &II = *Legal->getIntOrFpInductionDescriptor(Phi);
-    VPValue *Start = Plan.getVPValueOrAddLiveIn(II.getStartValue());
-    return createWidenInductionRecipes(Phi, I, Start, II, Plan, *PSE.getSE(),
-                                       *OrigLoop, Range);
-  }
-  return nullptr;
-}
-
 VPBlendRecipe *VPRecipeBuilder::tryToBlend(PHINode *Phi,
                                            ArrayRef<VPValue *> Operands,
                                            VPlanPtr &Plan) {
@@ -8275,6 +8247,70 @@ bool VPRecipeBuilder::shouldWiden(Instruction *I, VFRange &Range) const {
                                                              Range);
 }
 
+VPWidenCastRecipe *VPRecipeBuilder::createCast(VPValue *V, Type *From,
+                                               Type *To) {
+  if (From == To)
+    return nullptr;
+  Instruction::CastOps CastOpcode;
+  if (To->isIntegerTy() && From->isIntegerTy())
+    CastOpcode = To->getPrimitiveSizeInBits() < From->getPrimitiveSizeInBits()
+                     ? Instruction::Trunc
+                     : Instruction::ZExt;
+  else if (To->isIntegerTy())
+    CastOpcode = Instruction::FPToUI;
+  else
+    CastOpcode = Instruction::UIToFP;
+
+  return new VPWidenCastRecipe(CastOpcode, V, To);
+}
+
+VPRecipeBase *
+VPRecipeBuilder::createWidenStep(VPWidenIntOrFpInductionRecipe &WIV,
+                                 ScalarEvolution &SE, VPlan &Plan,
+                                 DenseSet<VPRecipeBase *> *CreatedRecipes) {
+  PHINode *PN = WIV.getPHINode();
+  const InductionDescriptor &IndDesc = WIV.getInductionDescriptor();
+  VPValue *ScalarStep =
+      vputils::getOrCreateVPValueForSCEVExpr(Plan, IndDesc.getStep(), SE);
+  Type *VFxUFTy = Plan.getVFxUF().getElementType();
+  Type *StepTy = IndDesc.getStep()->getType();
+  VPValue *WidenVFxUF = &Plan.getWidenVFxUF();
+  VPBasicBlock *LatchVPBB = Plan.getVectorLoopRegion()->getExitingBasicBlock();
+  if (VPWidenCastRecipe *WidenVFxUFCast =
+          createCast(&Plan.getWidenVFxUF(), VFxUFTy, StepTy)) {
+    WidenVFxUFCast->insertBefore(LatchVPBB->getTerminator());
+    if (CreatedRecipes)
+      CreatedRecipes->insert(WidenVFxUFCast);
+    WidenVFxUF = WidenVFxUFCast->getVPSingleValue();
+  }
+  const Instruction::BinaryOps UpdateOp =
+      IndDesc.getInductionOpcode() != Instruction::BinaryOpsEnd
+          ? IndDesc.getInductionOpcode()
+          : Instruction::Add;
+  VPInstruction *Update;
+  if (StepTy->isIntegerTy()) {
+    VPInstruction *Mul = new VPInstruction(
+        Instruction::Mul, {WidenVFxUF, ScalarStep}, PN->getDebugLoc());
+    Mul->insertBefore(LatchVPBB->getTerminator());
+    if (CreatedRecipes)
+      CreatedRecipes->insert(Mul);
+    Update = new VPInstruction(UpdateOp, {&WIV, Mul}, PN->getDebugLoc());
+    Update->insertBefore(LatchVPBB->getTerminator());
+  } else {
+    FastMathFlags FMF = IndDesc.getExactFPMathInst()
+                            ? IndDesc.getExactFPMathInst()->getFastMathFlags()
+                            : FastMathFlags();
+    VPInstruction *Mul = new VPInstruction(
+        Instruction::FMul, {WidenVFxUF, ScalarStep}, FMF, PN->getDebugLoc());
+    Mul->insertBefore(LatchVPBB->getTerminator());
+    Update = new VPInstruction(UpdateOp, {&WIV, Mul}, FMF, PN->getDebugLoc());
+    Update->insertBefore(LatchVPBB->getTerminator());
+  }
+  if (CreatedRecipes)
+    CreatedRecipes->insert(Update);
+  return Update;
+}
+
 VPWidenRecipe *VPRecipeBuilder::tryToWiden(Instruction *I,
                                            ArrayRef<VPValue *> Operands,
                                            VPBasicBlock *VPBB, VPlanPtr &Plan) {
@@ -8324,10 +8360,15 @@ VPWidenRecipe *VPRecipeBuilder::tryToWiden(Instruction *I,
   };
 }
 
-void VPRecipeBuilder::fixHeaderPhis() {
+void VPRecipeBuilder::fixHeaderPhis(VPlan &Plan) {
   BasicBlock *OrigLatch = OrigLoop->getLoopLatch();
   for (VPHeaderPHIRecipe *R : PhisToFix) {
-    auto *PN = cast<PHINode>(R->getUnderlyingValue());
+    if (auto *VPWIFR = dyn_cast<VPWidenIntOrFpInductionRecipe>(R)) {
+      VPWIFR->addOperand(
+          createWidenStep(*VPWIFR, *PSE.getSE(), Plan)->getVPSingleValue());
+      continue;
+    }
+    PHINode *PN = cast<PHINode>(R->getUnderlyingValue());
     VPRecipeBase *IncR =
         getRecipe(cast<Instruction>(PN->getIncomingValueForBlock(OrigLatch)));
     R->addOperand(IncR->getVPSingleValue());
@@ -8405,8 +8446,12 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(
     // can have earlier phis as incoming values.
     recordRecipeOf(Phi);
 
-    if ((Recipe = tryToOptimizeInductionPHI(Phi, Operands, *Plan, Range)))
+    if ((Recipe = tryToOptimizeInductionPHI(Phi, Operands, *Plan, Range))) {
+      if (isa<VPWidenPointerInductionRecipe>(Recipe))
+        return Recipe;
+      PhisToFix.push_back(cast<VPWidenIntOrFpInductionRecipe>(Recipe));
       return Recipe;
+    }
 
     VPHeaderPHIRecipe *PhiRecipe = nullptr;
     assert((Legal->isReductionVariable(Phi) ||
@@ -8441,10 +8486,17 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(
     return PhiRecipe;
   }
 
-  if (isa<TruncInst>(Instr) &&
-      (Recipe = tryToOptimizeInductionTruncate(cast<TruncInst>(Instr), Operands,
-                                               Range, *Plan)))
-    return Recipe;
+  if (isa<TruncInst>(Instr)) {
+    auto IsOptimizableIVTruncate =
+        [&](Instruction *K) -> std::function<bool(ElementCount)> {
+      return [=](ElementCount VF) -> bool {
+        return CM.isOptimizableIVTruncate(K, VF);
+      };
+    };
+
+    LoopVectorizationPlanner::getDecisionAndClampRange(
+        IsOptimizableIVTruncate(Instr), Range);
+  }
 
   // All widen recipes below deal only with VF > 1.
   if (LoopVectorizationPlanner::getDecisionAndClampRange(
@@ -8707,7 +8759,7 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
          !Plan->getVectorLoopRegion()->getEntryBasicBlock()->empty() &&
          "entry block must be set to a VPRegionBlock having a non-empty entry "
          "VPBasicBlock");
-  RecipeBuilder.fixHeaderPhis();
+  RecipeBuilder.fixHeaderPhis(*Plan);
 
   // ---------------------------------------------------------------------------
   // Transform initial VPlan: Apply previously taken decisions, in order, to
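
The opcode choice in `createCast` above can be read as a small decision table. The following sketch restates that table in standalone C++ so it can be checked in isolation; `ScalarTy`, `CastOp` and `pickCastOpcode` are hypothetical stand-ins for `llvm::Type`, `Instruction::CastOps` and the helper itself, not names from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Type: integer-or-FP plus a bit width.
struct ScalarTy {
  bool IsInteger;
  unsigned Bits;
  bool operator==(const ScalarTy &O) const {
    return IsInteger == O.IsInteger && Bits == O.Bits;
  }
};

// Hypothetical stand-in for Instruction::CastOps; None models the nullptr
// that createCast returns when no cast is required.
enum class CastOp { None, Trunc, ZExt, FPToUI, UIToFP };

// Mirrors the branch structure of createCast: no cast for identical types,
// trunc/zext between integers, fptoui into an integer, uitofp otherwise.
CastOp pickCastOpcode(ScalarTy From, ScalarTy To) {
  assert(From.Bits > 0 && To.Bits > 0 && "types must be sized");
  if (From == To)
    return CastOp::None;
  if (To.IsInteger && From.IsInteger)
    return To.Bits < From.Bits ? CastOp::Trunc : CastOp::ZExt;
  if (To.IsInteger)
    return CastOp::FPToUI;
  return CastOp::UIToFP;
}
```

For example, an i64 source with an i32 destination selects Trunc, and an i64 source with a float destination selects UIToFP. The sketch follows the three-way branch as written in the patch, so every non-integer destination takes the last branch.
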
diff --git a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
index b1498026adadfe..126a6b1c061265 100644
--- a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+++ b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -146,6 +146,18 @@ class VPRecipeBuilder {
   /// between SRC and DST.
   VPValue *getEdgeMask(BasicBlock *Src, BasicBlock *Dst) const;
 
+  /// A helper function to create VPWidenCastRecipe of a \p V VPValue to a \p To
+  /// type.
+  /// FIXME: Remove \p From argument and take it from a \p V value
+  static VPWidenCastRecipe *createCast(VPValue *V, Type *From, Type *To);
+
+  /// A helper function which widens \p WIV step, multiplies it by WidenVFxUF
+  /// and attaches to loop latch of the \p Plan. Returns the induction update.
+  static VPRecipeBase *
+  createWidenStep(VPWidenIntOrFpInductionRecipe &WIV, ScalarEvolution &SE,
+                  VPlan &Plan,
+                  DenseSet<VPRecipeBase *> *CreatedRecipes = nullptr);
+
   /// Mark given ingredient for recording its recipe once one is created for
   /// it.
   void recordRecipeOf(Instruction *I) {
@@ -171,7 +183,7 @@ class VPRecipeBuilder {
 
   /// Add the incoming values from the backedge to reduction & first-order
   /// recurrence cross-iteration phis.
-  void fixHeaderPhis();
+  void fixHeaderPhis(VPlan &Plan);
 };
 } // end namespace llvm
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 2c0daa82afa59f..96732b77a9db3d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -76,12 +76,25 @@ Value *VPLane::getAsRuntimeExpr(IRBuilderBase &Builder,
   llvm_unreachable("Unknown lane kind");
 }
 
-VPValue::VPValue(const unsigned char SC, Value *UV, VPDef *Def)
-    : SubclassID(SC), UnderlyingVal(UV), Def(Def) {
+VPValue::VPValue(const unsigned char SC, Value *UV, VPDef *Def, Type *Ty)
+    : SubclassID(SC), UnderlyingVal(UV), UnderlyingTy(Ty), Def(Def) {
+  if (UnderlyingTy)
+    assert((!UnderlyingVal || UnderlyingVal->getType() == UnderlyingTy) &&
+           "VPValue with set type should either be created without underlying "
+           "value or type should match the given type");
   if (Def)
     Def->addDefinedValue(this);
 }
 
+Type *VPValue::getElementType() {
+  return const_cast<Type *>(
+      const_cast<const VPValue *>(this)->getElementType());
+}
+
+const Type *VPValue::getElementType() const {
+  return UnderlyingVal ? UnderlyingVal->getType() : UnderlyingTy;
+}
+
 VPValue::~VPValue() {
   assert(Users.empty() && "trying to delete a VPValue with remaining users");
   if (Def)
@@ -763,6 +776,10 @@ VPlanPtr VPlan::createInitialVPlan(const SCEV *TripCount, ScalarEvolution &SE) {
   auto Plan = std::make_unique<VPlan>(Preheader, VecPreheader);
   Plan->TripCount =
       vputils::getOrCreateVPValueForSCEVExpr(*Plan, TripCount, SE);
+  Type *TCType = TripCount->getType();
+  Plan->getVectorTripCount().setElementType(TCType);
+  Plan->getVFxUF().setElementType(TCType);
+  Plan->getWidenVFxUF().setElementType(TCType);
   // Create empty VPRegionBlock, to be filled during processing later.
   auto *TopRegion = new VPRegionBlock("vector loop", false /*isReplicator*/);
   VPBlockUtils::insertBlockAfter(TopRegion, VecPreheader);
@@ -796,6 +813,18 @@ void VPlan::prepareToExecute(Value *TripCountV, Value *VectorTripCountV,
             createStepForVF(Builder, TripCountV->getType(), State.VF, State.UF),
             0);
 
+  if (WidenVFxUF.getNumUsers() > 0)
+    for (unsigned Part = 0, UF = State.UF; Part < UF; ++Part) {
+      Value *Step =
+          createStepForVF(Builder, TripCountV->getType(), State.VF, Part+1);
+      if (State.VF.isScalar())
+        State.set(&WidenVFxUF, Step, Part);
+      else
+        State.set(&WidenVFxUF,
+                  Builder.CreateVectorSplat(State.VF, Step, "widen.vfxuf"),
+                  Part);
+    }
+
   // When vectorizing the epilogue loop, the canonical induction start value
   // needs to be changed from zero to the value after the main vector loop.
   // FIXME: Improve modeling for canonical IV start values in the epilogue loop.
@@ -845,21 +874,16 @@ void VPlan::execute(VPTransformState *State) {
     if (isa<VPWidenPHIRecipe>(&R))
       continue;
 
-    if (isa<VPWidenPointerInductionRecipe>(&R) ||
-        isa<VPWidenIntOrFpInductionRecipe>(&R)) {
+    if (isa<VPWidenPointerInductionRecipe>(&R)) {
       PHINode *Phi = nullptr;
-      if (isa<VPWidenIntOrFpInductionRecipe>(&R)) {
-        Phi = cast<PHINode>(State->get(R.getVPSingleValue(), 0));
-      } else {
-        auto *WidenPhi = cast<VPWidenPointerInductionRecipe>(&R);
-        // TODO: Split off the case that all users of a pointer phi are scalar
-        // from the VPWidenPointerInductionRecipe.
-        if (WidenPhi->onlyScalarsGenerated(State->VF.isScalable()))
-          continue;
-
-        auto *GEP = cast<GetElementPtrInst>(State->get(WidenPhi, 0));
-        Phi = cast<PHINode>(GEP->getPointerOperand());
-      }
+      auto *WidenPhi = cast<VPWidenPointerInductionRecipe>(&R);
+      // TODO: Split off the case that all users of a pointer phi are scalar
+      // from the VPWidenPointerInductionRecipe.
+      if (WidenPhi->onlyScalarsGenerated(State->VF.isScalable()))
+        continue;
+
+      auto *GEP = cast<GetElementPtrInst>(State->get(WidenPhi, 0));
+      Phi = cast<PHINode>(GEP->getPointerOperand());
 
       Phi->setIncomingBlock(1, VectorLatchBB);
 
@@ -877,6 +901,7 @@ void VPlan::execute(VPTransformState *State) {
     // generated.
     bool SinglePartNeeded = isa<VPCanonicalIVPHIRecipe>(PhiR) ||
                             isa<VPFirstOrderRecurrencePHIRecipe>(PhiR) ||
+                            isa<VPWidenIntOrFpInductionRecipe>(PhiR) ||
                             (isa<VPReductionPHIRecipe>(PhiR) &&
                              cast<VPReductionPHIRecipe>(PhiR)->isOrdered());
     unsigned LastPartForNewPhi = SinglePartNeeded ? 1 : State->UF;
@@ -908,6 +933,12 @@ void VPlan::printLiveIns(raw_ostream &O) const {
     O << " = VF * UF";
   }
 
+  if (WidenVFxUF.getNumUsers() > 0) {
+    O << "\nLive-in ";
+    WidenVFxUF.printAsOperand(O, SlotTracker);
+    O << " = WIDEN VF * UF";
+  }
+
   if (VectorTripCount.getNumUsers() > 0) {
     O << "\nLive-in ";
     VectorTripCount.printAsOperand(O, SlotTracker);
@@ -1083,6 +1114,11 @@ VPlan *VPlan::duplicate() {
   }
   Old2NewVPValues[&VectorTripCount] = &NewPlan->VectorTripCount;
   Old2NewVPValues[&VFxUF] = &NewPlan->VFxUF;
+  Old2NewVPValues[&WidenVFxUF] = &NewPlan->WidenVFxUF;
+  NewPlan->getVectorTripCount().setElementType(
+      getVectorTripCount().getElementType());
+  NewPlan->getVFxUF().setElementType(getVFxUF().getElementType());
+  NewPlan->getWidenVFxUF().setElementType(getWidenVFxUF().getElementType());
   if (BackedgeTakenCount) {
     NewPlan->BackedgeTakenCount = new VPValue();
     Old2NewVPValues[BackedgeTakenCount] = NewPlan->BackedgeTakenCount;
@@ -1379,6 +1415,8 @@ void VPSlotTracker::assignSlot(const VPValue *V) {
 void VPSlotTracker::assignSlots(const VPlan &Plan) {
   if (Plan.VFxUF.getNumUsers() > 0)
     assignSlot(&Plan.VFxUF);
+  if (Plan.WidenVFxUF.getNumUsers() > 0)
+    assignSlot(&Plan.WidenVFxUF);
   assignSlot(&Plan.VectorTripCount);
   if (Plan.BackedgeTakenCount)
     assignSlot(Plan.BackedgeTakenCount);
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 13e1859ad6b250..306c2200ca34c9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1618,38 +1618,65 @@ class VPHeaderPHIRecipe : public VPSingleDefRecipe {
   }
 };
 
-/// A recipe for handling phi nodes of integer and floating-point inductions,
-/// producing their vector values.
-class VPWidenIntOrFpInductionRecipe : public VPHeaderPHIRecipe {
-  PHINode *IV;
-  TruncInst *Trunc;
+/// A base class for all widen induction-like recipes
+class VPWidenInductionBasePHIRecipe : public VPHeaderPHIRecipe {
+protected:
   const InductionDescriptor &IndDesc;
 
 public:
-  VPWidenIntOrFpInductionRecipe(PHINode *IV, VPValue *Start, VPValue *Step,
+  VPWidenInductionBasePHIRecipe(unsigned char VPDefID, Instruction *Instr,
+                                VPValue *Start, VPValue *Step,
                                 const InductionDescriptor &IndDesc)
-      : VPHeaderPHIRecipe(VPDef::VPWidenIntOrFpInductionSC, IV, Start), IV(IV),
-        Trunc(nullptr), IndDesc(IndDesc) {
+      : VPHeaderPHIRecipe(VPDefID, Instr, Start), IndDesc(IndDesc) {
     addOperand(Step);
   }
 
+  ~VPWidenInductionBasePHIRecipe() override = default;
+
+  /// Returns the step value of the induction.
+  VPValue *getStepValue() { return getOperand(1); }
+  const VPValue *getStepValue() const { return getOperand(1); }
+
+  /// Returns the induction descriptor for the recipe.
+  const InductionDescriptor &getInductionDescriptor() const { return IndDesc; }
+};
+
+/// A recipe for handling phi nodes of integer and floating-point inductions,
+/// producing their vector values.
+class VPWidenIntOrFpInductionRecipe : public VPWidenInductionBasePHIRecipe {
+  PHINode *IV = nullptr;
+  TruncInst *Trunc = nullptr;
+
+public:
+  VPWidenIntOrFpInductionRecipe(PHINode *IV, VPValue *Start, VPValue *Step,
+                                const InductionDescriptor &IndDesc)
+      : VPWidenInductionBasePHIRecipe(VPDef::VPWidenIntOrFpInductionSC, IV,
+                                      Start, Step, IndDesc),
+        IV(IV), Trunc(nullptr) {}
+
   VPWidenIntOrFpInductionRecipe(PHINode *IV, VPValue *Start, VPValue *Step,
                                 const InductionDescriptor &IndDesc,
                                 TruncInst *Trunc)
-      : VPHeaderPHIRecipe(VPDef::VPWidenIntOrFpInductionSC, Trunc, Start),
-        IV(IV), Trunc(Trunc), IndDesc(IndDesc) {
-    addOperand(Step);
-  }
+      : VPWidenInductionBasePHIRecipe(VPDef::VPWidenIntOrFpInductionSC, Trunc,
+                                      Start, Step, IndDesc),
+        IV(IV), Trunc(Trunc) {}
 
   ~VPWidenIntOrFpInductionRecipe() override = default;
 
   VPRecipeBase *clone() override {
-    return new VPWidenIntOrFpInductionRecipe(IV, getStartValue(),
-                                             getStepValue(), IndDesc, Trunc);
+    VPRecipeBase *Cloned = new VPWidenIntOrFpInductionRecipe(
+        getPHINode(), getStartValue(), getStepValue(), IndDesc, Trunc);
+    if (getNumOperands() == 3)
+      Cloned->addOperand(getOperand(2));
+    return Cloned;
   }
 
   VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionSC)
 
+  static inline bool classof(const VPHeaderPHIRecipe *R) {
+    return R->getVPDefID() == VPDef::VPWidenIntOrFpInductionSC;
+  }
+
   /// Generate the vectorized and scalarized versions of the phi node as
   /// needed by their users.
   void execute(VPTransformState &State) override;
@@ -1660,33 +1687,24 @@ class VPWidenIntOrFpInductionRecipe : public VPHeaderPHIRecipe {
              VPSlotTracker &SlotTracker) const override;
 #endif
 
-  VPValue *getBackedgeValue() override {
-    // TODO: All operands of base recipe must exist and be at same index in
-    // derived recipe.
-    llvm_unreachable(
-        "VPWidenIntOrFpInductionRecipe generates its own backedge value");
+  VPValue *getBackedgeValue() override final {
+    if (getNumOperands() != 3)
+      llvm_unreachable(
+          "VPWidenIntOrFpInductionRecipe::getBackedgeValue is not yet valid");
+    return getOperand(2);
   }
 
-  VPRecipeBase &getBackedgeRecipe() override {
-    // TODO: All operands of base recipe must exist and be at same index in
-    // derived recipe.
-    llvm_unreachable(
-        "VPWidenIntOrFpInductionRecipe generates its own backedge value");
+  VPRecipeBase &getBackedgeRecipe() override final {
+    return *getBackedgeValue()->getDefiningRecipe();
   }
 
-  /// Returns the step value of the induction.
-  VPValue *getStepValue() { return getOperand(1); }
-  const VPValue *getStepValue() const { return getOperand(1); }
-
   /// Returns the first defined value as TruncInst, if it is one or nullptr
   /// otherwise.
   TruncInst *getTruncInst() { return Trunc; }
   const TruncInst *getTruncInst() const { retu...
[truncated]
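
The decomposition in the PR summary (a header phi plus explicit `mul` and `add` update recipes) can be pictured with a small scalar simulation. This is only a sketch under stated assumptions: plain integers stand in for VPValues, `WidenVFxUF` is assumed to be a splat of `VF * UF`, only the single phi for part 0 is modeled, and every function name below is hypothetical rather than part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: plain integers stand in for VPValues, and none of these
// function names exist in LLVM.
using Vec = std::vector<int64_t>;

// Start value of the widened phi: <Start, Start+Step, ..., Start+(VF-1)*Step>.
Vec initialInduction(int64_t Start, int64_t Step, unsigned VF) {
  assert(VF > 0 && "vector factor must be positive");
  Vec V(VF);
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    V[Lane] = Start + static_cast<int64_t>(Lane) * Step;
  return V;
}

// Models "mul ir<%step>, vp<WidenVFxUF>", assuming WidenVFxUF is a splat of
// VF * UF (an assumption made for this sketch).
Vec widenStep(int64_t Step, unsigned VF, unsigned UF) {
  return Vec(VF, Step * static_cast<int64_t>(VF) * static_cast<int64_t>(UF));
}

// Models "add vp<%iv>, vp<%widen-step>", the value fed back on the backedge.
Vec backedgeValue(const Vec &IV, const Vec &WStep) {
  assert(IV.size() == WStep.size() && "operands must have the same width");
  Vec Next(IV.size());
  for (std::size_t I = 0; I < IV.size(); ++I)
    Next[I] = IV[I] + WStep[I];
  return Next;
}

// Value held by the phi at the start of vector iteration Iter (0-based).
Vec inductionAt(int64_t Start, int64_t Step, unsigned VF, unsigned UF,
                unsigned Iter) {
  Vec IV = initialInduction(Start, Step, VF);
  const Vec WStep = widenStep(Step, VF, UF);
  for (unsigned I = 0; I < Iter; ++I)
    IV = backedgeValue(IV, WStep);
  return IV;
}
```

With `Start = 0`, `Step = 3`, `VF = 4`, `UF = 2`, the modeled phi holds `<0, 3, 6, 9>` in the first vector iteration and `<24, 27, 30, 33>` in the next. Because the update is an ordinary `add` in this model, nothing has to be generated by the phi itself.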

github-actions · 2024-02-16T18:35:49Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvm/lib/Transforms/Vectorize/VPlan.h

llvm/test/Transforms/LoopVectorize/first-order-recurrence-chains-vplan.ll

llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll

Mel-Chen · 2024-02-22T09:05:58Z

llvm/test/Transforms/LoopVectorize/icmp-uniforms.ll

@@ -48,7 +49,7 @@ for.end:
 ; CHECK-NEXT: <x1> vector loop: {
 ; CHECK-NEXT: vector.body:
 ; CHECK-NEXT:   EMIT vp<[[CAN_IV:%.+]]> = CANONICAL-INDUCTION
-; CHECK-NEXT:   WIDEN-INDUCTION %iv = phi 0, %iv.next, ir<1>
+; CHECK-NEXT:   WIDEN-INDUCTION ir<%iv> = phi ir<0>, vp<[[NEXT_WIV:%.+]]>, ir<1>


What is the meaning of ir<1>?
Is it still necessary now that this patch decomposes WidenIntOrFPInduction?

nikolaypanchenko · 2024-02-29

In the dump, ir<> denotes a VPValue that has an underlying LLVM IR value.
Before this changeset, the WidenIntOrFPInduction::print method printed the LLVM IR PHINode. I've changed it to print the VPValues used by WidenIntOrFPInduction.
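
The printing convention described in this reply can be illustrated with a toy model. This is a sketch, not the LLVM implementation: `ToyVPValue`, `fromIR`, `vplanOnly`, and both print helpers are hypothetical names. The operands are printed in the order given, and which operand plays which role is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a VPValue; this is not the LLVM class.
struct ToyVPValue {
  std::optional<std::string> IRName; // set when an underlying IR value exists
  std::string SlotName;              // used when the value exists only in VPlan
};

// Hypothetical helpers that build the two kinds of value.
ToyVPValue fromIR(std::string Name) { return {std::move(Name), ""}; }
ToyVPValue vplanOnly(std::string Slot) {
  return {std::nullopt, std::move(Slot)};
}

// Values backed by LLVM IR print as ir<...>; VPlan-only values as vp<...>.
std::string printAsOperand(const ToyVPValue &V) {
  if (V.IRName)
    return "ir<" + *V.IRName + ">";
  return "vp<" + V.SlotName + ">";
}

// Prints the recipe from its VPValue operands rather than from an IR PHINode.
std::string printWidenInduction(const ToyVPValue &Def,
                                const std::vector<ToyVPValue> &Operands) {
  std::string Out = "WIDEN-INDUCTION " + printAsOperand(Def) + " = phi ";
  for (std::size_t I = 0; I < Operands.size(); ++I) {
    if (I != 0)
      Out += ", ";
    Out += printAsOperand(Operands[I]);
  }
  return Out;
}
```

Passing a definition built with `fromIR("%iv")` and the operands `fromIR("0")`, `vplanOnly("%next")`, `fromIR("1")` yields a line of the same shape as the updated CHECK line quoted above.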

nikolaypanchenko · 2024-03-14T15:32:50Z

@fhahn ping

nikolaypanchenko · 2024-04-11T21:57:41Z

@fhahn ping

nikolaypanchenko · 2024-05-31T12:49:29Z

@fhahn ping

nikolaypanchenko · 2024-07-09T23:34:26Z

@fhahn ping

nikolaypanchenko · 2024-08-23T18:07:46Z

@fhahn ping

…ataWithEVL vectorization mode. As an alternative approach to llvm#82021, this patch lowers VPWidenIntOrFpInductionRecipe into a widen phi recipe and step recipes, computed using EVL in the EVL transformation phase.

llvmbot added the backend:RISC-V, vectorizers, llvm:analysis, and llvm:transforms labels Feb 16, 2024

npanchen requested review from fhahn, arcbbb, alexey-bataev and Mel-Chen February 16, 2024 18:33

arcbbb reviewed Feb 17, 2024

llvm/lib/Transforms/Vectorize/VPlan.h

Mel-Chen reviewed Feb 22, 2024

fhahn mentioned this pull request Feb 25, 2024

[VPlan] Delay adding canonical IV increment. #82270

Open

nikolaypanchenko force-pushed the kolyap/vplan_split_wif branch from 372daf0 to f3d58bf Compare February 29, 2024 19:52

nikolaypanchenko force-pushed the kolyap/vplan_split_wif branch from f3d58bf to 3dc7746 Compare March 12, 2024 22:26

nikolaypanchenko added 3 commits March 13, 2024 12:19

format + addressed comments

ef2dade

Rebase

484e061

nikolaypanchenko force-pushed the kolyap/vplan_split_wif branch from 3dc7746 to 484e061 Compare March 13, 2024 20:53

fhahn requested a review from ayalz June 4, 2024 11:56

nikolaypanchenko mentioned this pull request Aug 21, 2024

[LV] Support binary and unary operations with EVL-vectorization #93854

Merged

arcbbb mentioned this pull request Nov 7, 2024

[VPlan] Add support for VPWidenIntOrFpInductionRecipe in predicated D… #115274

Open

lukel97 mentioned this pull request Dec 4, 2024

[VPlan] Expand VPWidenIntOrFpInductionRecipe into separate recipes #118638

Merged
