JIT: Loop hoisting inhibited by phase-ordering issue #6554

@JosephTremoulet

Consider this code:

using System;
using System.Diagnostics;

class ArrayPerf
{

    private static int[] m_arr;
    private const int MAX_ARRAY_SIZE = 4096;

    private static readonly int DIM_1 = MAX_ARRAY_SIZE;

    static void Main(string[] args)
    {
        long iterations = Int64.Parse(args[0]);

        int value;
        m_arr = new int[DIM_1];

        for (int i = 0; i < DIM_1; i++)
            m_arr[i] = i;

        for (int j = 0; j < DIM_1; j++)
            value = m_arr[j];

        //var sw = Stopwatch.StartNew();

        for (long i = 0; i < iterations; i++)
        {
            for (int j = 0; j < DIM_1; j++)
            {
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
                value = m_arr[j];
            }
        }

        //sw.Stop();
        //Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

Even with fixes in place for #6552 and #6553, we're unable to hoist the invariant array-length loads (inserted for the bounds checks on all those m_arr[j] accesses) out of the loops, because the m_arr[j] expressions look like this when optHoistLoopCode runs:

N027 (  6,  7) [000185] a--XG-------                |     /--*  indir     int    <l:$422, c:$787>
N025 (  1,  1) [000440] ------------                |     |  |  /--*  const     long   16 Fseq[#FirstElem] $c1
N026 (  5,  6) [000441] -------N----                |     |  \--*  +         byref  $2c3
N022 (  1,  1) [000437] -------N----                |     |     |     /--*  const     long   2 $c3
N023 (  3,  4) [000438] -------N----                |     |     |  /--*  <<        long   $3c9
N021 (  2,  3) [000436] ------------                |     |     |  |  \--*  cast      long <- int $3c8
N020 (  1,  1) [000433] i-----------                |     |     |  |     \--*  lclVar    int    V05 loc4         u:4 $483
N024 (  4,  5) [000439] -------N----                |     |     \--*  +         byref  <l:$24a, c:$251>
N019 (  1,  1) [000432] ------------                |     |        \--*  lclVar    ref    V12 tmp6         u:3 (last use) <l:$441, c:$7c6>
N028 ( 14, 18) [000449] ---XG-------                |  /--*  comma     void   <l:$424, c:$42b>
N018 (  8, 11) [000435] ---X--------                |  |  \--*  arrBndsChk_Rng void   <l:$235, c:$845>
N016 (  3,  3) [000434] ---X--------                |  |     +--*  arrLen    int    <l:$1c2, c:$1c7>
N015 (  1,  1) [000431] ------------                |  |     |  \--*  lclVar    ref    V12 tmp6         u:3 <l:$441, c:$7c6>
N017 (  1,  1) [000184] ------------                |  |     \--*  lclVar    int    V05 loc4         u:4 $483
N029 ( 37, 48) [000450] -ACXG-------                \--*  comma     void   <l:$426, c:$42c>
N011 (  5, 12) [000176] ----G-------                   |     /--*  indir     ref    <l:$441, c:$7c6>
N010 (  3, 10) [000448] ------------                   |     |  \--*  const(h)  long   0x221398b27f0 static Fseq[m_arr] $283
N012 ( 23, 30) [000183] --CXG-------                   |  /--*  comma     ref    <l:$214, c:$842>
N009 ( 18, 18) [000182] H-CXG-------                   |  |  \--*  call help long   HELPER.CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE $3c1
N005 (  3, 10) [000178] ------------ arg0 in rcx       |  |     +--*  const     long   0x7ff9b3b150e8 $c2
N006 (  1,  1) [000179] ------------ arg1 in rdx       |  |     \--*  const     int    1 $42
N014 ( 23, 30) [000430] -ACXG---R---                   \--*  =         ref    <l:$214, c:$842>
N013 (  1,  1) [000429] D------N----                      \--*  lclVar    ref    V12 tmp6         d:3 <l:$441, c:$7c6>

Note that the static field m_arr is loaded and stored to a local on the left side of a comma, and that local is then referenced in the arrLen and load expressions. Even though the arrLen expression is loop-invariant (and recognized as such with the fixes for #6552 and #6553 in place), we can't hoist it because optTreeIsValidAtLoopHead fails for it: the use of lclVar ref V12 tmp6 (the temp that holds the pointer to the array) can't be hoisted above its definition without a rewrite. The rewrite we'd need does actually happen during CSE (as a result of the RHS of the assignment that defines the temp being hoisted out of the loop), but since CSE runs after optHoistLoopCode, we're at something of an impasse.
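To make the ordering dependency concrete, here is a rough source-level analogy of the two shapes involved. It is only a sketch: the class, method, and local names are hypothetical, the explicit range check stands in for the JIT-inserted arrBndsChk, and it is not a claim about what the JIT emits.

```csharp
using System;

static class HoistSketch
{
    private static int[] s_arr = { 1, 2, 3, 4 };

    // Shape at hoisting time: the static is loaded into a temp inside the
    // loop, and the length load reads that temp.
    public static int SumInLoop(int n)
    {
        int sum = 0;
        for (int j = 0; j < n; j++)
        {
            int[] tmp = s_arr;      // definition of the temp (V12 tmp6 in the dump)
            int len = tmp.Length;   // arrLen: loop-invariant, but it uses tmp
            if ((uint)j >= (uint)len) throw new IndexOutOfRangeException();
            sum += tmp[j];          // C# adds its own check here; ignored for the analogy
        }
        return sum;
    }

    // Desired shape: moving len above the loop is only valid once the
    // definition of tmp has moved above the loop as well.
    public static int SumHoisted(int n)
    {
        if (n <= 0) return 0;       // only touch s_arr if the loop would run at all
        int[] tmp = s_arr;
        int len = tmp.Length;
        int sum = 0;
        for (int j = 0; j < n; j++)
        {
            if ((uint)j >= (uint)len) throw new IndexOutOfRangeException();
            sum += tmp[j];
        }
        return sum;
    }
}
```

Hoisting `len` alone out of `SumInLoop` would leave a use of `tmp` above its definition, which is the situation optTreeIsValidAtLoopHead rejects.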

To my mind, this sort of issue is an argument for a rewrite-as-you-go, SSA-based implementation of CSE and LICM (which would require recording heap dependencies in SSA, but ought to also let us drop the liberal/conservative value-number bifurcation).
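As a minimal sketch of what "rewrite-as-you-go" could mean, the toy pass below hoists an instruction as soon as all of its operands are available outside the loop, and immediately treats the hoisted definition as available to later candidates. Everything here is an assumption for illustration: `Instr`, `ToyLicm`, and the operand names are hypothetical and are not JIT data structures, all toy operations are assumed to be side-effect free, and the heap state is modeled as an ordinary SSA name (`heap0`) defined before the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy SSA instruction: exactly one definition and a list of used names.
sealed record Instr(string Def, string Op, string[] Uses);

static class ToyLicm
{
    // Returns the definitions moved to the preheader, in hoist order.
    public static List<string> Hoist(IReadOnlyList<Instr> loopBody, IEnumerable<string> definedOutside)
    {
        var available = new HashSet<string>(definedOutside);
        var hoisted = new List<string>();
        bool changed = true;
        while (changed)                      // iterate until nothing more can move
        {
            changed = false;
            foreach (var ins in loopBody)
            {
                if (available.Contains(ins.Def)) continue;   // already hoisted
                if (ins.Uses.All(u => available.Contains(u)))
                {
                    available.Add(ins.Def);  // the "rewrite": later uses now see it
                    hoisted.Add(ins.Def);
                    changed = true;
                }
            }
        }
        return hoisted;
    }

    // Loop body modeled loosely on the dump above.
    public static List<string> Example()
    {
        var body = new[]
        {
            new Instr("j",     "phi",        new[] { "j0", "jNext" }),
            new Instr("tmp",   "loadStatic", new[] { "heap0" }),
            new Instr("len",   "arrLen",     new[] { "tmp" }),
            new Instr("chk",   "boundsChk",  new[] { "len", "j" }),
            new Instr("val",   "loadElem",   new[] { "tmp", "j" }),
            new Instr("jNext", "add",        new[] { "j" }),
        };
        return Hoist(body, new[] { "j0", "heap0" });
    }
}
```

In this toy, `tmp` becomes available the moment it is hoisted, so `len` can follow in the same pass, while `j`, `jNext`, and everything that depends on them stay in the loop because of the phi cycle.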

category:cq
theme:loop-opt
skill-level:expert
cost:medium

Labels: area-CodeGen-coreclr, enhancement, optimization, tenet-performance
