Skip to content

Conversation

jakobbotsch
Copy link
Member

  • Stop considering "successor EH successors" to be EH successors of each block. Instead, add this special case where it is actually needed: when computing the dominator tree.
  • Update PHI arg insertion to insert the handler phi args with the 'try' entry block as the pred instead of the pred of the 'try'
  • Optimize computation of the dominator tree to avoid multiple enumerations of the preds
  • Optimize computation of the dominator tree to utilize computed postorder indices for "visited" checks, instead of a block set

A few minor diffs are expected in some cases due to building a different block visit order in VN.

* Stop considering "successor EH successors" to be EH successors of each
  block. Instead, add this special case where it is actually needed:
  when computing the dominator tree.
* Update PHI arg insertion to insert the handler phi args with the 'try'
  entry block as the pred instead of the pred of the 'try'
* Optimize computation of the dominator tree to avoid multiple
  enumerations of the preds
* Optimize computation of the dominator tree to utilize computed
  postorder indices for "visited" checks, instead of a block set

A few minor diffs are expected in some cases due to building a different
block visit order in VN.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 13, 2023
@ghost ghost assigned jakobbotsch Nov 13, 2023
@ghost
Copy link

ghost commented Nov 13, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Stop considering "successor EH successors" to be EH successors of each block. Instead, add this special case where it is actually needed: when computing the dominator tree.
  • Update PHI arg insertion to insert the handler phi args with the 'try' entry block as the pred instead of the pred of the 'try'
  • Optimize computation of the dominator tree to avoid multiple enumerations of the preds
  • Optimize computation of the dominator tree to utilize computed postorder indices for "visited" checks, instead of a block set

A few minor diffs are expected in some cases due to building a different block visit order in VN.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@@ -1311,7 +1298,7 @@ void SsaBuilder::AddPhiArgsToSuccessors(BasicBlock* block)
GenTreePhi* phi = tree->AsLclVar()->Data()->AsPhi();
unsigned ssaNum = m_renameStack.Top(lclNum);

AddPhiArg(handlerStart, stmt, phi, lclNum, ssaNum, block);
AddPhiArg(handlerStart, stmt, phi, lclNum, ssaNum, succ);
Copy link
Member Author

@jakobbotsch jakobbotsch Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seemed a bit questionable before -- e.g. for

x = 123;
try
{
  M();
  x = 456;
  M();
}
catch
{
}

we would have no phi arg in the handler indicating that the value of x could be 123 even if control came from the try block (note that we did have the value, but only from the pred outside the try).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this changes the catch from something like $x_2 = \phi ([x_0, pretry], [x_1, try])$ to $x_2 = \phi ([x_0, pretry], [x_0, try], [x_1, try])$?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have $pretry$ in the list anymore, so rather to $x_2 = \phi ([x_0, try], [x_1, try])$

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, Fuzzlyn

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Nov 13, 2023

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs. Minor codegen diffs from the removal of successors leading to a new order built by VN (some of these will probably go away with #94623), otherwise just throughput improvements.

The Fuzzlyn failures are #94660 and #94680. The linux-arm32 failure is known.

@jakobbotsch
Copy link
Member Author

Merged up and pushed, but it passed jitstress/libraries-jitstress/Fuzzlyn on the previous commit. I'm expecting fewer (or no) diffs now with #94623 included, will double check why they are there tomorrow if there still are some.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice improvement. Great to see us leveraging reverse postorder information like this.

{
DBG_SSA_JITDUMP("Pred block is " FMT_BB ".\n", predBlock->bbNum);
}
if ((numIters <= 0) && (domPred->bbPostorderNum <= (unsigned)i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This <= 0 confused me for a second, maybe == 0 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not the first to be confused by my use of <= 0 for these kinds of comparisons, so maybe I should switch to == in the future. But let me avoid rerunning CI for it here.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Nov 14, 2023

Hmm, the diffs now look like something I need to fix in assertion prop. In this code:

[UnmanagedCallersOnly]
private static void _freeArray(IntPtr thisHandle, IntPtr* ppException, void* array)
{
var _this = GetThis(thisHandle);
try
{
_this.freeArray(array);
}
catch (Exception ex)
{
*ppException = _this.AllocException(ex);
}
}

we now seem to be able to prove that _this is non-null inside the catch, from the non-null assertion generated by the instance call inside the try. That's odd since that assertion shouldn't be live into the try. I'll take a closer look.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Nov 14, 2023

Seems like the data flow of assertion prop reasons that since the assertion is live out of all preds of the handler, we don't need to even consider the handler. But that's not how data flow for handlers works.

@jakobbotsch
Copy link
Member Author

The DataFlow class tries to be pretty general, but the logic it has for handlers does not really make much sense to me. It worked for assertion prop because global assertion prop never kills any assertions (besides at joins), but it would not work for general dataflow analyses -- for example, liveness could not be switched to it. A general dataflow analysis has to propagate facts that may change mid block to the handler, but DataFlow does not allow for something like that.

Assertion prop also uses DataFlow in a strange way -- we push the handler into the work list for every block in the enclosed range, but every time we do that assertion prop only propagates the facts from the "in" of the try region. This again only works because global assertion prop never kills any assertions.

CSE also uses DataFlow, but it does not support CSE'ing into handlers, so it seems like it could be optimized a bit. I'll try to revise how DataFlow works to capture more precisely what assertion prop/CSE need.

@jakobbotsch
Copy link
Member Author

The remaining diffs now seem to be from LSRA using VisitAllSuccs in one place to figure out where to insert "exposed uses". LSRA inserts these fake uses at the end of blocks for locals that are live only into successors we have already processed in the linear order. I think the idea is to make it explicit in the linear order of ref positions that there is a kind of use there (due to the backedge). If a local is live in an unvisited successor, then we know we are going to get a "natural" ref position when visiting that block, and don't need to insert the fake use.

Using successors to avoid inserting these seems like an approximation, in reality the precise thing would be to check if any of the remaining blocks in linear order has the local as live-in. So I don't think there is a correctness issue here.

@jakobbotsch jakobbotsch merged commit e0b1e4d into dotnet:main Nov 16, 2023
@jakobbotsch jakobbotsch deleted the stop-visiting-successor-eh-successors branch November 16, 2023 10:55
@github-actions github-actions bot locked and limited conversation to collaborators Dec 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants