Skip to content

[WebAssembly] Remove wasm-specific findWasmUnwindDestinations #130374

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 11, 2025

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Mar 8, 2025

Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g. invokes) can have multiple possible unwind destinations.

For example:

entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  ...
...

In this case, if an exception is not caught by catch.dispatch (and thus catch.start), it should next unwind to terminate. findUnwindDestination in ISel gathers the list of this unwind destinations traversing the unwind edges:

/// When an invoke or a cleanupret unwinds to the next EH pad, there are
/// many places it could ultimately go. In the IR, we have a single unwind
/// destination, but in the machine CFG, we enumerate all the possible blocks.
/// This function skips over imaginary basic blocks that hold catchswitch
/// instructions, and finds all the "real" machine
/// basic block destinations. As those destinations may not be successors of
/// EHPadBB, here we also calculate the edge probability to those destinations.
/// The passed-in Prob is the edge probability to EHPadBB.
static void findUnwindDestinations(
FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
BranchProbability Prob,
SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
&UnwindDests) {
EHPersonality Personality =
classifyEHPersonality(FuncInfo.Fn->getPersonalityFn());
bool IsMSVCCXX = Personality == EHPersonality::MSVC_CXX;
bool IsCoreCLR = Personality == EHPersonality::CoreCLR;
bool IsWasmCXX = Personality == EHPersonality::Wasm_CXX;
bool IsSEH = isAsynchronousEHPersonality(Personality);
if (IsWasmCXX) {
findWasmUnwindDestinations(FuncInfo, EHPadBB, Prob, UnwindDests);
assert(UnwindDests.size() <= 1 &&
"There should be at most one unwind destination for wasm");
return;
}
while (EHPadBB) {
BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
BasicBlock *NewEHPadBB = nullptr;
if (isa<LandingPadInst>(Pad)) {
// Stop on landingpads. They are not funclets.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
break;
} else if (isa<CleanupPadInst>(Pad)) {
// Stop on cleanup pads. Cleanups are always funclet entries for all known
// personalities.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
UnwindDests.back().first->setIsEHFuncletEntry();
break;
} else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
// Add the catchpad handlers to the possible destinations.
for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
// For MSVC++ and the CLR, catchblocks are funclets and need prologues.
if (IsMSVCCXX || IsCoreCLR)
UnwindDests.back().first->setIsEHFuncletEntry();
if (!IsSEH)
UnwindDests.back().first->setIsEHScopeEntry();
}
NewEHPadBB = CatchSwitch->getUnwindDest();
} else {
continue;
}
BranchProbabilityInfo *BPI = FuncInfo.BPI;
if (BPI && NewEHPadBB)
Prob *= BPI->getEdgeProbability(EHPadBB, NewEHPadBB);
EHPadBB = NewEHPadBB;
}
}
But we don't use that, and instead use our custom findWasmUnwindDestinations that only adds the first unwind destination, catch.start, to the successor list of entry, and not terminate:
// In wasm EH, even though a catchpad may not catch an exception if a tag does
// not match, it is OK to add only the first unwind destination catchpad to the
// successors, because there will be at least one invoke instruction within the
// catch scope that points to the next unwind destination, if one exists, so
// CFGSort cannot mess up with BB sorting order.
// (All catchpads with 'catch (type)' clauses have a 'llvm.rethrow' intrinsic
// call within them, and catchpads only consisting of 'catch (...)' have a
// '__cxa_end_catch' call within them, both of which generate invokes in case
// the next unwind destination exists, i.e., the next unwind destination is not
// the caller.)
//
// Having at most one EH pad successor is also simpler and helps later
// transformations.
//
// For example,
// current:
// invoke void @foo to ... unwind label %catch.dispatch
// catch.dispatch:
// %0 = catchswitch within ... [label %catch.start] unwind label %next
// catch.start:
// ...
// ... in this BB or some other child BB dominated by this BB there will be an
// invoke that points to 'next' BB as an unwind destination
//
// next: ; We don't need to add this to 'current' BB's successor
// ...
static void findWasmUnwindDestinations(
FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
BranchProbability Prob,
SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
&UnwindDests) {
while (EHPadBB) {
BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
if (isa<CleanupPadInst>(Pad)) {
// Stop on cleanup pads.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
break;
} else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
// Add the catchpad handlers to the possible destinations. We don't
// continue to the unwind destination of the catchswitch for wasm.
for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
}
break;
} else {
continue;
}
}
}

The reason behind it was, as described in the comment block in the code, it was assumed that there always would be an invoke that connects catch.start and terminate. In case of catch (type), there will be call void @llvm.wasm.rethrow() in catch.start's predecessor that unwinds to the next destination. For example:

invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
to label %unreachable unwind label %ehcleanup
In case of catch (...), __cxa_end_catch can throw, so it becomes an invoke that unwinds to the next destination. For example:
invoke void @__cxa_end_catch() [ "funclet"(token %2) ]
to label %invoke.cont.i unwind label %terminate.i
So the unwind ordering relationship between catch.start and terminate here would be preserved.

But turns out this assumption does not always hold. For example:

entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...
  call void @_ZSt9terminatev()
  unreachable

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  call void @_ZSt9terminatev()
  unreachable
...

In this case there is no invoke that connects catch.start to terminate. So after catch.dispatch BB is removed in ISel, terminate is considered unreachable and incorrectly removed in DCE.

This makes Wasm just use the general findUnwindDestination. In that case entry's successor is going to be [catch.start, terminate]. We can get the first unwind destination by just traversing the list from the front.


This required another change in WinEHPrepare. WinEHPrepare demotes all PHIs in EH pads because they are funclets in Windows and funclets can't have PHIs. When used in Wasm they are not funclets so we don't need to do that wholesale but we still need to demote PHIs in catchswitch BBs because they are deleted during ISel. (So we created -demote-catchswitch-only option for that)

But turns out we need to remove PHIs that have a catchswitch BB as an incoming block too:

...
catch.dispatch:
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:
  ...

somebb:
  ...

ehcleanup                                      ; preds = %catch.dispatch, %somebb
  %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ]
  ...

In this case the phi in ehcleanup BB should be demoted too because catch.dispatch BB will be removed in ISel so one if its incoming block will be gone. This pattern didn't manifest before presumably due to how findWasmUnwindDestinations worked. (In this example, in our findWasmUnwindDestinations, catch.dispatch would have had only one successor, catch.start. But now catch.dispatch has both catch.start and ehcleanup as successors, revealing this bug. This case is represented by rethrow_terminator function in exception.ll (or exception-legacy.ll) and without the WinEHPrepare fix it will crash.


This is one of the two bugs reported in #126916.

Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g.
`invoke`s) can have multiple possible unwind destinations.

For example:
```ll
entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  ...
...
```

In this case, if an exception is not caught by `catch.dispatch` (and
thus `catch.start`), it should next unwind to `terminate`.
`findUnwindDestination` in ISel gathers the list of this unwind
destinations traversing the unwind edges:
https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2089-L2150
But we don't use that, and instead use our custom
`findWasmUnwindDestinations` that only adds the first unwind
destination, `catch.start`, to the successor list of `entry`, and not
`terminate`:
https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2037-L2087

The reason behind it was, as described in the comment block in the code,
it was assumed that there always would be an `invoke` that connects
`catch.start` and `terminate`. In case of `catch (type)`, there will be
`call void @llvm.wasm.rethrow()` in `catch.start`'s predecessor that
unwinds to the next destination. For example:
https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L429-L430
In case of `catch (...)`, `__cxa_end_catch` can throw, so it becomes an
`invoke` that unwinds to the next destination. For example:
https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L537-L538
So the unwind ordering relationship between `catch.start` and
`terminate` here would be preserved.

But turns out this assumption does not always hold. For example:
```ll
entry:
  invoke void @foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...
  call void @_ZSt9terminatev()
  unreachable

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  call void @_ZSt9terminatev()
  unreachable
...
```
In this case there is no `invoke` that connects `catch.start` to
`terminate`. So after `catch.dispatch` BB is removed in ISel,
`terminate` is considered unreachable and incorrectly removed in DCE.

This makes Wasm just use the general `findUnwindDestination`. In that
case `entry`'s successor is going to be [`catch.start`, `terminate`]. We
can get the first unwind destination by just traversing the list from
the front.

---

This required another change in WinEHPrepare. WinEHPrepare demotes all
PHIs in EH pads because they are funclets in Windows and funclets can't
have PHIs. When used in Wasm they are not funclets so we don't need to
do that wholesale but we still need to demote PHIs in `catchswitch` BBs
because they are deleted during ISel. But turns out we need to remove
PHIs that have a `catchswitch` BB as an incoming block too:
```ll
...
catch.dispatch:
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:
  ...

somebb:
  ...

ehcleanup                                      ; preds = %catch.dispatch, %somebb
  %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ]
  ...
```
In this case the `phi` in `ehcleanup` BB should be demoted too because
`catch.dispatch` BB will be removed in ISel so one if its incoming block
will be gone. This pattern didn't manifest before presumably due to how
`findWasmUnwindDestinations` worked. (In this example, in our
`findWasmUnwindDestinations`, `catch.dispatch` would have had only one
successor, `catch.start`. But now `catch.dispatch` has both
`catch.start` and `ehcleanup` as successors, revealing this bug. This
case is represented by `rethrow_terminator` function in `exception.ll`
(or `exception-legacy.ll`) and without this fix it will crash.

---

Discovered by the reproducer provided in llvm#126916, even though the bug
reported there was not this one.
@aheejin aheejin requested a review from dschuff March 8, 2025 00:50
@llvmbot llvmbot added backend:WebAssembly llvm:SelectionDAG SelectionDAGISel as well labels Mar 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 8, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-webassembly

Author: Heejin Ahn (aheejin)

Changes

Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g. invokes) can have multiple possible unwind destinations.

For example:

entry:
  invoke void @<!-- -->foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  ...
...

In this case, if an exception is not caught by catch.dispatch (and thus catch.start), it should next unwind to terminate. findUnwindDestination in ISel gathers the list of this unwind destinations traversing the unwind edges:

/// When an invoke or a cleanupret unwinds to the next EH pad, there are
/// many places it could ultimately go. In the IR, we have a single unwind
/// destination, but in the machine CFG, we enumerate all the possible blocks.
/// This function skips over imaginary basic blocks that hold catchswitch
/// instructions, and finds all the "real" machine
/// basic block destinations. As those destinations may not be successors of
/// EHPadBB, here we also calculate the edge probability to those destinations.
/// The passed-in Prob is the edge probability to EHPadBB.
static void findUnwindDestinations(
FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
BranchProbability Prob,
SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
&UnwindDests) {
EHPersonality Personality =
classifyEHPersonality(FuncInfo.Fn->getPersonalityFn());
bool IsMSVCCXX = Personality == EHPersonality::MSVC_CXX;
bool IsCoreCLR = Personality == EHPersonality::CoreCLR;
bool IsWasmCXX = Personality == EHPersonality::Wasm_CXX;
bool IsSEH = isAsynchronousEHPersonality(Personality);
if (IsWasmCXX) {
findWasmUnwindDestinations(FuncInfo, EHPadBB, Prob, UnwindDests);
assert(UnwindDests.size() <= 1 &&
"There should be at most one unwind destination for wasm");
return;
}
while (EHPadBB) {
BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
BasicBlock *NewEHPadBB = nullptr;
if (isa<LandingPadInst>(Pad)) {
// Stop on landingpads. They are not funclets.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
break;
} else if (isa<CleanupPadInst>(Pad)) {
// Stop on cleanup pads. Cleanups are always funclet entries for all known
// personalities.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
UnwindDests.back().first->setIsEHFuncletEntry();
break;
} else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
// Add the catchpad handlers to the possible destinations.
for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
// For MSVC++ and the CLR, catchblocks are funclets and need prologues.
if (IsMSVCCXX || IsCoreCLR)
UnwindDests.back().first->setIsEHFuncletEntry();
if (!IsSEH)
UnwindDests.back().first->setIsEHScopeEntry();
}
NewEHPadBB = CatchSwitch->getUnwindDest();
} else {
continue;
}
BranchProbabilityInfo *BPI = FuncInfo.BPI;
if (BPI && NewEHPadBB)
Prob *= BPI->getEdgeProbability(EHPadBB, NewEHPadBB);
EHPadBB = NewEHPadBB;
}
}
But we don't use that, and instead use our custom
findWasmUnwindDestinations that only adds the first unwind destination, catch.start, to the successor list of entry, and not terminate:
// In wasm EH, even though a catchpad may not catch an exception if a tag does
// not match, it is OK to add only the first unwind destination catchpad to the
// successors, because there will be at least one invoke instruction within the
// catch scope that points to the next unwind destination, if one exists, so
// CFGSort cannot mess up with BB sorting order.
// (All catchpads with 'catch (type)' clauses have a 'llvm.rethrow' intrinsic
// call within them, and catchpads only consisting of 'catch (...)' have a
// '__cxa_end_catch' call within them, both of which generate invokes in case
// the next unwind destination exists, i.e., the next unwind destination is not
// the caller.)
//
// Having at most one EH pad successor is also simpler and helps later
// transformations.
//
// For example,
// current:
// invoke void @foo to ... unwind label %catch.dispatch
// catch.dispatch:
// %0 = catchswitch within ... [label %catch.start] unwind label %next
// catch.start:
// ...
// ... in this BB or some other child BB dominated by this BB there will be an
// invoke that points to 'next' BB as an unwind destination
//
// next: ; We don't need to add this to 'current' BB's successor
// ...
static void findWasmUnwindDestinations(
FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
BranchProbability Prob,
SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
&UnwindDests) {
while (EHPadBB) {
BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
if (isa<CleanupPadInst>(Pad)) {
// Stop on cleanup pads.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
break;
} else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
// Add the catchpad handlers to the possible destinations. We don't
// continue to the unwind destination of the catchswitch for wasm.
for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
}
break;
} else {
continue;
}
}
}

The reason behind it was, as described in the comment block in the code, it was assumed that there always would be an invoke that connects catch.start and terminate. In case of catch (type), there will be call void @<!-- -->llvm.wasm.rethrow() in catch.start's predecessor that unwinds to the next destination. For example:

invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
to label %unreachable unwind label %ehcleanup
In case of catch (...), __cxa_end_catch can throw, so it becomes an invoke that unwinds to the next destination. For example:
invoke void @__cxa_end_catch() [ "funclet"(token %2) ]
to label %invoke.cont.i unwind label %terminate.i
So the unwind ordering relationship between catch.start and terminate here would be preserved.

But turns out this assumption does not always hold. For example:

entry:
  invoke void @<!-- -->foo()
     to label %cont unwind label %catch.dispatch

catch.dispatch:                                ; preds = %entry
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:                                   ; preds = %catch.dispatch
  %1 = catchpad within %0 [ptr null]
  ...
  call void @<!-- -->_ZSt9terminatev()
  unreachable

terminate:                                     ; preds = %catch.dispatch
  %2 = catchpad within none []
  call void @<!-- -->_ZSt9terminatev()
  unreachable
...

In this case there is no invoke that connects catch.start to terminate. So after catch.dispatch BB is removed in ISel, terminate is considered unreachable and incorrectly removed in DCE.

This makes Wasm just use the general findUnwindDestination. In that case entry's successor is going to be [catch.start, terminate]. We can get the first unwind destination by just traversing the list from the front.


This required another change in WinEHPrepare. WinEHPrepare demotes all PHIs in EH pads because they are funclets in Windows and funclets can't have PHIs. When used in Wasm they are not funclets so we don't need to do that wholesale but we still need to demote PHIs in catchswitch BBs because they are deleted during ISel. But turns out we need to remove PHIs that have a catchswitch BB as an incoming block too:

...
catch.dispatch:
  %0 = catchswitch within none [label %catch.start] unwind label %terminate

catch.start:
  ...

somebb:
  ...

ehcleanup                                      ; preds = %catch.dispatch, %somebb
  %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ]
  ...

In this case the phi in ehcleanup BB should be demoted too because catch.dispatch BB will be removed in ISel so one if its incoming block will be gone. This pattern didn't manifest before presumably due to how findWasmUnwindDestinations worked. (In this example, in our findWasmUnwindDestinations, catch.dispatch would have had only one successor, catch.start. But now catch.dispatch has both catch.start and ehcleanup as successors, revealing this bug. This case is represented by rethrow_terminator function in exception.ll (or exception-legacy.ll) and without this fix it will crash.


Discovered by the reproducer provided in #126916, even though the bug reported there was not this one.


Full diff: https://github.com/llvm/llvm-project/pull/130374.diff

6 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+3-60)
  • (modified) llvm/lib/CodeGen/WinEHPrepare.cpp (+17-3)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (+4-5)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (-3)
  • (modified) llvm/test/CodeGen/WebAssembly/exception-legacy.ll (+47-2)
  • (modified) llvm/test/CodeGen/WebAssembly/exception.ll (+53-2)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index d5a07e616236e..0ff5e7410df7e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2028,58 +2028,6 @@ void SelectionDAGBuilder::visitCleanupPad(const CleanupPadInst &CPI) {
   }
 }
 
-// In wasm EH, even though a catchpad may not catch an exception if a tag does
-// not match, it is OK to add only the first unwind destination catchpad to the
-// successors, because there will be at least one invoke instruction within the
-// catch scope that points to the next unwind destination, if one exists, so
-// CFGSort cannot mess up with BB sorting order.
-// (All catchpads with 'catch (type)' clauses have a 'llvm.rethrow' intrinsic
-// call within them, and catchpads only consisting of 'catch (...)' have a
-// '__cxa_end_catch' call within them, both of which generate invokes in case
-// the next unwind destination exists, i.e., the next unwind destination is not
-// the caller.)
-//
-// Having at most one EH pad successor is also simpler and helps later
-// transformations.
-//
-// For example,
-// current:
-//   invoke void @foo to ... unwind label %catch.dispatch
-// catch.dispatch:
-//   %0 = catchswitch within ... [label %catch.start] unwind label %next
-// catch.start:
-//   ...
-//   ... in this BB or some other child BB dominated by this BB there will be an
-//   invoke that points to 'next' BB as an unwind destination
-//
-// next: ; We don't need to add this to 'current' BB's successor
-//   ...
-static void findWasmUnwindDestinations(
-    FunctionLoweringInfo &FuncInfo, const BasicBlock *EHPadBB,
-    BranchProbability Prob,
-    SmallVectorImpl<std::pair<MachineBasicBlock *, BranchProbability>>
-        &UnwindDests) {
-  while (EHPadBB) {
-    BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
-    if (isa<CleanupPadInst>(Pad)) {
-      // Stop on cleanup pads.
-      UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
-      UnwindDests.back().first->setIsEHScopeEntry();
-      break;
-    } else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
-      // Add the catchpad handlers to the possible destinations. We don't
-      // continue to the unwind destination of the catchswitch for wasm.
-      for (const BasicBlock *CatchPadBB : CatchSwitch->handlers()) {
-        UnwindDests.emplace_back(FuncInfo.getMBB(CatchPadBB), Prob);
-        UnwindDests.back().first->setIsEHScopeEntry();
-      }
-      break;
-    } else {
-      continue;
-    }
-  }
-}
-
 /// When an invoke or a cleanupret unwinds to the next EH pad, there are
 /// many places it could ultimately go. In the IR, we have a single unwind
 /// destination, but in the machine CFG, we enumerate all the possible blocks.
@@ -2100,13 +2048,6 @@ static void findUnwindDestinations(
   bool IsWasmCXX = Personality == EHPersonality::Wasm_CXX;
   bool IsSEH = isAsynchronousEHPersonality(Personality);
 
-  if (IsWasmCXX) {
-    findWasmUnwindDestinations(FuncInfo, EHPadBB, Prob, UnwindDests);
-    assert(UnwindDests.size() <= 1 &&
-           "There should be at most one unwind destination for wasm");
-    return;
-  }
-
   while (EHPadBB) {
     BasicBlock::const_iterator Pad = EHPadBB->getFirstNonPHIIt();
     BasicBlock *NewEHPadBB = nullptr;
@@ -2119,7 +2060,9 @@ static void findUnwindDestinations(
       // personalities.
       UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
       UnwindDests.back().first->setIsEHScopeEntry();
-      UnwindDests.back().first->setIsEHFuncletEntry();
+      // In Wasm, EH scopes are not funclets
+      if (!IsWasmCXX)
+        UnwindDests.back().first->setIsEHFuncletEntry();
       break;
     } else if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(Pad)) {
       // Add the catchpad handlers to the possible destinations.
diff --git a/llvm/lib/CodeGen/WinEHPrepare.cpp b/llvm/lib/CodeGen/WinEHPrepare.cpp
index 1970716485613..de351903efa79 100644
--- a/llvm/lib/CodeGen/WinEHPrepare.cpp
+++ b/llvm/lib/CodeGen/WinEHPrepare.cpp
@@ -866,9 +866,6 @@ void WinEHPrepareImpl::demotePHIsOnFunclets(Function &F,
   for (BasicBlock &BB : make_early_inc_range(F)) {
     if (!BB.isEHPad())
       continue;
-    if (DemoteCatchSwitchPHIOnly &&
-        !isa<CatchSwitchInst>(BB.getFirstNonPHIIt()))
-      continue;
 
     for (Instruction &I : make_early_inc_range(BB)) {
       auto *PN = dyn_cast<PHINode>(&I);
@@ -876,6 +873,23 @@ void WinEHPrepareImpl::demotePHIsOnFunclets(Function &F,
       if (!PN)
         break;
 
+      // If DemoteCatchSwitchPHIOnly is true, we only demote a PHI when
+      // 1. The PHI is within a catchswitch BB
+      // 2. The PHI has a catchswitch BB has one of its incoming blocks
+      if (DemoteCatchSwitchPHIOnly) {
+        bool IsCatchSwitchBB = isa<CatchSwitchInst>(BB.getFirstNonPHIIt());
+        bool HasIncomingCatchSwitchBB = false;
+        for (unsigned I = 0, E = PN->getNumIncomingValues(); I < E; ++I) {
+          if (isa<CatchSwitchInst>(
+                  PN->getIncomingBlock(I)->getFirstNonPHIIt())) {
+            HasIncomingCatchSwitchBB = true;
+            break;
+          }
+        }
+        if (!IsCatchSwitchBB && !HasIncomingCatchSwitchBB)
+          break;
+      }
+
       AllocaInst *SpillSlot = insertPHILoads(PN, F);
       if (SpillSlot)
         insertPHIStores(PN, SpillSlot);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index 277d353d1db10..640be5fe8e8c9 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -1850,13 +1850,12 @@ bool WebAssemblyCFGStackify::fixCallUnwindMismatches(MachineFunction &MF) {
 
       // If the EH pad on the stack top is where this instruction should unwind
       // next, we're good.
-      MachineBasicBlock *UnwindDest = getFakeCallerBlock(MF);
+      MachineBasicBlock *UnwindDest = nullptr;
       for (auto *Succ : MBB.successors()) {
         // Even though semantically a BB can have multiple successors in case an
-        // exception is not caught by a catchpad, in our backend implementation
-        // it is guaranteed that a BB can have at most one EH pad successor. For
-        // details, refer to comments in findWasmUnwindDestinations function in
-        // SelectionDAGBuilder.cpp.
+        // exception is not caught by a catchpad, the first unwind destination
+        // should appear first in the successor list, based on the calculation
+        // in findUnwindDestinations() in SelectionDAGBuilder.cpp.
         if (Succ->isEHPad()) {
           UnwindDest = Succ;
           break;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
index 550d8b64dca35..2d11019291a7a 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
@@ -659,9 +659,6 @@ static bool canLongjmp(const Value *Callee) {
   // Every catchpad generated by Wasm C++ contains __cxa_end_catch, so we
   // intentionally treat it as longjmpable to work around this problem. This is
   // a hacky fix but an easy one.
-  //
-  // The comment block in findWasmUnwindDestinations() in
-  // SelectionDAGBuilder.cpp is addressing a similar problem.
   if (CalleeName == "__cxa_end_catch")
     return WebAssembly::WasmEnableSjLj;
   if (CalleeName == "__cxa_begin_catch" ||
diff --git a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
index 3fe45bcc4cd29..b4ffd185e3ca8 100644
--- a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
@@ -171,7 +171,7 @@ invoke.cont2:                                     ; preds = %ehcleanup
 
 terminate:                                        ; preds = %ehcleanup
   %6 = cleanuppad within %5 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 }
 
@@ -477,7 +477,7 @@ invoke.cont.i:                                    ; preds = %catch.start.i
 
 terminate.i:                                      ; preds = %catch.start.i, %catch.dispatch.i
   %6 = cleanuppad within %0 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 
 _ZN4TempD2Ev.exit:                                ; preds = %invoke.cont.i
@@ -501,6 +501,50 @@ unreachable:                                      ; preds = %entry
   unreachable
 }
 
+; Regression test for a bug where, when an invoke unwinds to a catchswitch, the
+; catchswitch's unwind destination was not included in the invoke's unwind
+; destination when there was no direct link from catch.start to there.
+
+; CHECK-LABEL: unwind_destinations:
+; CHECK: try
+; CHECK:   try
+; CHECK:     call  foo
+; CHECK:   catch  $0=, __cpp_exception
+; CHECK:     call  _ZSt9terminatev
+; CHECK:     unreachable
+; CHECK:   end_try
+; Note the below is "terminate" BB and should not be DCE'd
+; CHECK: catch_all
+; CHECK:   call  _ZSt9terminatev
+; CHECK:   unreachable
+; CHECK: end_try
+; CHECK: return
+define void @unwind_destinations() personality ptr @__gxx_wasm_personality_v0 {
+entry:
+  invoke void @foo()
+          to label %try.cont unwind label %catch.dispatch
+
+catch.dispatch:                                   ; preds = %entry
+  %0 = catchswitch within none [label %catch.start] unwind label %terminate
+
+catch.start:                                      ; preds = %catch.dispatch
+  %1 = catchpad within %0 [ptr null]
+  %2 = call ptr @llvm.wasm.get.exception(token %1)
+  %3 = call ptr @__cxa_begin_catch(ptr %2) #5 [ "funclet"(token %1) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %1) ]
+  unreachable
+
+; Even if there is no link from catch.start to this terminate BB, when there is
+; an exception that catch.start does not catch (e.g. a foreign exception), it
+; should end up here, so this BB should NOT be DCE'ed
+terminate:                                        ; preds = %catch.dispatch
+  %4 = cleanuppad within none []
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %4) ]
+  unreachable
+
+try.cont:                                         ; preds = %entry
+  ret void
+}
 
 declare void @foo()
 declare void @bar(ptr)
@@ -527,5 +571,6 @@ declare ptr @_ZN4TempD2Ev(ptr returned)
 
 attributes #0 = { nounwind }
 attributes #1 = { noreturn }
+attributes #2 = { noreturn nounwind }
 
 ; CHECK: __cpp_exception:
diff --git a/llvm/test/CodeGen/WebAssembly/exception.ll b/llvm/test/CodeGen/WebAssembly/exception.ll
index 57d1f37c0039f..fc42f050a687e 100644
--- a/llvm/test/CodeGen/WebAssembly/exception.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception.ll
@@ -207,7 +207,7 @@ invoke.cont2:                                     ; preds = %ehcleanup
 
 terminate:                                        ; preds = %ehcleanup
   %6 = cleanuppad within %5 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 }
 
@@ -542,7 +542,7 @@ invoke.cont.i:                                    ; preds = %catch.start.i
 
 terminate.i:                                      ; preds = %catch.start.i, %catch.dispatch.i
   %6 = cleanuppad within %0 []
-  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %6) ]
   unreachable
 
 _ZN4TempD2Ev.exit:                                ; preds = %invoke.cont.i
@@ -593,6 +593,56 @@ try.cont:                                         ; preds = %catch, %entry
   ret void
 }
 
+; Regression test for a bug where, when an invoke unwinds to a catchswitch, the
+; catchswitch's unwind destination was not included in the invoke's unwind
+; destination when there was no direct link from catch.start to there.
+
+; CHECK-LABEL: unwind_destinations:
+; CHECK: block
+; CHECK:   block
+; CHECK:     try_table    (catch_all 0)
+; CHECK:       block     i32
+; CHECK:         try_table    (catch __cpp_exception 0)
+; CHECK:           call  foo
+; CHECK:         end_try_table
+; CHECK:         unreachable
+; CHECK:       end_block
+; CHECK:       call  _ZSt9terminatev
+; CHECK:       unreachable
+; CHECK:     end_try_table
+; CHECK:     unreachable
+; CHECK:   end_block
+; Note the below is "terminate" BB and should not be DCE'd
+; CHECK:   call  _ZSt9terminatev
+; CHECK:   unreachable
+; CHECK: end_block
+define void @unwind_destinations() personality ptr @__gxx_wasm_personality_v0 {
+entry:
+  invoke void @foo()
+          to label %try.cont unwind label %catch.dispatch
+
+catch.dispatch:                                   ; preds = %entry
+  %0 = catchswitch within none [label %catch.start] unwind label %terminate
+
+catch.start:                                      ; preds = %catch.dispatch
+  %1 = catchpad within %0 [ptr null]
+  %2 = call ptr @llvm.wasm.get.exception(token %1)
+  %3 = call ptr @__cxa_begin_catch(ptr %2) #5 [ "funclet"(token %1) ]
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %1) ]
+  unreachable
+
+; Even if there is no link from catch.start to this terminate BB, when there is
+; an exception that catch.start does not catch (e.g. a foreign exception), it
+; should end up here, so this BB should NOT be DCE'ed
+terminate:                                        ; preds = %catch.dispatch
+  %4 = cleanuppad within none []
+  call void @_ZSt9terminatev() #2 [ "funclet"(token %4) ]
+  unreachable
+
+try.cont:                                         ; preds = %entry
+  ret void
+}
+
 declare void @foo()
 declare void @bar(ptr)
 declare void @take_i32(i32)
@@ -618,5 +668,6 @@ declare ptr @_ZN4TempD2Ev(ptr returned)
 
 attributes #0 = { nounwind }
 attributes #1 = { noreturn }
+attributes #2 = { noreturn nounwind }
 
 ; CHECK: __cpp_exception:

@@ -1850,13 +1850,12 @@ bool WebAssemblyCFGStackify::fixCallUnwindMismatches(MachineFunction &MF) {

// If the EH pad on the stack top is where this instruction should unwind
// next, we're good.
MachineBasicBlock *UnwindDest = getFakeCallerBlock(MF);
MachineBasicBlock *UnwindDest = nullptr;
Copy link
Member Author

@aheejin aheejin Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by NFC fix. Fake caller block is used not here but somewhere below, and this part of the code cannot be entered when there is no EH pad successor:

if (SeenThrowableInstInBB || !MBB.hasEHPadSuccessor() ||
!WebAssembly::mayThrow(MI))
continue;

which will be assigned to UnwindDest below, so this has no meaning other than creating a may-not-be-necessary BB.

@@ -2119,7 +2060,9 @@ static void findUnwindDestinations(
// personalities.
UnwindDests.emplace_back(FuncInfo.getMBB(EHPadBB), Prob);
UnwindDests.back().first->setIsEHScopeEntry();
UnwindDests.back().first->setIsEHFuncletEntry();
// In Wasm, EH scopes are not funclets
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the comment on line 2059 above should be updated too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@aheejin aheejin merged commit 494fe0b into llvm:main Mar 11, 2025
11 checks passed
@aheejin aheejin deleted the unwind_dest_fix branch March 11, 2025 03:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants