Skip to content

Conversation

janvorli
Copy link
Member

This change adds support for walking stack with interpreter frames present. The StackFrameIterator now considers the interpreter frames to be frameless frames like regular JITted / crossgened folder. This is a basis for GC, EH and debugger stack walking. The SOS clrstack and clrstack -i both display correct stack trace with interpreter frames on the stack with this change.

It required changing the GetCallerSP and EnsureCallerContextIsValid to become virtual methods so that the right variant for the interpreter or JITted / AOTed code can be invoked. Before, they were both static methods on the EECodeManager.

I have also added some extra handling of the case when the interpreter frames are on top of the stack so that the clrstack can dump them.

@janvorli janvorli added this to the 10.0.0 milestone Mar 25, 2025
@janvorli janvorli requested review from kg and jkotas March 25, 2025 21:40
@janvorli janvorli self-assigned this Mar 25, 2025
@Copilot Copilot AI review requested due to automatic review settings March 25, 2025 21:40
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 20 changed files in this pull request and generated no comments.

Files not reviewed (19)
  • src/coreclr/debug/daccess/dacdbiimplstackwalk.cpp: Language not supported
  • src/coreclr/debug/daccess/stack.cpp: Language not supported
  • src/coreclr/debug/inc/dacdbistructures.inl: Language not supported
  • src/coreclr/inc/eetwain.h: Language not supported
  • src/coreclr/vm/FrameTypes.h: Language not supported
  • src/coreclr/vm/codeman.cpp: Language not supported
  • src/coreclr/vm/codeman.h: Language not supported
  • src/coreclr/vm/eetwain.cpp: Language not supported
  • src/coreclr/vm/eventtrace.cpp: Language not supported
  • src/coreclr/vm/exceptionhandling.cpp: Language not supported
  • src/coreclr/vm/frames.cpp: Language not supported
  • src/coreclr/vm/frames.h: Language not supported
  • src/coreclr/vm/gcinfodecoder.cpp: Language not supported
  • src/coreclr/vm/interpexec.cpp: Language not supported
  • src/coreclr/vm/interpexec.h: Language not supported
  • src/coreclr/vm/jitinterface.h: Language not supported
  • src/coreclr/vm/prestub.cpp: Language not supported
  • src/coreclr/vm/stackwalk.cpp: Language not supported
  • src/coreclr/vm/threadsuspend.cpp: Language not supported

Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@janvorli
Copy link
Member Author

cc: @BrzVlad, @cshung

FRAME_TYPE_NAME(DebuggerU2MCatchHandlerFrame)
FRAME_TYPE_NAME(ExceptionFilterFrame)
#ifdef FEATURE_INTERPRETER
FRAME_TYPE_NAME(InterpreterEntryFrame)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need really to have both entry and exit frames? I would expect that one InterpreterFrame should be enough. We will know that we have exited the interpreter if we find some other Frame or managed code first on the stack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering that and this seems to work better with the stack walker. We could possibly have just the InterpreterEntryFrame, but then upon reaching it, we would immediately need to switch to the underlying interpreter frames and then once we get out of them, hit the InterpreterEntryFrame again and copy the transition record to the context to move to the caller of the sequence of interpreted frames. Also, the InterpreterEntryFrame doesn't have a direct pointer to the topmost frame in the InterpExecMethod, Vlad was concerned about updating that on every interpreter to interpreter call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the interpreter -> AOT code calls going to push InterpreterExitFrame?

I think we should be optimizing for making the interpreter -> AOT calls as cheap as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, those would need it too. We can pass current thread as a parameter to the InterpMethodFrame and then use it to push that frame without the thread local access overhead.

Copy link
Member

@BrzVlad BrzVlad Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the purpose of the InterpreterEntryFrame is to store the context so can unwind to AOT callers, while the purpose of the InterpreterExitFrame is to make it cleaner to access the top most interpreter frame ? Would we always resume execution in the interpreter loop via a try/catch ? So if EH determines that an interpreter method caught the exception, in order to resume in the interpreter it would throw a C++ exception that will get caught at this interpreter exit location ?

PTR_InterpMethodContextFrame pFrame = m_pInterpMethodContextFrame;
_ASSERTE(pFrame != NULL && pFrame->ip != NULL);

// The frames of a method are linked in a reverse order (from bottom to top of the part of the stack)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an advantage in this reverse order? I would expect the interpreter frame to point to what we are executing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vlad was concerned about updating such a pointer on every interpreter to interpreter call. The current call implementation is pretty efficient and according to his experience from Mono, this would add noticeable overhead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, this reverse walk is only needed in the native debugger on desktop scenario when someone has the debugger sitting in the middle of the InterpExecMethod, nothing else needs to walk that way (provided we have the two interpreter frame kinds).

Copy link
Member

@jkotas jkotas Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vlad was concerned about updating such a pointer on every interpreter to interpreter call.

We do not need to update the pointer on every interpreter-to-interpreter call. We only need to update the pointer when we exit the interpreter. ie instead of pushing the full Frame to exit the interpreter, we would just need to patch the current IP or current method in the Interpreter frame.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that since only the native debugger needs to walk this list and it would need to walk it even if I changed it the way you suggested in case it was sitting in the InterpExecMethod and someone has invoked the clrstack command, it doesn't seem really beneficial to store it. Unless we moved to the single interpreter frame way, which would introduce the unusual double handling of single explicit frame to the StackFrameIterator or something similar to that.
What is your concern about having two separate explicit frames?

Copy link
Member

@jkotas jkotas Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the scheme looks more expensive than it needs to be and non-intuitive (the linked list is in wrong direction). EDIT: I see that it is double linked list so it has links in the right direction too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the direction I am using for this specific native debugger scenario is primarily used for the interpreter internal purposes, I have just reused it for these rare scenarios. This is not used for managed debuggers, GC or EH. In those cases we would always have the interpreter exit frame.

@janvorli janvorli force-pushed the stack-walk-interpreter-support-work branch from dd98e2b to 62f3514 Compare March 28, 2025 00:39
@janvorli
Copy link
Member Author

@jkotas I have made changes based on your feedback. There is now a single InterpreterFrame and it keeps pointer to the top frame of the related InterpExecMethod. That pointer is updated whenever the InterpExecMethod calls something external where runtime can be suspended. To enable it to work with SOS when debugging dumps or live debugging and the interpreter running in the InterpExecMethod without leaving it, we still search for the real top starting at the value that's stored there. In the EH, GC and managed debugger cases, the pointer will always be the right one.

@janvorli
Copy link
Member Author

The CI test failures are unrelated.

InterpExecMethod(&interpFrame, threadContext);
InterpreterFrame interpreterFrame(pTransitionBlock, &interpMethodContextFrame);

InterpExecMethod(&interpreterFrame, &interpMethodContextFrame, threadContext);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks ok for now. Do we have a plan for where and how the arguments are going to converted from the TransitionBlock to the interpreter representation?

Copy link
Member

@BrzVlad BrzVlad Mar 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking when invoking an interpreted method through the precode thunks, instead of calling ExecuteInterpretedMethod like we do now, it will call some other assembly thunk that receives a lower level bytecode referenced from the interpreter bytecode (maybe in some Interpreter code header). These assembly thunks will move arguments from the transition block to the interpreter stack, finally dispatching to ExecuteInterpreterMethod. I described this mechanism a while ago in a doc. Last time I spoke with @janvorli, the plan was to use this mechanism from the start since we will need it as a fallback anyway on iOS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used the TransitionBlock here for two reasons. One, as a possibly easy starting point for passing arguments until we implement the proper mechanism and second, as a mean to save and restore the callee saved registers. This is necessary as we need to be able to restore them during stack walking from interpreted to JITted / AOTed methods. Once we have the optimized mechanism, we can use a smaller structure than the TransitionBlock that would contain only the callee saved registers.

This change adds support for walking stack with interpreter frames
present. The StackFrameIterator now considers the interpreter frames to
be frameless frames like regular JITted / crossgened folder.
This is a basis for GC, EH and debugger stack walking. The SOS
`clrstack` and `clrstack -i` both display correct stack trace with
interpreter frames on the stack with this change.

I have added some extra handling of the case when the interpreter frames
are on top of the stack so that the clrstack can dump them.
In case we were generating the caller context, the SP and ContextFlags
were missing.
* Change the GET_CALLER_SP to assert and return NULL
* Remove extra assert at the call to GET_CALLER_SP
* Revert some changes removing "virtual" keyword that were not
  intentional
@janvorli janvorli force-pushed the stack-walk-interpreter-support-work branch from 5d43a67 to a257f58 Compare March 31, 2025 12:47
}
}
else
{
Copy link
Member

@BrzVlad BrzVlad Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iteration here seems weird. I don't see why we would ever reach this path, given we always start for from the "list head". Why would we ever need to iterate backwards. Nvm, I noticed this is actually the top and we set it when leaving interp

MethodDesc *pMD = (MethodDesc*)(targetMethod & ~INTERP_METHOD_DESC_TAG);
PCODE code = pMD->GetNativeCode();
if (!code) {
pInterpreterFrame->SetTopInterpMethodContextFrame(pFrame);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, this is just an optimization and not required for correctness. Is that correct? It may be worth mentioning it in a comment.

Does the setting need to be undone when the call returns?

It is not clear to me whether this optimization is worth it. It may be better to do it during stackwalk - so that repeated stackwalks from approximately the same spot avoid walking the whole list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would still work correctly without it, so it can be viewed as an optimization.

It doesn't need to be undone when the call returns:

  • for the SOS scenario, we would seek for the top frame going in one way or the other through the linked list from the last value stored there. It doesn't matter if we start on a frame that's in the middle of the "substack" belonging to the current InterpExecMethod or if we start on an inactive frame that's somewhere by the end of the list. Based on the inactive vs active state of the frame (ip being 0 or non-zero), we know which way to seek for the last active frame.
  • for other scenarios, we would maintain the exact value so that we don't waste time scanning the list.

Since this optimization is just one store, I think it is worth doing that instead of updating it during the stack walk. Imagine a case with a deep recursion within interpreted frames only. It would cause a lot of seeking the first time we do a stack walk which is completely unnecessary.
We will need to do other stuff while leaving the interpreter, like catching exceptions from the native code and calling the DispatchManagedException, so this is going to be just one of the things to do. I am planning to add a macro named "EXIT_INTERPRETER_BEGIN/END or something along those lines to contain that stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would cause a lot of seeking the first time we do a stack walk which is completely unnecessary.

This work is small and proportional to the total cost of stack walk (assuming we typically walk large part of the stack).

I am looking at it as "total number of instructions executed" in given scenario. I think updating the pointer in the stackwalker would be better on average in this metric.

This is a micro-optimization. It is fine to keep this as is. Comment would be nice.

We will need to do other stuff while leaving the interpreter, like catching exceptions from the native code and calling the DispatchManagedException, so this is going to be just one of the things to do.

Yep. I expect that the total cost of this other stuff will make us consider building the interpreter executor in C# eventually - but that is for future discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment

{
// We already have the caller's frame context
// We just switch the pointers
PT_CONTEXT temp = pRD->pCurrentContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't unwinding the process of walking up the stack? Why are we switching the pointers instead of just overwriting the current context with the caller context? I assume I'm missing something here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we already have the caller context available, then you can just set the current context to it, there is no need to call unwind again to get the same values. The switching of pointer is to avoid copying in this specific case.
There are cases when we want to know the caller context without actually unwinding, so we call EnsureCallerContextIsValid. That's where the caller context gets created and then it can get reused here.

Copy link
Member

@kg kg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@janvorli
Copy link
Member Author

janvorli commented Apr 1, 2025

/ba-g the test failures in this PR are happening on all PRs.

@janvorli janvorli merged commit fa45aa5 into dotnet:main Apr 1, 2025
96 of 98 checks passed
@janvorli janvorli deleted the stack-walk-interpreter-support-work branch April 1, 2025 19:14
@github-actions github-actions bot locked and limited conversation to collaborators May 2, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants