khushal1996
Member

@khushal1996 khushal1996 commented Jun 19, 2025

Addresses Issue #112587

This PR focuses on tracking the new APX EGPRs (R16-R31) during GC tracking. It follows the discussion in issue #112587. The APX registers are volatile and are therefore tracked in a separate volatileContextPointers, as on ARM64.

The current work focuses on enabling GC tracking for EGPRs on TARGET_UNIX.

There are some things I need help with:

  1. Is it necessary to disable the processing of the extended registers if APX is not present on the system; if so, is there a similar mechanism to check in the VM/GC code if the APX ISA is enabled?
  2. Do we add offset asserts for EGPRs in asmconstants.h https://github.com/khushal1996/runtime/blob/ecdfb14194fce9462a612733c579fc6e55ffabe7/src/coreclr/vm/amd64/asmconstants.h#L344 ?
  3. Do we update the EGPR context here in FaultingExceptionFrame::UpdateRegDisplay_Impl https://github.com/khushal1996/runtime/blob/ecdfb14194fce9462a612733c579fc6e55ffabe7/src/coreclr/vm/amd64/cgenamd64.cpp#L144 ? This would lead to updating dbgtargetcontext.h https://github.com/khushal1996/runtime/blob/ecdfb14194fce9462a612733c579fc6e55ffabe7/src/coreclr/debug/inc/dbgtargetcontext.h#L215
  4. The same goes for HijackFrame::UpdateRegDisplay_Impl https://github.com/khushal1996/runtime/blob/ecdfb14194fce9462a612733c579fc6e55ffabe7/src/coreclr/vm/amd64/cgenamd64.cpp#L229 ? I am not sure which registers need to be updated here (not all volatile registers are updated here).

TESTING


SDE test run with APX off
image

SDE test run with APX ON CCMP ON
image

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 19, 2025
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

PDWORD64 R29;
PDWORD64 R30;
PDWORD64 R31;
//X18 is reserved by OS, in userspace it represents TEB
Member

Suggested change
//X18 is reserved by OS, in userspace it represents TEB

DWORD64 R29;
DWORD64 R30;
DWORD64 R31;
struct
Member

@dotnet/dotnet-diag I believe CONTEXT structure is used by the debugger interfaces. Is this modification going to break them?

Member Author

I cannot say for sure at the moment. It depends on whether we want the debugger to track these new EGPRs. Once the context structure is finalized, we can say with certainty what might be impacted. Thoughts?

Member Author

As of now, we have avoided making any debugger changes, since they would need Windows to have XSTATE_APX support for the extended CONTEXT. Once we have complete EGPR support, we can extend debugger support as well.

Member

Sorry missed this comment earlier when @jkotas first pinged. Generally the managed debugger needs to be aware of all registers because at various points it saves them and later restores them. If the debugger isn't aware of all the registers it needs to save then we can easily end up in situations where an app works without the debugger, but trying to step through the code in the debugger behaves unpredictably. For example here is a recent debugger bug where x86 floating point register capture/restore got inadvertently broken: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/2428652

If we wanted to make this work available prior to having the debugger portions implemented and tested I think it would need to be explicitly an opt-in feature with warnings that debugging is unsupported.

Member Author

Thanks @noahfalk.
So what needs to be done to make this an opt-in feature? Do I need to open a new issue to make this an opt-in feature with warnings, or is that handled by the debugger team after this PR?

Member

A common way we've done it in the past is to define an environment variable and then only use the new registers if the env var is set. Later on once it is fully supported we can switch the default value so the env var becomes an opt-out rather than opt-in.

Here is an example for AVX registers: https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/clrconfigvalues.h#L673

I'd prefer if the opt-in mechanism was included in this PR so that we don't create any window of builds where debugging is broken.

Member

@noahfalk, The opt-in for APX already exists on L681 (DOTNET_EnableAPX) and it is already defaulted to off (same with AVX10v2) since the hardware isn't available yet.

We automatically add a config switch per ISA (or ISA group in some cases) when the detection logic is added to the VM.
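
The opt-in pattern discussed in this thread (a switch that defaults to off and can later flip to opt-out by changing the default) can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the runtime's CLRConfig machinery: `ParseSwitch` and `IsFeatureOptedIn` are hypothetical helpers, and only the `DOTNET_EnableAPX` name comes from the discussion.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: interpret a switch value. An unset or empty value
// falls back to the default, "0" means off, anything else means on.
inline bool ParseSwitch(const char* value, bool defaultValue)
{
    if (value == nullptr || *value == '\0')
        return defaultValue;
    return std::strcmp(value, "0") != 0;
}

// Hypothetical helper: read the switch from the environment by name.
// Changing defaultValue from false to true turns opt-in into opt-out.
inline bool IsFeatureOptedIn(const char* envName, bool defaultValue)
{
    assert(envName != nullptr);
    return ParseSwitch(std::getenv(envName), defaultValue);
}
```

With this shape, a call such as `IsFeatureOptedIn("DOTNET_EnableAPX", false)` would report the feature as off unless the variable is set to a value other than `0`.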

Member

Thanks @tannergooding! Glad to see it's already in place. I'd also ask that anywhere we advertise the env var, we make it clear that managed debugging isn't supported.

Member

👍. We notably do not document these switches and rarely advertise them. They are considered advanced and primarily for people doing local testing, so they can validate downlevel hardware.

There will likely be a blog post that lets people know about the switch when hardware does become available (so not until after .NET 10 ships) and we can ensure any nuances, such as the GC or debugger experience not working end to end, are called out there.

@khushal1996 khushal1996 force-pushed the kcm-gc-apx branch 3 times, most recently from c4c3fd4 to be67cad Compare July 2, 2025 22:19
@risc-vv

This comment was marked as outdated.

@@ -59,6 +59,11 @@ void ClearRegDisplayArgumentAndScratchRegisters(REGDISPLAY * pRD)
pContextPointers->R9 = NULL;
pContextPointers->R10 = NULL;
pContextPointers->R11 = NULL;

#if defined(TARGET_UNIX)
for (int i=0; i < 16; i++)
Member

Similar question. Can/should we limit this to only scenarios with APX enabled?

Same question applies to all the other loops/handling for the extended registers here.

Member

Notably this can also just be a memset call and avoid the loop, so it's easier for the compiler to optimize.

Member Author

> Similar question. Can/should we limit this to only scenarios with APX enabled?
>
> Same question applies to all the other loops/handling for the extended registers here.

Same question as here https://github.com/dotnet/runtime/pull/116806/files#r2208598605

> Notably this can also just be a memset call and avoid the loop so it's easier for the compiler to optimize.

Yes, this can be a memset. But then it would differ from how things are done for ARM. The current code only extends the convention that was already present.
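
For reference, the two forms being compared can be sketched side by side. This is a simplified model, not the PR's code: `ExtendedContextPointers` is a hypothetical stand-in for the part of the context-pointers structure that holds one slot per EGPR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in: one pointer slot per extended register (R16-R31).
struct ExtendedContextPointers
{
    uint64_t* Egpr[16];
};

// Loop form, in the style of the diff under review.
inline void ClearWithLoop(ExtendedContextPointers* p)
{
    for (int i = 0; i < 16; i++)
        p->Egpr[i] = nullptr;
}

// memset form. Assumes a null pointer is represented as all-zero bits,
// which holds on the mainstream platforms but is not guaranteed by C++.
inline void ClearWithMemset(ExtendedContextPointers* p)
{
    std::memset(p->Egpr, 0, sizeof(p->Egpr));
}
```

Both functions leave every slot null; the choice is mostly about consistency with the existing ARM64 code versus giving the compiler a single bulk operation.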

Comment on lines +1547 to +1551
#if defined(TARGET_UNIX)
_ASSERTE(regNum >= 0 && regNum <= 32);
#else // TARGET_UNIX
_ASSERTE(regNum >= 0 && regNum <= 16);
#endif // TARGET_UNIX
Member

Feels like most of this handling isn't Unix specific. It's APX enabled vs disabled with Windows just not having it enabled yet.

Similar comment to other new #if TARGET_UNIX areas

Member Author

True. Once Windows gets support for APX XSTATE registers, we can get rid of the UNIX checks.

Member Author

@tannergooding some places access R16 inside the CONTEXT. Can you guide me, or point me to a similar place, as to how those CONTEXTs are handled on Windows for the extended context? I wanted to remove the Linux #ifdefs, but since I don't have access to the extended CONTEXT on Windows, the compilation fails on Windows right now.

Member

As elaborated in a few other places, the "proper" way to deal with extended state is to query for the offset and size via LocateXStateFeature. This will tell you where the APX region begins and therefore where you can start accessing the relevant data.

This can be unique per CONTEXT given that a given instance may be using things like XSAVEOPT, XSAVEC, or even not saving a particular bit of extended state depending on scenario. The set of xstate features enabled for a given instance is queried using GetXStateFeaturesMask.
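
The query pattern described here (check which features an instance carries, then ask for the offset and length of one feature) can be modeled in portable code. The sketch below does not call the Win32 APIs; every type and function in it (`MockXStateContext`, `MockGetFeaturesMask`, `MockLocateFeature`, `MockTryReadEgpr`) is a hypothetical analogue, and the feature ids are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical feature ids; real values come from the OS headers.
enum MockFeature : unsigned { kFeatureAvx = 0, kFeatureApx = 1, kFeatureCount = 2 };

// Hypothetical variable-layout context: optional feature regions whose
// offsets and lengths can differ from one instance to the next.
struct MockXStateContext
{
    uint64_t enabledMask;            // which features this instance carries
    size_t offset[kFeatureCount];    // byte offset of each region in data
    size_t length[kFeatureCount];    // byte length of each region
    unsigned char data[512];         // backing storage for the regions
};

// Analogue of a "which features are present" query.
inline uint64_t MockGetFeaturesMask(const MockXStateContext& ctx)
{
    return ctx.enabledMask;
}

// Analogue of a "locate feature" query: nullptr when the feature is absent,
// otherwise the start of the region, with its length reported via *length.
inline void* MockLocateFeature(MockXStateContext& ctx, MockFeature id, size_t* length)
{
    if ((MockGetFeaturesMask(ctx) & (1ull << id)) == 0)
        return nullptr;
    if (length != nullptr)
        *length = ctx.length[id];
    return ctx.data + ctx.offset[id];
}

// Reads extended register `index` (0 = R16 ... 15 = R31) if the APX region
// is present and large enough; never assumes a fixed struct offset.
inline bool MockTryReadEgpr(MockXStateContext& ctx, unsigned index, uint64_t* value)
{
    size_t len = 0;
    void* region = MockLocateFeature(ctx, kFeatureApx, &len);
    if (region == nullptr || (index + 1) * sizeof(uint64_t) > len)
        return false;
    std::memcpy(value, static_cast<unsigned char*>(region) + index * sizeof(uint64_t), sizeof(uint64_t));
    return true;
}
```

The point of the shape is that callers never hard-code where R16 lives; they ask per instance, and handle the "feature not saved" case explicitly.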

[MethodImpl(MethodImplOptions.NoInlining)]
static void StressRegisters()
{
// 32 reference variables to force JIT to use all GPRs
Member

Is the JIT actually enregistering them all? I don't think simply having 32 objects is enough to guarantee that, particularly if there are calls, since some are going to require spilling anyway.

Member Author

Hmm, I will look into the test one more time. We can force APX EGPRs using JitStressRegs=4000, which is what the testing was done with, but I added this test thinking the JIT would enregister all of them. Is there another way of using all the registers?

Member

I don't think there is any way to guarantee it and the stress modes are primarily there to just help default to different registers to ensure they get tested as well.

The question was more because the comment is claiming to do x, but I don't think we can actually guarantee that happens. We could instead comment that we're using enough and at the time the test was written it was generating a particular bit of assembly, which is likely good enough to show that they're all being used.

@tannergooding
Member

We should have a test covering P/Invokes and showing what the codegen difference is for those scenarios. If we're actually allowing the GC to use the extended registers, I would expect we are having to spill all 16 additional registers as part of the transition. -- We can, however, keep costs lower by spilling/loading via push2/pop2 and so the amount of instructions/latency should remain nearly the same.

@jkotas are there any other special cases we likely want to check as part of having an extended set of registers available to the GC?

@jkotas
Member

jkotas commented Jul 16, 2025

My top concern with these changes is impact on debugger/diagnostic. I have commented on this above #116806 (comment) . I would like to see @dotnet/dotnet-diag signoff and validate that this is not breaking debugger/diagnostic.

  • CONTEXT structure is part of the debugger APIs like https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/debugging/icordebugdatatarget-getthreadcontext-method . Is changing the CONTEXT structure a breaking change for the debugger and diagnostic tools? It is quite possible that the breaks that I am concerned about were introduced by earlier PR that changed the CONTEXT structure.

  • APX CONTEXT extension on Windows is going to be done using the CONTEXT extension mechanism, without changing the CONTEXT structure itself. How are we going to adapt our code for this, and how are the debugger / diagnostic tools going to deal with that?

  • HOST_UNIX ifdefs introduced in this PR are incompatible with cross-platform debugging (e.g. debugger running on Windows and debugged process running on Unix)

I expect that many of these changes will need to be redone to address these concerns.

@jkotas
Member

jkotas commented Jul 16, 2025

> I would expect we are having to spill all 16 additional registers as part of the transition.

The additional registers are volatile registers in the calling convention. Is that correct? It means they do not need to be saved as part of the GC transition.

@khushal1996
Member Author

The debugger context structure does not take the XSTATE_ISA context into consideration, hence nothing needs to be changed in the debugger context right now. Once we have Windows support, we can complete the debugger part of the changes together, since they would be simpler to handle.

> • APX CONTEXT extension on Windows is going to be done using the CONTEXT extension mechanism, without changing the CONTEXT structure itself. How are we going to adapt our code for this, and how are the debugger / diagnostic tools going to deal with that?

I was under the impression that we can get the state of the extended CONTEXT using OS APIs and load the EGPRs in the debugger context. @tannergooding please correct me if I am wrong.

> • HOST_UNIX ifdefs introduced in this PR are incompatible with cross-platform debugging (e.g. debugger running on Windows and debugged process running on Unix)

Yes. Right now, cross-compile support is not available, since that would mean using the Windows-provided context in the TARGET_UNIX and HOST_WINDOWS case.

@jkotas the current changes are aimed at having GC support only for the HOST_UNIX and TARGET_UNIX cases.

return m_CPUCompileFlags.IsSet(InstructionSet_APX);
}
#endif // TARGET_AMD64

Contributor

I think what we want @khushal1996 is something like

inline bool IsAPXSupported()
{
#if defined(HOST_WINDOWS)
  return false;
#elif defined(HOST_UNIX)
  return m_CPUCompileFlags.IsSet(InstructionSet_APX);
#endif
}

and now, instead of having the #ifdef for HOST_UNIX, we simply check IsAPXSupported(). When the Windows APIs become available, we will enable that, but the rest of the code should remain unchanged.

Member

@anthonycanino I would actually expect we just do return m_CPUCompileFlags.IsSet(InstructionSet_APX); always.

I would rather expect that the minipal_getcpufeatures doesn't set InstructionSet_APX for Windows, given that the OS enablement query should be returning false today (as xgetbv should not have it set and the IsApxEnabled() query should end up reporting false on Windows, since there is no XSTATE_MASK_APX through which we can check that GetEnabledXStateFeatures() reports support).
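
The split suggested here (the detection layer decides whether the flag is ever set, and the query stays unconditional) can be illustrated with a small model. The names below (`kFeatureApxBit`, `DetectCpuFeatures`, `IsApxSupported`) are hypothetical and do not correspond to the runtime's actual flag sets or to `minipal_getcpufeatures`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bit; the real runtime uses its own flag sets.
constexpr uint32_t kFeatureApxBit = 1u << 0;

// Hypothetical detection step: the bit is set only when the CPU reports the
// ISA and the OS reports that it saves and restores the matching state.
// On a platform without OS support, osEnablesApxState is simply false.
inline uint32_t DetectCpuFeatures(bool cpuReportsApx, bool osEnablesApxState)
{
    uint32_t features = 0;
    if (cpuReportsApx && osEnablesApxState)
        features |= kFeatureApxBit;
    return features;
}

// The query itself needs no platform #ifdef.
inline bool IsApxSupported(uint32_t detectedFeatures)
{
    return (detectedFeatures & kFeatureApxBit) != 0;
}
```

Under this arrangement, enabling a new platform later only changes what the detection step reports; call sites of the query stay as they are.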

@tannergooding
Member

> CONTEXT structure is part of the debugger APIs like https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/debugging/icordebugdatatarget-getthreadcontext-method . Is changing the CONTEXT structure a breaking change for the debugger and diagnostic tools? It is quite possible that the breaks that I am concerned about were introduced by earlier PR that changed the CONTEXT structure.

@jkotas, the modified struct is Unix specific, we directly use the struct and appropriate APIs from the SDK on Windows. The PAL layer "should" be handling the mapping of this for the debug layer, it wouldn't work at all otherwise.


CONTEXT is a general purpose data structure that is used by many things, including the debugger. The layout is platform specific (a different struct exists for x86 vs x64 vs Arm64, etc) and is defined in winnt.h

However, this is just the "base size" and the actual size is dependent on the ContextFlags passed into InitializeContext (or InitializeContext2 if using xstate compaction). If xstate is enabled, then each xstate feature must have its offset within the buffer queried using LocateXStateFeature (which tells you the offset and length of said feature). -- In practice (but not guaranteed), this directly matches what xsave/xrstor produces on x64. It can vary due to additional things like xstate compaction, fast save/restore, or features being available or disabled, etc.

On Unix, we're "mirroring" the layout in the PAL, but we don't have the corresponding LocateXStateFeature and similar APIs and are just hard coding them as part of the struct. This makes it not as "robust" and we don't have the ability to do things like have APX but not have AVX512, so we're paying some additional cost for potentially unused features.

Realistically (assuming we need to keep emulating CONTEXT for the debugger or similar, so we couldn't just directly use ucontext), then we should probably mirror the Win32 stuff more accurately, including the way xstate is enabled and offsets are queried. The debugger would then need a way to query that information.

I don't believe the debugger ever directly tries to read the CONTEXT struct returned by GetContextState on Unix, since the relevant APIs to understand what exists in the struct (beyond the guaranteed basics) simply do not exist.

@anthonycanino
Contributor

> On Unix, we're "mirroring" the layout in the PAL, but we don't have the corresponding LocateXStateFeature and similar APIs and are just hard coding them as part of the struct. This makes it not as "robust" and we don't have the ability to do things like have APX but not have AVX512, so we're paying some additional cost for potentially unused features.

I think we are getting a little hung up on the mirroring in the PAL. I see spots in the code where the CONTEXT is used and fields like Rax are specifically referenced:

image

Following convention, we ifdef'd for UNIX and continued to use the PAL, though I see now that this will not work regardless, given the way the Windows extended context works as you have described.

So in order to make this platform-agnostic, it would seem that we cannot use the PAL extended context to help save/restore some of this state, correct?

@tannergooding
Member

> I see spots in the code where the CONTEXT is used and fields like Rax are specifically referenced:

Right, but those are part of the base `CONTEXT` struct as defined by `winnt.h`:
typedef struct DECLSPEC_ALIGN(16) DECLSPEC_NOINITALL _CONTEXT {

    //
    // Register parameter home addresses.
    //
    // N.B. These fields are for convience - they could be used to extend the
    //      context record in the future.
    //

    DWORD64 P1Home;
    DWORD64 P2Home;
    DWORD64 P3Home;
    DWORD64 P4Home;
    DWORD64 P5Home;
    DWORD64 P6Home;

    //
    // Control flags.
    //

    DWORD ContextFlags;
    DWORD MxCsr;

    //
    // Segment Registers and processor flags.
    //

    WORD   SegCs;
    WORD   SegDs;
    WORD   SegEs;
    WORD   SegFs;
    WORD   SegGs;
    WORD   SegSs;
    DWORD EFlags;

    //
    // Debug registers
    //

    DWORD64 Dr0;
    DWORD64 Dr1;
    DWORD64 Dr2;
    DWORD64 Dr3;
    DWORD64 Dr6;
    DWORD64 Dr7;

    //
    // Integer registers.
    //

    DWORD64 Rax;
    DWORD64 Rcx;
    DWORD64 Rdx;
    DWORD64 Rbx;
    DWORD64 Rsp;
    DWORD64 Rbp;
    DWORD64 Rsi;
    DWORD64 Rdi;
    DWORD64 R8;
    DWORD64 R9;
    DWORD64 R10;
    DWORD64 R11;
    DWORD64 R12;
    DWORD64 R13;
    DWORD64 R14;
    DWORD64 R15;

    //
    // Program counter.
    //

    DWORD64 Rip;

    //
    // Floating point state.
    //

    union {
        XMM_SAVE_AREA32 FltSave;
        struct {
            M128A Header[2];
            M128A Legacy[8];
            M128A Xmm0;
            M128A Xmm1;
            M128A Xmm2;
            M128A Xmm3;
            M128A Xmm4;
            M128A Xmm5;
            M128A Xmm6;
            M128A Xmm7;
            M128A Xmm8;
            M128A Xmm9;
            M128A Xmm10;
            M128A Xmm11;
            M128A Xmm12;
            M128A Xmm13;
            M128A Xmm14;
            M128A Xmm15;
        } DUMMYSTRUCTNAME;
    } DUMMYUNIONNAME;

    //
    // Vector registers.
    //

    M128A VectorRegister[26];
    DWORD64 VectorControl;

    //
    // Special debug control registers.
    //

    DWORD64 DebugControl;
    DWORD64 LastBranchToRip;
    DWORD64 LastBranchFromRip;
    DWORD64 LastExceptionToRip;
    DWORD64 LastExceptionFromRip;
} CONTEXT, *PCONTEXT;

> So in order to make this platform agnostic, it would seem that we can not use the PAL extended context to help save/restore some of this state, correct?

I believe the "correct" thing, particularly on Windows, is to get the base offset of the APX extended state by querying LocateXStateFeature with the XSTATE_MASK_APX control flag. Ideally we'd have a PAL equivalent of this helper so that we can keep the same abstraction on Unix platforms.

I don't think the GC has had to care about XSTATE up until this point since it's really only been interested in general-purpose registers, which were all part of the baseline


It looks like the DT_CONTEXT support used by the debugger is generally expecting the same and has some comments about the mismatch of it with T_CONTEXT defined by the PAL and how that is handled.

The debugger team likely needs to comment on how they expect cross debugging to work given the context and that there doesn't appear to be a way to query the extended XSTATE features in a cross platform way. If there were, which likely just requires them to have a way to call LocateXStateFeature and for the PAL to provide such a function, I would expect that all extended registers would be possible to make work (including YMM, ZMM, KMASK, and APX; none of which work and all of which have tracking issues; the YMM support in particular having been missing for 11 years)

@anthonycanino
Contributor

> I believe the "correct" thing, particularly on Windows, is to get the base offset of the APX extended state by querying LocateXStateFeature with the XSTATE_MASK_APX control flag. Ideally we'd have a PAL equivalent of this helper so that we can keep the same abstraction on Unix platforms.

Do you think it's worth mimicking LocateXStateFeature in the PAL then? I suppose on Linux, something like LocateXStateFeature can return the offset into the defined CONTEXT extended registers. My only concern is, are we ok to adjust the PAL and get this in for RC1?

@tannergooding
Member

tannergooding commented Jul 16, 2025

> Do you think its worth to mimic the LocateXStateFeature in the PAL then? I suppose on linux, something like LocateXStateFeature can return the offset into the defined CONTEXT extended registers. My only concern is, are we ok to adjust the PAL and get this in for RC1?

I would defer to @jkotas and the debugger team. I expect that for the APX feature to work on Windows, we will have to use LocateXStateFeature and so to me it would make sense to mirror that on Unix via the PAL so that the same logic can work for both without #ifdef or specialization.

As iterated above, I believe the GC hasn't had to deal with this yet simply because it's only been concerned about looking at general purpose registers and those have always been part of the core state. APX is changing that to have some general-purpose registers as part of the XSTATE.

The debugger likewise has an 11 year old issue of not being able to observe or surface extended state. As such, you cannot observe YMM, ZMM, or KMASK state in the register window and a few other scenarios. Locals are still observable because they get spilled to the stack and the debugger just reads the memory for each backing field (there are some quirks with Vector<T>, but that's not really related: #9688).

Them not looking at extended state hasn't been a critical priority up to this point because they still pass CONTEXT through untouched for any sections they don't understand. So it's largely just been a painpoint for the end user experience when doing some more niche scenarios, like trying to view the register window for managed code.

The GC starting to have values stored in the XSTATE area is really the "forcing factor" here, since it will have to start observing these registers for correctness reasons. The debugger would be great to fix, but I don't think the register window not working is "blocking" (it hasn't been blocking for any of the other areas we've touched).

@noahfalk
Member

> The debugger team likely needs to comment...

I've got this in my todo list but I am heads down at the moment making sure one of our new diagnostics features is working properly for the Preview7 snap coming imminently. Hopefully next week things will be a little calmer.

ULONGLONG **ppRax = &pRD->pCurrentContextPointers->Rax;
#endif
if(ExecutionManager::GetEEJitManager()->IsAPXSupported() && regNum >= 16)
Member

When APX is not supported, we should never see regNum >= 16 here. It is sufficient and more efficient to just check regNum >= 16.

The IsAPXSupported check can be an assert.

Also, GCInfoDecoder is used in a number of cross-process diagnostic/debugger scenarios (see https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/gcinfodecoder.h#L13). If this code depended on checking IsAPXSupported, you would have to figure out how to get this value for the target.
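
The suggestion can be sketched as follows: branch only on the register number and treat APX support as an invariant to assert. This is a simplified model rather than the GCInfoDecoder code; `MockRegDisplay` and `GetRegisterSlot` are hypothetical names, and the `apxSupported` field stands in for however the target's APX state would be obtained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified register display: 16 legacy slots plus 16 slots
// for the extended registers (R16-R31).
struct MockRegDisplay
{
    uint64_t* legacy[16];
    uint64_t* extended[16];
    bool apxSupported; // stand-in for the target's APX state
};

// Returns the address of the context-pointer slot for regNum.
inline uint64_t** GetRegisterSlot(MockRegDisplay* rd, int regNum)
{
    assert(regNum >= 0 && regNum < 32);
    if (regNum >= 16)
    {
        // GC info should only name R16+ when APX is in use, so this is an
        // invariant to assert rather than part of the branch condition.
        assert(rd->apxSupported);
        return &rd->extended[regNum - 16];
    }
    return &rd->legacy[regNum];
}
```

In release builds the assert compiles away, so the hot path only compares `regNum` against 16, and cross-process consumers do not need the target's APX state to decode.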

@jkotas
Member

jkotas commented Jul 17, 2025

> GC

Terminology nit: I would call this "GC stack root enumeration" to avoid confusion. Unqualified "GC" typically means the code under src\coreclr\gc, which I do not expect to be affected by any of these changes.

@jkotas
Member

jkotas commented Jul 17, 2025

> My only concern is, are we ok to adjust the PAL and get this in for RC1?

I think these changes should wait for .NET 11. We have started stabilizing .NET 10 (main is going to switch to .NET 11 in early August). We do not want to be merging new potentially destabilizing features for the next few weeks.

Is there a reason why this change needs to be in .NET 10?

@jkotas
Member

jkotas commented Jul 17, 2025

> Do you think its worth to mimic the LocateXStateFeature in the PAL then?

It is probably the easiest option. It matches current CoreCLR PAL architecture that tries to emulate Windows.

An alternative is to create our own context abstraction that is optimized for the given job. PAL_LIMITED_CONTEXT in NativeAOT is on this plan.

…lso fix small bugs and remove ifdefs for linux wherever possible.
// Stores the ISA capability of the hardware
int cpuFeatures = 0;
#if defined(TARGET_AMD64)
inline bool IsAPXSupported()
Member

@jkotas jkotas Jul 17, 2025

This is unnecessary. We should introduce this in the GC only if the GC ever needs conditional paths for Apx - that is unlikely.

@anthonycanino
Contributor

> My only concern is, are we ok to adjust the PAL and get this in for RC1?

> I think these changes should wait for .NET 11. We have started stabilizing .NET 10 (main is going to switch to .NET 11 early august). We do not want to be merging new potentially destabilizing features for next few weeks.
>
> Is there a reason why this change needs to be in .NET 10?

We've been developing APX support for .NET 10 so as to enable it as an opt-in preview feature. If changing the PAL is too much, it would be helpful to allow some kind of solution that would only be enabled if the APX flags are explicitly turned on.

@noahfalk
Member

noahfalk commented Aug 2, 2025

> I would defer to @jkotas and the debugger team. I expect that for the APX feature to work on Windows, we will have to use LocateXStateFeature and so to me it would make sense to mirror that on Unix via the PAL so that the same logic can work for both without #ifdef or specialization.

I've been looking into this more. The current way we are doing our CONTEXT handling appears a bit fragile and challenging to maintain as we want to add support for more registers. A couple thoughts:

  1. The intent of T_CONTEXT and DT_CONTEXT is that these should match the definition of CONTEXT for the targeted platform. In most cases that appears true, but ever since the change to PAL CONTEXT in March 2023 they've been out-of-sync on non-Windows. CONTEXT has XState included on ARM64 and AMD64, T_CONTEXT has XState included if the debugger was compiled for HOST_UNIX but not HOST_WINDOWS, and DT_CONTEXT never has XState included regardless of HOST. This PR shows some examples of what breaks when the types are out-of-sync and how the behavior of some of our public APIs currently relies on the CONTEXT size remaining unchanged. There are other places in the code which are going unnoticed either because they aren't well tested or they only run on Windows where the type definitions still match. For example:
        // Since Dac + DBI are tightly coupled, context sizes should be the same.
        if (cbSizeContext != sizeof(T_CONTEXT))
        {
            ThrowHR(E_INVALIDARG);
        }

I'd recommend rather than allowing them to remain out-of-sync we should restore PAL CONTEXT to its fixed pre-March 2023 definition and define new PAL APIs like LocateXStateFeature to work with the additional registers.

  2. It's challenging to look at any particular function in the runtime repo that manipulates CONTEXT/T_CONTEXT/DT_CONTEXT and know whether that code path properly handles (or even tries to handle) the variable-sized XState portions of CONTEXT. In place of using CONTEXT* everywhere we might define some new type like CONTEXT_BUFFER* which provides a much stronger clue that the code using it is XState-enlightened. If we define sizeof(CONTEXT_BUFFER)=1 and make a private copy constructor, then switching from CONTEXT* to CONTEXT_BUFFER* throughout our code would also surface many existing fixed-size assumptions as compiler errors or runtime errors. We could also define T_CONTEXT_BUFFER* to be used in place of T_CONTEXT* and DT_CONTEXT*. Hopefully only some much smaller number of code locations would ever cast CONTEXT_BUFFER* to CONTEXT* in order to directly read/write individual registers.

  3. We currently have three separate definitions of CONTEXT data structures (dbgtargetcontext.h, crosscompile.h, and pal.h). If we agree on keeping them fixed at the pre-March 2023 definitions we could just define the structures once and use typedef to define the others. We could also get rid of DT_CONTEXT entirely and consolidate on T_CONTEXT.
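
The CONTEXT_BUFFER idea from the list above can be sketched in a few lines. This is one possible shape under stated assumptions, not a proposed definition: `MockContext`, `MockContextBuffer`, `AsContextBuffer`, and `BaseOf` are hypothetical names, and the sketch uses deleted copy operations where the comment mentions a private copy constructor.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical fixed-size base record; the real CONTEXT has many more fields.
struct MockContext
{
    uint32_t ContextFlags;
    uint64_t Rax;
};

// Opaque handle type for a variable-sized context. It has no members, so its
// size says nothing about the layout, and it cannot be created or copied by
// value. Code that assumes sizeof(CONTEXT) or copies a context stops compiling.
class MockContextBuffer
{
public:
    MockContextBuffer() = delete;
    MockContextBuffer(const MockContextBuffer&) = delete;
    MockContextBuffer& operator=(const MockContextBuffer&) = delete;
};

static_assert(sizeof(MockContextBuffer) == 1, "size carries no layout information");
static_assert(!std::is_copy_constructible<MockContextBuffer>::value, "copies must go through an API");

// Hypothetical helper: view an existing context as an opaque buffer.
inline MockContextBuffer* AsContextBuffer(MockContext* ctx)
{
    return reinterpret_cast<MockContextBuffer*>(ctx);
}

// The one sanctioned place to get back to the fixed-size base registers.
inline MockContext* BaseOf(MockContextBuffer* buf)
{
    return reinterpret_cast<MockContext*>(buf);
}
```

Funneling every cast through a helper like `BaseOf` would make the remaining direct register accesses easy to find and audit.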

None of this has to happen as part of this PR, but we would need to address it at some point as part of the path to making the debugger functional with these registers in use. There would almost certainly be other elements of the work as well. I agree with you @tannergooding that using new registers as general purpose registers indeed raises the bar on what portions of the debugger functionality need to handle them correctly. When folks are ready I'm happy to discuss that in more detail, but it's probably out-of-scope for this PR.

@risc-vv

risc-vv commented Aug 6, 2025

@dotnet/samsung Could you please take a look? These changes may be related to riscv64.

@anthonycanino
Contributor

Just an FYI. We implemented a workaround to prevent GC references from using the APX extended GPRs here #117991 (comment), in order to allow for the APX features in .NET 10 without requiring these context changes.

We plan to pick this item up for .NET 11.

Labels
area-VM-coreclr community-contribution Indicates that the PR has been added by a community member

6 participants