Description
When running our test suite, we got a crash: The process was terminated due to an internal error in the .NET Runtime at IP 00007FFDC50AB5FC (00007FFDC5010000) with exit code 80131506:
Faulting application name: dotnet.exe, version: 6.0.222.6406, time stamp: 0x61e1d8df
Faulting module name: coreclr.dll, version: 6.0.222.6406, time stamp: 0x61e1d09e
Exception code: 0xc0000005
Fault offset: 0x000000000009b5fc
Faulting process id: 0x22b4
Faulting application start time: 0x01d823e188848226
Faulting application path: C:\Program Files\dotnet\dotnet.exe
Faulting module path: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\6.0.2\coreclr.dll
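As a quick sanity check on the numbers above, the reported IP can be recomputed from the fault offset. This is only a sketch: it assumes that the value in parentheses in the crash message (00007FFDC5010000) is the load address of coreclr.dll, which the message itself does not state.

```csharp
using System;

// Values copied from the crash message and the fault details above.
// Assumption: the parenthesized value is the base address of coreclr.dll.
const ulong moduleBase = 0x00007FFDC5010000UL;
const ulong faultOffset = 0x000000000009B5FCUL; // "Fault offset"
const ulong reportedIp = 0x00007FFDC50AB5FCUL;  // IP from the crash message

// If the assumption holds, base + offset should equal the faulting IP.
ulong computedIp = moduleBase + faultOffset;

Console.WriteLine($"computed IP = 0x{computedIp:X16}");
Console.WriteLine($"matches reported IP: {computedIp == reportedIp}");
```

The two values agree, which is consistent with the fault being inside coreclr.dll itself rather than in a module loaded at a different address.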
We have automatic memory dump creation configured, which produced the following memory dump:
https://drive.google.com/file/d/19S1k74Foe9V6A03hRwIuebE42GVQUirI/view?usp=sharing
In our project (github.com/ravendb/ravendb) we use unmanaged memory directly, so the crash might be caused by our own code.
So far, the following analysis has been done in WinDbg.
- Based on !analyze -v, the crashing stack trace is:
EXCEPTION_RECORD: (.exr -1)
ExceptionAddress: 00007ffdc50ab5fc (coreclr!WKS::gc_heap::mark_object_simple+0x000000000000011c)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000001
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000022da7b73000
Attempt to read from address 0000022da7b73000
coreclr!WKS::gc_heap::mark_object_simple+0x11c
coreclr!WKS::GCHeap::Promote+0x74
coreclr!GcEnumObject+0x76
coreclr!GcInfoDecoder::EnumerateLiveSlots+0x792
coreclr!EECodeManager::EnumGcRefs+0xe9
coreclr!GcStackCrawlCallBack+0x12f
coreclr!Thread::StackWalkFramesEx+0xee
coreclr!Thread::StackWalkFrames+0xae
coreclr!ScanStackRoots+0x7a
coreclr!GCToEEInterface::GcScanRoots+0x9f
coreclr!WKS::gc_heap::mark_phase+0x291
coreclr!WKS::gc_heap::gc1+0x98
coreclr!WKS::gc_heap::garbage_collect+0x1ad
coreclr!WKS::GCHeap::GarbageCollectGeneration+0x14f
coreclr!WKS::gc_heap::trigger_gc_for_alloc+0x2b
coreclr!WKS::gc_heap::try_allocate_more_space+0x5c141
coreclr!WKS::gc_heap::allocate_more_space+0x31
coreclr!WKS::GCHeap::Alloc+0x84
coreclr!JIT_NewArr1+0x4bd
0x00007ffd`778e62c6
0x00007ffd`6f3e3770
0x00007ffd`741c1efb
...
0x00007ffd`67d765da
0x00007ffd`67deeff2
coreclr!CallDescrWorkerInternal+0x83
coreclr!DispatchCallSimple+0x80
coreclr!ThreadNative::KickOffThread_Worker+0x63
coreclr!ManagedThreadBase_DispatchMiddle+0x85
coreclr!ManagedThreadBase_DispatchOuter+0xae
coreclr!ThreadNative::KickOffThread+0x79
kernel32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21
- The heap is corrupted:
0:340> !verifyheap
object 0000022da400fff8: bad member 0000022D04C05821 at 0000022DA4010000
Last good object: 0000022DA400FFE0.
- The last good object is:
0:340> !do 0000022DA400FFE0
Name: Sparrow.Utils.TimeoutManager+<>c__DisplayClass6_0
MethodTable: 00007ffd67a1a6e0
EEClass: 00007ffd67a24988
Tracked Type: false
Size: 24(0x18) bytes
File: c:\Jenkins\workspace\PR_Tests\s\test\SlowTests\bin\Release\net6.0\Sparrow.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffd679b44a0 40002eb 8 ...Private.CoreLib]] 0 instance 0000022da4010160 onCancel
I see an onCancel member, so it is likely the following code from TimeoutManager.cs:
var onCancel = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
using (token.Register(tcs => onCancel.TrySetCanceled(), onCancel))
{
}
- Bad object is:
0:340> !do 0000022da400fff8
Name: System.Action`1[[System.Object, System.Private.CoreLib]]
MethodTable: 00007ffd66a69428
EEClass: 00007ffd658f6788
Tracked Type: false
Size: 64(0x40) bytes
File: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\6.0.2\System.Private.CoreLib.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffd655a5678 40001ec 8 System.Object 0 instance 0000022d04c05821 _target
00007ffd655a5678 40001ed 10 System.Object 0 instance 0000000000000000 _methodBase
00007ffd65654228 40001ee 18 System.IntPtr 1 instance 00007FFD67569160 _methodPtr
00007ffd65654228 40001ef 20 System.IntPtr 1 instance 0000000000000000 _methodPtrAux
00007ffd655a5678 4000273 28 System.Object 0 instance 0000000000000000 _invocationList
00007ffd65654228 4000274 30 System.IntPtr 1 instance 0000000000000000 _invocationCount
It is a System.Action`1[[System.Object, System.Private.CoreLib]], so my suspicion is that it is this action: tcs => onCancel.TrySetCanceled().
Attempting to dump its _target results in:
!DumpObj /d 0000022d04c05821
<Note: this object has an invalid CLASS field>
Invalid object
The address matches the !verifyheap output (bad member 0000022D04C05821 at 0000022DA4010000), so we know that the corrupted member is _target.
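The address relationships in the steps above can be checked with a little arithmetic. This is a sketch under two assumptions that the dump output does not spell out: that the Offset column printed by !do is measured from the start of the object, and that object references on a 64-bit runtime are expected to be 8-byte aligned.

```csharp
using System;

// Values copied from the !verifyheap and !do output above.
const ulong lastGoodObject = 0x0000022DA400FFE0UL;    // "Last good object"
const ulong lastGoodSize = 0x18UL;                    // Size of the DisplayClass6_0 instance
const ulong badObject = 0x0000022DA400FFF8UL;         // the Action`1 instance
const ulong targetFieldOffset = 0x8UL;                // Offset of _target (assumed relative to object start)
const ulong badMemberLocation = 0x0000022DA4010000UL; // "bad member ... at" address
const ulong badMemberValue = 0x0000022D04C05821UL;    // value stored in _target

// The bad object starts exactly where the last good object ends.
bool objectsAreAdjacent = lastGoodObject + lastGoodSize == badObject;

// The slot reported by !verifyheap is the _target field of the bad object.
bool slotIsTarget = badObject + targetFieldOffset == badMemberLocation;

// Assumption: a valid reference would be 8-byte aligned; this value is odd.
bool valueIsAligned = badMemberValue % 8 == 0;

Console.WriteLine($"objects adjacent: {objectsAreAdjacent}");
Console.WriteLine($"slot is _target:  {slotIsTarget}");
Console.WriteLine($"value aligned:    {valueIsAligned}");
```

Under those assumptions, the first two checks hold and the third does not, which fits the reading that the _target slot of the delegate contains a value that is not an object reference.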
- I see that our code uses the onCancel variable directly in tcs => onCancel.TrySetCanceled() instead of using the callback's state argument: tcs => ((TaskCompletionSource<object>)tcs).TrySetCanceled(), but effectively it's the same thing. Could it cause any GC problems and result in something like that?
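To make the two forms concrete, here is a minimal, self-contained sketch that registers both variants on the same token. The helper names RegisterWithCapture and RegisterWithState are hypothetical and not part of TimeoutManager.cs, and unlike the original code the sketch does not dispose the registrations. In the capturing variant the compiler generates a closure class that holds onCancel, and the delegate's _target refers to that closure instance, which fits the DisplayClass6_0 object seen in the dump.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Variant A (as in our code): the lambda ignores its argument and captures
// the local variable, so the delegate targets a compiler-generated closure.
Task<object> RegisterWithCapture(CancellationToken token)
{
    var onCancel = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
    token.Register(tcs => onCancel.TrySetCanceled(), onCancel);
    return onCancel.Task;
}

// Variant B: the lambda captures nothing and uses the state argument instead.
Task<object> RegisterWithState(CancellationToken token)
{
    var onCancel = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
    token.Register(tcs => ((TaskCompletionSource<object>)tcs).TrySetCanceled(), onCancel);
    return onCancel.Task;
}

using var cts = new CancellationTokenSource();
Task<object> captured = RegisterWithCapture(cts.Token);
Task<object> viaState = RegisterWithState(cts.Token);

// Cancel() invokes the registered callbacks before it returns.
cts.Cancel();

Console.WriteLine($"capture variant canceled: {captured.IsCanceled}");
Console.WriteLine($"state variant canceled:   {viaState.IsCanceled}");
```

Both variants cancel the task in the same way; the only difference this sketch shows is whether the delegate targets a per-call closure object or relies on the state argument.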