Skip to content

HttpClient (using SocketsHttpHandler) may cause unobserved IOException that cannot be caught #104324

@egraff

Description

@egraff

Description

We've seen a crash from one of our production applications due to an IOException being emitted to the application's TaskScheduler.UnobservedTaskException event handler. The IOException wraps a SocketException, and is related to the use of an HttpClient to do a GET request. The application is using a vanilla HttpClient (which internally will use a SocketsHttpHandler which in turn uses an HttpConnectionPool), and the call to HttpClient.GetAsync() is protected by a try-catch that catches all exceptions. The unobserved exception came from an internal _readAheadTask that is owned by System.Net.Http.HttpConnection, and the exception did not propagate up the stack back to the caller of HttpClient.GetAsync().

The call stack looks like this:

System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.CreateException(SocketError error, Boolean forAsyncThrow)
   at System.Net.Sockets.NetworkStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnection.PrepareForReuse(Boolean async)
   at System.Net.Http.HttpConnectionPool.TryGetPooledHttp11Connection(HttpRequestMessage request, Boolean async, HttpConnection& connection, HttpConnectionWaiter`1& waiter)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.HttpConnectionPool.SendAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPoolManager.SendAsyncCore(HttpRequestMessage request, Uri proxyUri, Boolean async, Boolean doRequestAuth, Boolean isProxyConnect, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPoolManager.SendAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpMessageHandlerStage.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.SocketsHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.HttpClient.GetAsync(Uri requestUri)

Note that the _readAheadTask is created by HttpConnection.PrepareForReuse(Boolean async), where there is a sync-async disassociation (PrepareForReuse() is synchronous, but NetworkStream.ReadAsync() is async, and it is the returned ValueTask from NetworkStream.ReadAsync() that is assigned to _readAheadTask). The problem is that for this particular issue and code path, _readAheadTask was not awaited or checked for completion in a way that would observe any (possible) exception, so that exception was later emitted to TaskScheduler.UnobservedTaskException after the ValueTask was garbage collected.

Reproduction Steps

I have tried to create a stress test to reproduce the issue, but have so far been unsuccessful, because it is very difficult to recreate socket/transport problems when dealing with real sockets, and it is not possible to simulate as the socket cannot be mocked out. The exception is likely caused by either a transient network problem, or a connection timing out in just the right way or at just the right time to cause an exception when NetworkStream.ReadAsync() is called by PrepareForReuse().

However, I do believe I see a problem with the implementation of PrepareForReuse() that could explain why we end up with an unobserved exception. This is described further down, under "other information".

Expected behavior

The exception should be observed by the internal implementation. Either by logging and forgetting the task exceptions – there is already another code path in HttpConnection that does this, see

if (_readAheadTask != default)
{
Debug.Assert(_readAheadTaskStatus == ReadAheadTask_CompletionReserved);
LogExceptions(_readAheadTask.AsTask());
}

– or by propagating the exception back to the caller (depending on whether the exception occurs in context of a stack frame where it is appropriate and/or possible to do this).

There should be no unobserved exceptions.

Actual behavior

An unobserved internal socket exception might be emitted to TaskScheduler.UnobservedTaskException when garbage collected, which may cause an application to crash.

Regression?

No response

Known Workarounds

No known workarounds.

Configuration

  • Seen on .NET 8.0.3
  • The application was running on Windows Server 2019.
  • The code was compiled for x64

Other information

After some investigation, this is my guess as to what is causing the problem:

The method TryGetPooledHttp11Connection() in HttpConnectionPool contains this code:

if (!connection.PrepareForReuse(async))
{
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found invalid HTTP/1.1 connection in pool.");
connection.Dispose();
continue;
}

where the relevant bit of PrepareForReuse() from HttpConnection is this:

Debug.Assert(_readAheadTaskStatus == ReadAheadTask_NotStarted);
_readAheadTaskStatus = ReadAheadTask_CompletionReserved;
// Perform an async read on the stream, since we're going to need to read from it
// anyway, and in doing so we can avoid the extra syscall.
try
{
#pragma warning disable CA2012 // we're very careful to ensure the ValueTask is only consumed once, even though it's stored into a field
_readAheadTask = _stream.ReadAsync(_readBuffer.AvailableMemory);
#pragma warning restore CA2012
return !_readAheadTask.IsCompleted;
}
catch (Exception error)
{
// If reading throws, eat the error and don't reuse the connection.
if (NetEventSource.Log.IsEnabled()) Trace($"Error performing read ahead: {error}");
return false;
}

Note that if PrepareForReuse() returns false, then the HttpConnection is disposed (and note also that there is no code in HttpConnection.Dispose() that would check the _readAheadTask for any exceptions). And for this code path, PrepareForReuse() would return false if _readAheadTask.IsCompleted is true. For the happy path, this is not a problem, but note that Task.IsCompleted may imply Task.IsFaulted. So if the call to NetworkStream.ReadAsync() completes synchronously and returns a Task that is in a faulted state, then the associated exception would be unobserved.

At the moment I cannot prove that this is the exact scenario that has happened in my case, but it seems likely that is could be. If it is possible (and I do believe it is possible) for NetworkStream.ReadAsync() to synchronously return a task that is in a faulted state (because of e.g. an IOException), then it seems the code in HttpConnectionPool and HttpConnection does not properly take this into account.

As for why I think it is possible for ReadAsync() to return a task that is already in a faulted state, consider this code:

ValueTask.FromException<int>(CreateException(error));

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions