-
Notifications
You must be signed in to change notification settings - Fork 5k
fallback to GetAddrInfoW if GetAddrInfoExW fails #45816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @dotnet/ncl Issue DetailsWhile root cause why GetAddrInfoExW() is failing from impersonificated context is still investigated this change should make Async DNS working again on Windows when running from impersonificated context. We do not have infrastructure for automated authenticatiobn tests but I run the repro code manually on my Windows 10 and it works now eg. both explicit DNS lookup and HttpClient Async fetch works.
|
Also, is it possible to add a test for these changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have we talked to Windows team to see if there's a way we can keep using async here?
no, not really. This was part of the Enterprise-auth project but it got de-prioritized. |
Co-authored-by: Cory Nelson <[email protected]>
src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs
Outdated
Show resolved
Hide resolved
…olutionPal.Windows.cs Co-authored-by: Jan Kotas <[email protected]>
src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs
Show resolved
Hide resolved
src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs
Show resolved
Hide resolved
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
return result; | ||
if (NameResolutionPal.TryGetAddrInfoAsync(hostName, justAddresses, addressFamily, cancellationToken, out t)) | ||
{ | ||
t.ContinueWith((resultTask) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be returning the Task returned by ContinueWith
, or the ActivityId following the Dns resolution wouldn't reflect the Stop event.
This was the issue with the original Telemetry instrumentation of Dns, pointed out by @ManickaP in #41670 (comment) (and fixed in #42188).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this will break the telemetry once again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had different change where telemetry test was failing. With this on, telemetry tests are still passing. The log still be edited but I'm not sire about the ID. If we return the ContinueWith, we will need to copy result (or failure). I will look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests will only validate that the Start/Stop are logged with the same ActivityId.
The issue here appears when you have another Start/Stop pair following the name resolution:
- HttpRequestStart starts with ID
/1/
- NameResolutionStart starts with nested ID
/1/1/
- NameResolutionStop correctly sees
/1/1/
and updates it to/1/
. - SocketConnectStart starts on the same async context as
NameResolutionStart
, but it never observes the change of the ID back to/1/
, so it starts with a nested ID/1/1/1
It will appear as if the Socket connect is an operation nested under the NameResolution Start/Stop, when in reality it's only following it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-factor the code and the activity ID is right as far as I can tell as well as running impersonificated still works.
I went back to the await
path as the ContinueWith
was getting complicated more than I wanted. The only downside is that we will miss time of the call it self.
I feel it probably does not matter comparing duration of the actual resolution.
We could probably start the timer upfront but I'm not sure if that is worth of the complication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after chatting with @MihaZupan , I decided to fix the stopwatch as well.
This should be ready for another review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Telemetry here LGTM
src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs
Outdated
Show resolved
Hide resolved
// instead of calling the synchronous version in the ThreadPool. | ||
// If the OS supports it and 'hostName' is not an IP Address, resolve the name asynchronously | ||
// instead of calling the synchronous version in the ThreadPool. | ||
// If it fails, we will fall-back to ThreadPool as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If it fails, we will fall-back to ThreadPool as well. | |
// If it fails, we will fall back to ThreadPool as well. |
{ | ||
// WSATRY_AGAIN indicates possible problem with reachability according to docs. | ||
// However, if servers are really unreachable, we would still get IOPending here | ||
// and final result would be posted via overlapped IO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not understanding the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WSATRY_AGAIN is valued error code for failures other than we see in the impersonification case. What I'm trying to say here is based on my testing and understanding, the other WSATRY_AGAIN would not be synchronous failure.
In such cases, we should return transient failure to caller rather than re-trying on thread-pool IMHO.
src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs
Outdated
Show resolved
Hide resolved
{ | ||
result = await ((Task<T>)NameResolutionPal.GetAddrInfoAsync(hostName, justAddresses, addressFamily, cancellationToken)).ConfigureAwait(false); | ||
return result; | ||
//return new Tuple<bool, Task>(true, (Task)InternalGetAddrInfoWithTelemetryAsync(task, hostName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
System.Speech.Tests test failure seems unrelated. |
/backport to release/5.0 |
Started backporting to release/5.0: https://github.com/dotnet/runtime/actions/runs/479577114 |
@wfurt backporting to release/5.0 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: fallback to GetAddrInfoW if GetAddrInfoExW fails
error: sha1 information is lacking or useless (src/libraries/System.Net.NameResolution/src/System/Net/Dns.cs).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 fallback to GetAddrInfoW if GetAddrInfoExW fails
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128 Please backport manually! |
While root cause why GetAddrInfoExW() is failing from impersonificated context is still investigated this change should make Async DNS working again on Windows when running from impersonificated context.
We do not have infrastructure for automated authenticatiobn tests but I run the repro code manually on my Windows 10 and it works now eg. both explicit DNS lookup and HttpClient Async fetch works.
fixes #29935