Skip to content

Enable compiling with crossgen2 #32508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
9 commits merged into from
May 24, 2021
Merged

Enable compiling with crossgen2 #32508

9 commits merged into from
May 24, 2021

Conversation

dougbu
Copy link
Member

@dougbu dougbu commented May 7, 2021

@ghost ghost added the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label May 7, 2021
@dougbu
Copy link
Member Author

dougbu commented May 7, 2021

So far, have only rebased on latest 'main'. Working on test runs using the crossgen2ed assemblies.

@@ -65,7 +65,7 @@ jobs:
- script: ./eng/build.cmd -ci -nobl -noBuildRepoTasks -noRestore -test -all -noBuildJava -noBuildNative
-projects eng\helix\helix.proj /p:RunQuarantinedTests=true /p:IsRequiredCheck=true /p:IsHelixJob=true
/p:BuildInteropProjects=true /p:RunTemplateTests=true
/p:CrossgenOutput=false /p:ASPNETCORE_TEST_LOG_DIR=artifacts/log
/p:ASPNETCORE_TEST_LOG_DIR=artifacts/log
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!revert! before squish and merge

<Reference Include="Microsoft.NETCore.App.Crossgen2.$(BuildOsName)-$(BuildArchitecture)"
ExcludeAssets="All"
PrivateAssets="All"
GeneratePathProperty="true" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton, questions from #31778:

I suggest Microsoft.NETCore.App.Crossgen2.* packages should contain a same-named .props file that sets a property for the tool path. This didn't make sense for Microsoft.NETCore.App.Runtime.* packages but definitely does now. Wouldn't need the path property nor some of the later complications finding the tool if package were more self-describing.

Any possibility of removing the need for workarounds (hacks) like https://github.com/dotnet/aspnetcore/pull/31778/files#diff-eb8fd31a7971d712606a187ea074d0bd5d6ea85aae5a707107fd998216986ea5R117, https://github.com/dotnet/aspnetcore/pull/31778/files#diff-eb8fd31a7971d712606a187ea074d0bd5d6ea85aae5a707107fd998216986ea5R365-R368, and related bits in the future❔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those may be removed in the future, but not in the next month or two. I'm plenty busy with other build system issues.

@dougbu dougbu force-pushed the dougbu/crossgen2 branch 3 times, most recently from baf2973 to 44c150b Compare May 10, 2021 21:31
@dougbu dougbu force-pushed the dougbu/crossgen2 branch from 44c150b to f33d88e Compare May 14, 2021 02:35
@dougbu
Copy link
Member Author

dougbu commented May 14, 2021

@davidwrighton @agocke @Lxiamail finally seeing the expected BadImageFormatExceptions now❕ May be close to done w/ testing.

@dougbu
Copy link
Member Author

dougbu commented May 14, 2021

@sebastienros heads up. When I merge this (after removing the last two commits) tomorrow (again, I hope), may see an impact on the perf results. @davidwrighton suspects the initial gains will be modest but we should expect more over time. Let us know if perf decreases.

@dougbu
Copy link
Member Author

dougbu commented May 14, 2021

@davidwrighton help please. I'm seeing test failures I didn't expect e.g. at https://dev.azure.com/dnceng/public/_build/results?buildId=1138686&view=ms.vss-test-web.build-test-results-tab&runId=34540072&resultId=123944&paneView=debug

Assert.Throws() Failure
Expected: typeof(System.InvalidOperationException)
Actual:   typeof(System.IO.FileLoadException): Could not load file or assembly 'Microsoft.Extensions.FileProviders.Embedded, Version=42.42.42.42, Culture=neutral, PublicKeyToken=adb9793829ddae60' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
---- System.IO.FileLoadException : Could not load file or assembly 'Microsoft.Extensions.FileProviders.Embedded, Version=42.42.42.42, Culture=neutral, PublicKeyToken=adb9793829ddae60' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

In this case, build was on BuildPool.Server.Amd64.VS2019.Open and test ran on a Windows.10.Amd64.Server20H2.Open Helix agent.

@davidwrighton
Copy link
Member

@dougbu, I think those test failures are an issue caused by running on the wrong runtime.

Note the result for https://dev.azure.com/dnceng/public/_build/results?buildId=1138686&view=ms.vss-test-web.build-test-results-tab&runId=34540072&resultId=123876&paneView=debug which actually passes.

I can see in AzDo, that the failing test ran as part of workitem { "HelixJobId": "91361424-0db7-42aa-ba2d-5656f83b8c40", "HelixWorkItemName": "Microsoft.Extensions.FileProviders.Embedded.Tests--net461" } and the successful one ran as part of { "HelixJobId": "91361424-0db7-42aa-ba2d-5656f83b8c40", "HelixWorkItemName": "Microsoft.Extensions.FileProviders.Embedded.Tests--net6.0" }

Looking at those names, I'm suspicious. Does that imply that the failing test actually tried to run the binary on the .NET 4.6.1 runtime, and the succeeding test ran on the .NET 6 runtime. If so, that is an indication that the test passed just fine as we don't expect these images to actually work on .NET 4.6.1 after running crossgen.

OverwriteReadOnlyFiles="$(OverwriteReadOnlyFiles)"
Retries="$(CopyRetryCount)"
RetryDelayMilliseconds="$(CopyRetryDelayMilliseconds)"
SkipUnchangedFiles="$(SkipCopyUnchangedFiles)" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that imply that the failing test actually tried to run the binary on the .NET 4.6.1 runtime, and the succeeding test ran on the .NET 6 runtime.

Yes, that's almost definitely what it means. I didn't consider the TFM in this block. Should only <Copy/> when targeting net6.0. I'll add a commit or two to (a) focus temporarily on Windows ➡️ Windows and Linux ➡️ Linux jobs (because the other combinations are failing as consistently as we expect) and (b) fix this targeting issue.

@dougbu dougbu force-pushed the dougbu/crossgen2 branch 2 times, most recently from 9cc9ddf to 885c31a Compare May 16, 2021 04:36
@dougbu dougbu force-pushed the dougbu/crossgen2 branch from 3b053cb to 8b0e185 Compare May 17, 2021 00:04
@dougbu dougbu force-pushed the dougbu/crossgen2 branch from efe47d8 to a6a5130 Compare May 17, 2021 05:03
@dougbu
Copy link
Member Author

dougbu commented May 18, 2021

🆗 @davidwrighton we're down to two distinct problems that I'm not understanding

  1. The Caching_SendFileWithFullContentLength_Cached() test in src/Servers/HttpSys/test/FunctionalTests/ResponseCachingTests.cs is failing consistently. I suspect a relative speed-up or slowdown of the executing code means the client portion is taking longer than 10 seconds before it issues the second request. (Client must send second request before HTTP cache expires for the test to pass.)
  2. Our "Tests: Helix ARM64 matrix" job in the aspnetcore-helix-matrix pipeline is consistently running out of space before getting any useful test results e.g. https://dev.azure.com/dnceng/public/_build/results?buildId=1141412&view=logs&j=99ff3a02-281e-524b-bc15-c0daf2fc9715. This likely means assemblies crossgen2ed on Ubuntu 18.04 are consistently failing on Redhat 7, Alpine 3.12, and so on i.e. we're unable to store all of the test failure information.

Neither of these problems block the real work because we don't normally crossgen or crossgen2 before submitting Helix jobs. But, could either indicate a real issue in the assemblies we ship❔

@dougbu
Copy link
Member Author

dougbu commented May 18, 2021

a relative speed-up or slowdown of the executing code

Hmm, that's not very clear. I mean either the client code (that in the using block) isn't performing as well as it did before or the other tests are running faster, leading to new contention for HTTP connections. @Tratcher what do you think is likely❔

Background: This PR is both using crossgen2 and hacking things in order to test everything against the crossgen2ed assemblies. We don't use crossgen or crossgen2 in Helix tests outside this PR.

@Tratcher
Copy link
Member

Don't fret over Caching_SendFileWithFullContentLength_Cached, those tests have never been the most reliable. You can quarantine it and open an issue.

@davidwrighton
Copy link
Member

@dougbu I can't tell you if Caching_SendFileWithFullContentLength_Cached is a real issue. I would rely on your team to understand the test results.

With respect to the arm64 disk space failure, I suspect that you are getting very close to running out of disk space on all architectures, but that the arm64 binaries are larger than on other architectures. (This is expected. Our arm64 compiler produces a larger output than our x64 compiler). I see that you've injected a disk space check into the set of jobs, but that it executes very early. I would like to see a second Disk Size check, but after building the shared fx. My guess is that the available disk space will be rather tight, but I don't have a good sense on just how tight. Alternatively, could you run those tests using one of the Helix build machines that has more available hard disk?

@dougbu
Copy link
Member Author

dougbu commented May 19, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@dougbu dougbu force-pushed the dougbu/crossgen2 branch from a6a5130 to 9b73a69 Compare May 19, 2021 22:42
@dougbu
Copy link
Member Author

dougbu commented May 19, 2021

With respect to the arm64 disk space failure, I suspect that you are getting very close to running out of disk space on all architectures, but that the arm64 binaries are larger than on other architectures.

Resolved this in my "Move to Core-Eng build agents" commit. But, the size difference we're talking about is pretty big. A regular build seems to use 16G of diskspace and disks on the hosted agents tend to have ~22G available on the / disk. We'll see if the switch Just Works™️

@dougbu dougbu requested a review from a team May 20, 2021 23:16
@dougbu dougbu marked this pull request as ready for review May 20, 2021 23:16
@dougbu dougbu requested a review from Pilchie as a code owner May 20, 2021 23:16
@dougbu
Copy link
Member Author

dougbu commented May 21, 2021

/ping reviewers. This is green (🍏📗🟢💚). Thanks @davidwrighton

@dougbu
Copy link
Member Author

dougbu commented May 21, 2021

/ping reviewers

Copy link
Member

@wtgodbe wtgodbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes all look reasonable, but I suggest adding a bunch of comments to Microsoft.AspNetCore.App.Runtime.csproj where you made changes - many of the new properties/items/targets are pretty esoteric/non-obvious

@dougbu
Copy link
Member Author

dougbu commented May 24, 2021

many of the new properties/items/targets are pretty esoteric/non-obvious

That's a bit strange since what @davidwrighton did exactly matches what we have for crossgen. Would it help to burn the crossgen infrastructure w/ fire❔

@wtgodbe
Copy link
Member

wtgodbe commented May 24, 2021

What I mean is that somebody reading that file might be confused as to the purpose of the properties you're setting, etc. in this PR. That's probably true for crossgen too. Comments explaining their purpose would be useful

davidwrighton and others added 9 commits May 24, 2021 14:43
- should resolve problems w/ disk space in "Tests: Helix ARM64 matrix" job
- need `node` in this job for some reason
- remove many properties related only to the `crossgen` tool
  - may have missed a few of course

nits:
- `crossgen` -> `crossgen2` in a few comments
- add / expand a few comments to improve clarity
@dougbu dougbu force-pushed the dougbu/crossgen2 branch from 2873e63 to 8899ef3 Compare May 24, 2021 22:58
@dougbu
Copy link
Member Author

dougbu commented May 24, 2021

🆗 I've removed crossgen stuff as much as I had time to handle right now. May be worth going through all of the properties and item groups in Microsoft.AspNetCore.App.Runtime.csproj to make sure all are used elsewhere in the file or by the SDK. But, this at least simplifies the file slightly.

Two main simplifications:

  1. No longer need to copy the crossgen tool into the runtime packages we produce. @Pilchie should we announce the removal❔
    • @davidwrighton will Microsoft.NETCore.App.Runtime packages change similarly❔ If yes, may need only one announcement😀
  2. Only run one crossgen2 command for both assemblies and symbols instead of the two we needed when using crosssgen

Less important but it is nice to require slightly fewer properties and not to need an extra slash at the end of $(Crossgen2PlatformAssemblyPaths) (unlike $(CrossgenPlatformAssemblyPaths).

@dougbu
Copy link
Member Author

dougbu commented May 24, 2021

What I mean is that somebody reading that file might be confused as to the purpose of the properties you're setting, etc. in this PR. That's probably true for crossgen too. Comments explaining their purpose would be useful

I think part of what you were seeing was the new properties weren't in the same hunk as the old ones, hiding a few existing comments. Let me know if the current file looks better.

@ghost
Copy link

ghost commented May 24, 2021

Hello @dougbu!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 46b56aa into main May 24, 2021
@ghost ghost deleted the dougbu/crossgen2 branch May 24, 2021 23:53
@davidwrighton
Copy link
Member

@dougbu We're hoping to remove crossgen from the runtime packages soonish, but its a slow process moving everyone off of it.

@ghost
Copy link

ghost commented May 25, 2021

Hi @davidwrighton. It looks like you just commented on a closed PR. The team will most probably miss it. If you'd like to bring something important up to their attention, consider filing a new issue and add enough details to build context.

@dougbu
Copy link
Member Author

dougbu commented May 25, 2021

/backport to release/6.0-preview5

@github-actions
Copy link
Contributor

Started backporting to release/6.0-preview5: https://github.com/dotnet/aspnetcore/actions/runs/875859566

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants