Contention of Project Evaluation in parallel builds #7625
Comments
Thanks to "msbuild /profileEvaluation", I got a few more hints. $([Microsoft.Build.Utilities.ToolLocationHelper]::GetLatestSDKTargetPlatformVersion($(SDKIdentifier), $(SDKVersion))) takes 3-4ms warm and 180ms cold. Although the results are cached, there is a lock in RetrieveTargetPlatformList(). There are also a few instances of "exists" conditions that take 4-6ms. Hopefully those results are cached. |
Disclaimer: not a maintainer, but afaik the CachingFileSystemWrapper is used for Exists evaluation. |
Just thinking out loud, if the main MSBuild node could copy over its caches to the child nodes, then that would save load time. |
@yuehuang010 Is this still active? How serious do you think it is? What priority would you give it? |
This is important if MSBuild wants to be used as a project system and
integrated into an IDE. The perf makes it hard to scale up with many cores
and large solutions.
On Tue, Jan 10, 2023 at 9:51 PM Roman Konecny wrote:
> @yuehuang010 Is this still active? How serious do you think it is? What priority would you give it?
|
Off topic, but is there still discussion of the possibility of moving some nodes into the same process, where tasks are known not to assume their own current directory and environment block? Although there would still be serialization costs without more rearchitecture, there would be other savings. |
Without going too crazy, I think focusing on the simple problem of GetLatestSDKTargetPlatformVersion() is good enough. Have only the initial node hold on to the ToolLocationHelper data, and let the other nodes just request it. On the other hand, I hear that the multithreaded MSBuild is making progress; perhaps that is good enough. |
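The "initial node holds the data, other nodes request it" idea can be sketched roughly as follows. This is a minimal Python model and not MSBuild code: threads stand in for nodes, `OwnerNode` and `fake_latest_platform_version` are hypothetical names, and the returned version string is a placeholder. The owner still handles requests one at a time, but each request after the first is only a dictionary lookup, so the expensive cold lookup happens once.

```python
import queue
import threading


def fake_latest_platform_version(key):
    """Hypothetical stand-in for the expensive cold lookup (not a real API)."""
    sdk_identifier, sdk_version = key
    return f"{sdk_identifier}-{sdk_version}-latest"


class OwnerNode:
    """Models the initial node: the only place where the lookup is computed."""

    def __init__(self, compute):
        self._compute = compute
        self._cache = {}
        self._requests = queue.Queue()
        self.compute_calls = 0
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # Requests are handled one at a time, so no lock is needed on the cache.
        while True:
            key, reply = self._requests.get()
            if key is None:  # shutdown sentinel
                break
            if key not in self._cache:
                self.compute_calls += 1
                self._cache[key] = self._compute(key)
            reply.put(self._cache[key])

    def request(self, key):
        """Called by a worker node: send the key, block until the owner answers."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((key, reply))
        return reply.get()

    def stop(self):
        self._requests.put((None, None))
        self._thread.join()


def ask_from_workers(owner, key, worker_count):
    """Have several worker 'nodes' request the same key concurrently."""
    results = []
    workers = [
        threading.Thread(target=lambda: results.append(owner.request(key)))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

In a real multi-process build the request and reply would cross a process boundary, so the serialization cost mentioned above would still apply to every request.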
Issue Description
Project Evaluation in parallel builds has contention, causing evaluations that normally take 20-30ms to take over 1000ms.
Steps to Reproduce
Create a solution with lots of small projects, enough to saturate your CPU. I used four times as many projects as CPU threads. The contents of each project are not relevant, as I used the "Clean" target to do the least amount of work. I used nearly identical projects to remove variables. The projects have no P2P (project-to-project) references, to maximize throughput. Nodereuse:false in all cases.
Case 1:
msbuild /t:clean /bl /v:q
Case 2:
msbuild /t:clean /bl /v:q /m
I used a binlog to record the results and set verbosity to quiet to avoid console output noise. Observe the Project Evaluation times of all projects.
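For anyone who wants to set up a similar layout quickly, here is a rough Python sketch that generates many near-identical projects. It is not the setup used above: it assumes a root project that dispatches to the others through the MSBuild task with BuildInParallel instead of a solution file, and the file names (`root.proj`, `projects/pNNNN.proj`) and the `generate_repro` helper are made up for illustration. The generated projects import nothing, so their evaluation will be far cheaper than that of real projects; swap in real project content to see realistic evaluation times.

```python
import os
from pathlib import Path

# Minimal project: an empty Clean target, so the build does almost no work.
PROJECT_XML = """<Project>
  <Target Name="Clean" />
</Project>
"""

# Assumption: a root project stands in for the solution and builds every
# generated project in parallel through the MSBuild task.
ROOT_XML = """<Project>
  <ItemGroup>
    <ProjectFiles Include="projects/**/*.proj" />
  </ItemGroup>
  <Target Name="Clean">
    <MSBuild Projects="@(ProjectFiles)" Targets="Clean" BuildInParallel="true" />
  </Target>
</Project>
"""


def generate_repro(root, project_count=None):
    """Write project_count identical projects plus a root project under root."""
    root = Path(root)
    if project_count is None:
        # Four times the number of CPU threads, as in the steps above.
        project_count = 4 * (os.cpu_count() or 1)
    projects_dir = root / "projects"
    projects_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(project_count):
        path = projects_dir / f"p{i:04d}.proj"
        path.write_text(PROJECT_XML, encoding="utf-8")
        paths.append(path)
    (root / "root.proj").write_text(ROOT_XML, encoding="utf-8")
    return paths
```

After calling `generate_repro("repro")`, the two cases would be run from the `repro` directory as `msbuild root.proj /t:clean /bl /v:q /nr:false` and the same command with `/m` added.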
Data & Analysis
This image is the trace of a single-node build (case 1). Observe that each evaluation took 20-30ms, except for the initial project.

This image is the trace of a multi-node build (case 2). Observe that the first evaluation took the same time as in case 1, but once the parallel nodes started, their first evaluations took seconds. Evaluations of the subsequent projects are faster. Notice that node 1 also slows down.

Theory
I theorize there is a single-threaded file cache service that handles file IO. The file cache probably serializes the data back to the nodes while holding the lock, thus blocking other nodes from using it. Node 0 is also affected by the contention, which rules out the cost of starting a "new" node as the cause.
An alternative is an evaluation cache whose lock is held for the entire duration of an evaluation.
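To illustrate why this would matter, here is a small Python simulation of the theory. It is not MSBuild code: threads stand in for nodes, `time.sleep` stands in for serializing a reply, and the cache key is made up. When the lock is held during the slow step, the nodes are served one after another; when the lock only covers the cache read, the slow steps overlap.

```python
import threading
import time


def run_nodes(node_count, serialize_seconds, hold_lock_while_serializing):
    """Return the wall-clock time for node_count nodes to each get one reply."""
    cache_lock = threading.Lock()
    cache = {"exists:some.props": True}  # made-up cache entry

    def node():
        if hold_lock_while_serializing:
            # Theory above: the reply is serialized while the lock is held,
            # so every other node waits for the whole slow step.
            with cache_lock:
                _value = cache["exists:some.props"]
                time.sleep(serialize_seconds)
        else:
            # Alternative: lock only around the read, serialize outside it.
            with cache_lock:
                _value = cache["exists:some.props"]
            time.sleep(serialize_seconds)

    threads = [threading.Thread(target=node) for _ in range(node_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    held = run_nodes(8, 0.05, hold_lock_while_serializing=True)
    released = run_nodes(8, 0.05, hold_lock_while_serializing=False)
    print(f"lock held during serialization: {held:.3f}s")
    print(f"lock released before serialization: {released:.3f}s")
```

With the lock held, the total time grows with the number of nodes, which is the same shape as the slowdown seen in case 2. Whether MSBuild actually behaves this way is still only a theory.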