Cache assemblies and wasm using content hashes #18859

SteveSandersonMS · 2020-02-06T18:37:34Z

This is a prerequisite for the PWA work, but also benefits people who aren't building a PWA, since it means the client no longer has to make If-Modified-Since requests for all the assemblies and the wasm file. It also means that badly-configured servers (not returning etags) won't result in bad perf.

It's an unusually complicated change because:

- I had to reorganize a lot of the logic for loading assemblies to factor parts out of MonoPlatform.ts
- I had to add some extra logic for switching between "streaming" and "arraybuffer" wasm compilation modes, since we can no longer inherit that from emscripten (reasons in comments)
- Some reordering was needed in the MSBuild targets, because we now need to collect info about the wasm files earlier so we can include their content hashes in blazor.boot.json
- The caching logic has to be consumable in two different modes:
  - Evaluating the whole network response and validating the hash before completing the promise (for loading .NET assemblies/pdbs)
  - Returning the network response before it's fully loaded, and then validating the hash and caching as a separate background task (for loading wasm binaries, because streaming compilation only makes sense if you start it immediately and not after waiting for the response data to fully arrive)
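
The two consumption modes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the loader from this PR: `ResponseLike`, `LoaderDeps`, `loadFully`, and `loadStreaming` are hypothetical names, and the hash check and cache write are caller-supplied callbacks rather than real APIs.

```typescript
// Hypothetical minimal shapes, so the sketch does not depend on browser globals.
interface ResponseLike {
  clone(): ResponseLike;
  arrayBuffer(): Promise<ArrayBuffer>;
}

interface LoaderDeps {
  fetchFn: (url: string) => Promise<ResponseLike>;
  verify: (data: ArrayBuffer, expectedHash: string) => Promise<boolean>; // assumed hash check
  store: (url: string, data: ArrayBuffer) => Promise<void>; // assumed cache write
}

// Mode 1: read the whole body and validate it before the promise completes.
// Suits .NET assemblies and pdbs, which are consumed as complete buffers.
async function loadFully(deps: LoaderDeps, url: string, hash: string): Promise<ArrayBuffer> {
  const response = await deps.fetchFn(url);
  const data = await response.arrayBuffer();
  if (!(await deps.verify(data, hash))) {
    throw new Error(`Content hash mismatch for ${url}`);
  }
  await deps.store(url, data);
  return data;
}

// Mode 2: return the in-flight response at once so the caller can begin
// streaming compilation, then validate and cache a clone in the background.
function loadStreaming(deps: LoaderDeps, url: string, hash: string) {
  const response = deps.fetchFn(url);
  const cached: Promise<boolean> = response
    .then(r => r.clone().arrayBuffer())
    .then(async data => {
      if (!(await deps.verify(data, hash))) {
        return false; // never cache bytes that failed validation
      }
      await deps.store(url, data);
      return true;
    });
  return { response, cached };
}
```

In this sketch the clone is taken in the first `then` callback registered on the fetch promise, so it runs before any callback the caller attaches later and therefore before the caller starts reading the body.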

Some not-strictly-required improvements I also made along the way:

- Changed GenerateBlazorBootJson so it uses S.T.J rather than DataContractJsonSerializer
- Fixed what seems to be a bug in the MSBuild targets: unnecessary logic that was double-writing pdb files to the output

The build will fail because I haven't yet updated the .js binaries and I think some tests will need to be updated too.

As for testing this in general, I plan to write CTI tests. I considered trying to write E2E tests but:

- It would involve getting Selenium to report on the network traffic so we could observe whether things were retrieved from cache, which is not something we do elsewhere. It may be achievable, but:
- Concurrent test execution would break things, because the cache is shared across all browser instances and tabs. To deal with this, we'd have to launch the browser on separate user profile directories for each test, but:
  - That's whole new infrastructure I don't want to build
  - Even if we did create that, it's not likely to be portable across browsers, e.g., when these tests run against actual devices. And it would make for slow tests, creating new user profiles for each.

However I think it will be possible to write pretty comprehensive CTI tests as the tester will be able to check the browser cache contents using dev tools, and verify things look right as far as console log output etc.

Simulated stats

According to the browser dev tools, here's the effect on the time taken to fetch all the resources needed to load the page. This is based on a warm HTTP cache (so static resources such as .dlls are returning 304s either way):

| Scenario | Without this change | With this change |
| --- | --- | --- |
| Slow 3G | 13.5s | 4.1s |
| Fast 3G | 5.1s | 2.4s |
| Localhost (artificial scenario) | 0.6s | 0.1s |

The difference is simply due to eliminating 30+ HTTP requests that all would have returned 304s.
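
For a quick sense of scale, the arithmetic on the simulated figures can be worked through as below. The numbers are copied from the table; the `Row` type and the `summarize` helper are hypothetical names used only for this illustration.

```typescript
// Fetch times in seconds, copied from the simulated-stats table.
interface Row {
  scenario: string;
  before: number;
  after: number;
}

const simulated: Row[] = [
  { scenario: "Slow 3G", before: 13.5, after: 4.1 },
  { scenario: "Fast 3G", before: 5.1, after: 2.4 },
  { scenario: "Localhost", before: 0.6, after: 0.1 },
];

// Round to one decimal place, matching the precision of the table.
const round1 = (x: number): number => Math.round(x * 10) / 10;

function summarize(row: Row) {
  return {
    scenario: row.scenario,
    savedSeconds: round1(row.before - row.after),
    speedup: round1(row.before / row.after),
  };
}

for (const s of simulated.map(summarize)) {
  console.log(`${s.scenario}: saves ${s.savedSeconds}s, ${s.speedup}x faster`);
}
// Slow 3G: saves 9.4s, 3.3x faster
// Fast 3G: saves 2.7s, 2.1x faster
// Localhost: saves 0.5s, 6x faster
```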

Real-world stats

I've been trying to come up with an argument that we won't actually need to do this at all, and could rely on doing it all in a service worker instead. Unfortunately the data isn't agreeing with me. I deployed to GitHub pages and accessed it over a reasonable quality 4G connection. After ~20 trials both with and without this feature, measured by "finish" time displayed in browser dev tools:

| Percentile | Without this change | With this change |
| --- | --- | --- |
| 25th centile | 1.57s | 0.48s |
| Median | 1.68s | 0.71s |
| 75th centile | 2.02s | 0.91s |

rynowak · 2020-02-06T20:35:32Z

src/Components/Blazor/Build/src/targets/Blazor.MonoRuntime.targets

+      <_BlazorBootResource Include="@(BlazorOutputWithTargetPath->WithMetadataValue('BlazorRuntimeFile', 'true'))" />
+      <_BlazorBootResource BootResourceType="assembly" Condition="'%(Extension)' == '.dll'" />
+      <_BlazorBootResource BootResourceType="pdb" Condition="'%(Extension)' == '.pdb'" />
+      <_BlazorBootResource BootResourceType="wasm" Condition="'%(Extension)' == '.wasm'" />


Should we be concerned about files that don't match one of these extension types? Are the non-matching files all of the other static assets that don't belong in boot.json?

The ones without the BlazorRuntimeFile metadata are the other static assets that don't belong in boot.json.

Additionally, we're excluding other extensions from boot.json by not assigning them any BootResourceType. This eliminates the AssemblyName.xml files that would otherwise sometimes appear.

Why do we think it is important to differentiate by dll, pdbs and wasm? I think having the Condition piece should be good enough.

The logic about how to distinguish the file types has to exist somewhere. I agree it could be done on the .NET side too but that seems neither better nor worse. If you can anticipate some problem with doing it on the MSBuild side I'm fine with changing it, but otherwise I suspect it's a neutral choice and we're just as well leaving it as-is. It's an internal implementation detail we can change later if we want.
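
For readers less familiar with MSBuild item conditions, the classification discussed in this thread can be sketched in TypeScript. This mirrors the three extension conditions from the targets snippet but is not the actual implementation: `bootResourceType` and `groupBootResources` are hypothetical names, the grouped shape is an assumption rather than the real blazor.boot.json format, and lower-casing the extension is a choice made for this sketch.

```typescript
type BootResourceType = "assembly" | "pdb" | "wasm";

// Mirrors the three extension conditions in the targets snippet above.
function bootResourceType(fileName: string): BootResourceType | undefined {
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  switch (extension) {
    case ".dll":
      return "assembly";
    case ".pdb":
      return "pdb";
    case ".wasm":
      return "wasm";
    default:
      return undefined; // e.g. AssemblyName.xml gets no type
  }
}

// Keep only files that received a type, grouped by that type.
function groupBootResources(fileNames: string[]): Record<BootResourceType, string[]> {
  const groups: Record<BootResourceType, string[]> = { assembly: [], pdb: [], wasm: [] };
  for (const name of fileNames) {
    const type = bootResourceType(name);
    if (type !== undefined) {
      groups[type].push(name);
    }
  }
  return groups;
}
```

Files with any other extension simply receive no type, which is how the .xml files drop out in this sketch.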

pranavkm

Looks pretty neat. The JSON thing as Ryan pointed out needs to be fixed. Is there also a way to add some tests for the loader and its caching?

SteveSandersonMS · 2020-02-11T11:45:31Z

Is there also a way to add some tests for the loader and its caching?

Please see the very long PR description for details :)

javiercn

Overall looks like a great set of changes!

High level feedback:

- I have a bunch of comments about the specific caching mechanics.
  - It's not clear to me if we are taking a cache-first approach or doing something different, but I can't discuss it on a PR effectively.
- I propose we change the hashing format with the idea of simplifying things on the JS side.

javiercn · 2020-02-11T11:34:10Z

src/Components/Web.JS/src/Platform/WebAssemblyResourceLoader.ts

+      // Bypass the entire cache flow, including checking hashes, etc.
+      // This gives developers an easy opt-out if they don't like anything about the default cache mechanism
+      const response = fetch(url, networkFetchOptions);
+      const data = response.then(r => r.clone().arrayBuffer());


Why do we clone the response data here?

Added comment:

// Cloning represents the response as two separate streams so we can process it twice independently
// (once for hash checking, once for streaming compilation). Without this, there would be a read error.

Yes, sorry. I should have been clearer. We are returning the response here, so we shouldn't need to clone it, as we are not going to cache it. Am I missing something obvious here?

We're returning the response so it can be read externally, and a promise that resolves with its data. These two separate actions can't be done on a single response stream, hence the need to clone it.

However when I go back to this and make use of fetch(..., { integrity: ... }), it might be that quite a bit of this complexity goes away. There might be no need to be reading the response data separately. I'll find out when I try it :)

It has got much simpler now I've changed to use integrity.

There is still one place where the response has to be cloned: before we put it into the cache. This is to be expected since we need to read the body (for caching) separately from returning it upstream. The "return from cache" code path, however, no longer has to clone.
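
The remaining clone-before-cache step can be sketched as follows. `ResponseLike`, `CacheLike`, and `cacheAndReturn` are hypothetical names for illustration, not code from this PR; in a browser, the real `Response.clone()` and `Cache.put()` play these roles.

```typescript
// Hypothetical minimal shapes standing in for the browser's Response and Cache.
interface ResponseLike {
  clone(): ResponseLike;
}

interface CacheLike {
  put(key: string, response: ResponseLike): Promise<void>;
}

function cacheAndReturn(cache: CacheLike, key: string, response: ResponseLike): ResponseLike {
  // Clone first, before any consumer starts reading the body.
  const copyForCache = response.clone();
  // Caching is best-effort in this sketch: a failed write must not break the load.
  cache.put(key, copyForCache).catch(() => undefined);
  // The untouched original goes upstream to whoever asked for the resource.
  return response;
}
```

A response served from the cache needs no such treatment, since nothing else has to read its body.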

javiercn · 2020-02-11T13:23:38Z

src/Components/Blazor/Build/src/targets/Blazor.MonoRuntime.targets

+      <_BlazorPackageContentOutput Include="@(BlazorPackageContentFile)" Condition="%(SourcePackage) != ''">
+        <TargetOutputPath>$(BaseBlazorPackageContentOutputPath)%(SourcePackage)\%(RecursiveDir)\%(Filename)%(Extension)</TargetOutputPath>
+      </_BlazorPackageContentOutput>
+      <BlazorOutputWithTargetPath Include="@(_BlazorPackageContentOutput)" />
    </ItemGroup>


How do we debug Release builds? Is that a thing we want to support? I can debug regular .net apps in Release configuration.

Not sure how the question relates to this PR. Is it a general question about whether pdb files should be emitted and loaded in release builds? If so, a totally reasonable question but let's find a separate channel for discussing it, as this PR doesn't change anything about that.

https://github.com/dotnet/aspnetcore/pull/18859/files#diff-c4620899f2dac172f7b494a8eb643b11R344-R355

Wouldn't we expose the pdbs unconditionally before?

I don't really follow the link you've posted, since the code on those lines doesn't seem to say anything about whether pdbs are exposed.

However, there was an issue with controlling whether PDBs are loaded: #18655. I've now amended Blazor.MonoRuntime.targets to fix this. Specifically, I changed how we filter out PDBs from the output so that it's controlled by BlazorEnableDebugging in the same way whether linking is enabled or disabled.

pranavkm

Once @javiercn is happy. Re-iterating my previous question - is there a way to add tests for the resource loader?

pranavkm · 2020-02-11T17:35:51Z

src/Components/Blazor/Build/src/targets/Blazor.MonoRuntime.targets

+          Remove assemblies that are inputs to calculating the assembly closure. Instead use the resolved outputs, since it is the minimal set.
+         -->
+      <_BlazorCopyLocalPaths Include="@(ReferenceCopyLocalPaths)" Condition="'%(Extension)' == '.dll'" />
+      <_BlazorCopyLocalPaths Remove="@(_BlazorManagedRuntimeAssemby)" Condition="'%(Extension)' == '.dll'" />


Suggested change

- <_BlazorCopyLocalPaths Remove="@(_BlazorManagedRuntimeAssemby)" Condition="'%(Extension)' == '.dll'" />
+ <_BlazorCopyLocalPaths Remove="@(_BlazorManagedRuntimeAssemby)" />

This condition is needed as a workaround for #18951. We might find a better solution later, but that's separate from this PR.

SteveSandersonMS · 2020-02-12T23:34:49Z

I just realised the manual hash checking is completely unnecessary. We can pass { integrity: '<hash>' } as an option to fetch and it's all done for us. This will simplify several aspects of the flow :)
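
A minimal sketch of what delegating the check to `fetch` could look like. The `integrity` option is part of the standard fetch API and takes Subresource Integrity metadata of the form `<algorithm>-<base64 digest>`; everything else here (`FetchLike`, `toIntegrity`, `fetchVerified`) is a hypothetical name for illustration, and fetch is injected so the sketch does not depend on browser globals.

```typescript
// Hypothetical minimal fetch shape; in a browser, pass (url, init) => fetch(url, init).
interface IntegrityInit {
  integrity?: string;
}

interface BodyLike {
  arrayBuffer(): Promise<ArrayBuffer>;
}

type FetchLike = (url: string, init: IntegrityInit) => Promise<BodyLike>;

// Subresource Integrity metadata has the form "<algorithm>-<base64 digest>".
function toIntegrity(algorithm: "sha256" | "sha384" | "sha512", base64Digest: string): string {
  return `${algorithm}-${base64Digest}`;
}

async function fetchVerified(fetchFn: FetchLike, url: string, integrity: string): Promise<ArrayBuffer> {
  // With `integrity` set, the browser checks the digest itself and rejects
  // the fetch promise on a mismatch, so no manual hash comparison is needed.
  const response = await fetchFn(url, { integrity });
  return response.arrayBuffer();
}
```

Because a mismatch surfaces as a rejected promise, the caller can treat it like any other failed network request.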

javiercn · 2020-02-13T09:51:59Z

I just realised the manual hash checking is completely unnecessary. We can pass { integrity: '<hash>' } as an option to fetch and it's all done for us. This will simplify several aspects of the flow :)

I was super excited about this, but I'm sad to be the bearer of bad news :(.

It is not supported in Safari looks like. See here

SteveSandersonMS · 2020-02-13T14:53:50Z

It is not supported in Safari looks like.

Good news :) It totally does work, at least in the use case we have here. I verified it on Safari iOS both in the success and failure cases for integrity checking. Caniuse has outdated info for this.

From Safari's release notes, it looks like this has been supported since at least release 65, which was ages ago.

SteveSandersonMS requested review from pranavkm and javiercn February 6, 2020 18:37

ghost added the area-blazor Includes: Blazor, Razor Components label Feb 6, 2020

SteveSandersonMS commented Feb 6, 2020

src/Components/Blazor/Build/src/targets/Blazor.MonoRuntime.targets (outdated)

rynowak reviewed Feb 6, 2020

src/Components/Blazor/Build/src/Tasks/GenerateBlazorBootJson.cs (outdated)

SteveSandersonMS commented Feb 6, 2020

src/Components/Web.JS/src/Platform/Mono/MonoPlatform.ts (outdated)

rynowak reviewed Feb 6, 2020

SteveSandersonMS mentioned this pull request Feb 7, 2020

PWA template #18878

Merged

pranavkm mentioned this pull request Feb 7, 2020

pdb files are published in Blazor WASM template project with Release configuration #18655

Closed

pranavkm reviewed Feb 7, 2020

SteveSandersonMS force-pushed the stevesa/content-hash-caching branch from 2ef6130 to c60982f Compare February 11, 2020 09:26

javiercn reviewed Feb 11, 2020

pranavkm approved these changes Feb 11, 2020

pranavkm added this to the blazor-wasm-3.2-preview2 milestone Feb 11, 2020

SteveSandersonMS added 12 commits February 17, 2020 09:21

Change the format of blazor.boot.json to include content hashes of resources

4791469

Start factoring out the loading logic into a separate class

bca2a4b

Load using 'fetch' in WebAssemblyResourceLoader

02137de

Log resource stats

a1121e9

Better logging

0ebda64

Validate content hashes

bbc1024

Track and display transferred data size

ba98132

Purge unused cache entries

457fab5

Reduce logging in release builds

2521224

Load .wasm via resource loader too

e32d9ff

Have a separate cache for each base URI

832a5ee

Cleanup

706268a

SteveSandersonMS added 9 commits February 17, 2020 09:22

Attempt to stop breaking test about satellite assemblies

faed3a4

CR

781caef

CR comment

777d91e

CR: Typo

cf9018a

CR: Comment

ec8206a

CR: Comment

85c42bb

CR: no-cache for blazor.boot.json

203f1a7

CR: Use base64 hashes

a28a257

Corresponding test update

62fd199

SteveSandersonMS force-pushed the stevesa/content-hash-caching branch from d664f6e to 62fd199 Compare February 17, 2020 09:23

SteveSandersonMS and others added 14 commits February 17, 2020 10:09

Use fetch 'integrity' option instead of manual hash validation

a35d1e8

Clarifying comments

9f58b70

Better error reporting if wasm fetch fails

9c8e2ae

Tidy

323843c

CR: Remove obsolete SWA handling logic

6cabf3b

Remove obsolete code. Fix debugging detection.

63360d1

Clean up logic for determining whether to output and load pdbs

fe3476c

Update src/Components/Blazor/Build/src/Tasks/GenerateBlazorBootJson.cs

7337f88

Co-Authored-By: Pranav K

Fix suggestion typo

ddf1a45

CR: Rename type

5346ee8

CR: Rename

888e93b

Update test

7613e67

Fix flaky behavior with deleting profile dirs in E2E tests

01bc2a3

E2E tests for caching

3a0e839

javiercn approved these changes Feb 17, 2020

Remove redundant test code

e5a3cf4

SteveSandersonMS merged commit 4628dfb into blazor-wasm Feb 17, 2020

SteveSandersonMS deleted the stevesa/content-hash-caching branch February 17, 2020 17:17

SteveSandersonMS mentioned this pull request Feb 19, 2020

[Blazor] Published app results in 404s for pdb files #18545

Closed

SwiftMJ mentioned this pull request Mar 19, 2022

Published Hosted Blazor WASM does not contain PDB files. #40787

Closed
