Add HPack dynamic compression #20058

Merged
merged 14 commits into from
Apr 1, 2020

Conversation

JamesNK
Member

@JamesNK JamesNK commented Mar 22, 2020

Fixes #4715

TODO

  • Use settings frame to set max table size and max list size
  • Perf test
  • Allow list/deny list headers to avoid table churn?
  • Make this implementation shareable with dotnet/runtime? Disclaimer: I've refactored the big ASP.NET Core dependency (the KnownHeaders enum) out of the encoder and encapsulated it in Http2HeadersEnumerator. I'm leaving the code in ASP.NET Core. If the client team decides they want to add the feature in the future, we can collaborate on any additional changes it requires then.

@JamesNK JamesNK requested a review from a team March 22, 2020 10:18
@JamesNK JamesNK force-pushed the jamesnk/dynamic-hpack branch from 9cdacb1 to fac451e Compare March 22, 2020 10:44
Member

@Tratcher Tratcher left a comment

So you went with encode-everything. I'm curious to see the perf impact.

@JamesNK
Member Author

JamesNK commented Mar 22, 2020

@Tratcher Question for you about SETTINGS_HEADER_TABLE_SIZE.

If the client sends a settings frame that updates the header table size, should the server issue a dynamic table size update? https://tools.ietf.org/html/rfc7541#section-6.3

I'm not sure whether the dynamic table size update is always sent, or if it exists for the situation when the server decides to change the size without the client sending SETTINGS_HEADER_TABLE_SIZE.

@JamesNK
Member Author

JamesNK commented Mar 23, 2020

RPS Before:

|      Method | HeadersCount | HeadersChange |     Mean |    Error |   StdDev |     Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------ |------------- |-------------- |---------:|---------:|---------:|---------:|------:|------:|------:|----------:|
| MakeRequest |            1 |         False | 33.67 us | 0.649 us | 0.773 us | 29,701.3 |     - |     - |     - |     484 B |
| MakeRequest |            1 |          True | 34.09 us | 0.662 us | 1.010 us | 29,332.2 |     - |     - |     - |     486 B |
| MakeRequest |            4 |         False | 32.54 us | 0.645 us | 0.768 us | 30,735.5 |     - |     - |     - |     487 B |
| MakeRequest |            4 |          True | 33.43 us | 0.751 us | 0.771 us | 29,916.2 |     - |     - |     - |     481 B |
| MakeRequest |           32 |         False | 41.24 us | 1.621 us | 2.476 us | 24,247.6 |     - |     - |     - |     487 B |
| MakeRequest |           32 |          True | 42.44 us | 0.816 us | 0.939 us | 23,560.2 |     - |     - |     - |     487 B |

RPS After:

|      Method | HeadersCount | HeadersChange |     Mean |    Error |   StdDev |     Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------ |------------- |-------------- |---------:|---------:|---------:|---------:|------:|------:|------:|----------:|
| MakeRequest |            1 |         False | 33.12 us | 0.655 us | 0.897 us | 30,188.7 |     - |     - |     - |     485 B |
| MakeRequest |            1 |          True | 33.20 us | 0.648 us | 0.772 us | 30,119.1 |     - |     - |     - |     485 B |
| MakeRequest |            4 |         False | 34.03 us | 0.738 us | 0.690 us | 29,382.6 |     - |     - |     - |     487 B |
| MakeRequest |            4 |          True | 35.24 us | 0.699 us | 0.833 us | 28,373.4 |     - |     - |     - |     486 B |
| MakeRequest |           32 |         False | 40.35 us | 0.782 us | 0.960 us | 24,784.1 |     - |     - |     - |     487 B |
| MakeRequest |           32 |          True | 44.65 us | 0.486 us | 0.406 us | 22,397.6 |     - |     - |     - |     487 B |

Dynamic compression is slightly slower when headers continuously change, and slightly faster when headers are static.

Size:

This shows the change in response header size when all headers are reused on the second request. If the headers differ, the size is unchanged from the first request.

1 header first request: 104 bytes
1 header second request: 5 bytes

4 headers first request: 284 bytes
4 headers second request: 8 bytes

32 headers first request: 1986 bytes
32 headers second request: 36 bytes
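
To put those sizes in relative terms, the reduction can be computed directly from the figures above. This is a small Python sketch; the dictionary layout and the function name are illustrative only and are not part of the PR.

```python
# (first request bytes, second request bytes), copied from the measurements above
sizes = {1: (104, 5), 4: (284, 8), 32: (1986, 36)}


def reduction_percent(first, second):
    """Percentage by which the second response's headers shrink."""
    return (first - second) / first * 100


for count, (first, second) in sizes.items():
    print(f"{count} header(s): {first} -> {second} bytes "
          f"({reduction_percent(first, second):.1f}% smaller)")
```

For these three cases the reduction works out to roughly 95%, 97% and 98%.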

@Tratcher
Member

These benchmarks are underwhelming so far: minimal change to RPS and no change to allocations. We're going to need more numbers to show this is worth doing. These are the metrics I'd expect to improve with this change:

  • Total bytes on the wire reduced
  • TechEmpower RPS. Those scenarios always send the same response headers (though not very many?), so they should show the best-case scenario.
  • RPS in a bandwidth-limited scenario rather than a CPU-limited one like we usually test

@JamesNK
Member Author

JamesNK commented Mar 24, 2020

> Total bytes on the wire reduced

This will be the biggest improvement. You can see the significant decrease in HEADERS frame size in some of the unit tests where multiple calls are made. I'll get a byte breakdown from the benchmark.

I think the benchmarks so far mostly show that the impact HPack encoding has on server CPU usage is minimal in either direction. In the best case it improves (and IMO this will probably be the typical case), and in the worst case, where every header is different, it is only slightly slower.

@JamesNK
Member Author

JamesNK commented Mar 24, 2020

Benchmark response sizes added - #20058 (comment)

@Tratcher
Member

Tratcher commented Mar 24, 2020

> @Tratcher Question for you about SETTINGS_HEADER_TABLE_SIZE.
>
> If the client sends a settings frame that updates the header table size, should the server issue a dynamic table size update? https://tools.ietf.org/html/rfc7541#section-6.3

Consider the difference between the table size limit set by the client and the actual table size being maintained by the server. The server's actual table size can always be smaller than the client's limit.

So yes, the server has to actually acknowledge that it is changing the table size (if it wants to do so after a limit change, or for any other reason).

> I'm not sure whether the dynamic table size update is always sent, or if it exists for the situation when the server decides to change the size without the client sending SETTINGS_HEADER_TABLE_SIZE.

This can happen in both directions. The client could choose to raise the table size limit beyond what the server wants to utilize, in which case the server will not change the table size to match the new limit. Similarly the server could choose to lower the table size for some other reason (e.g. low traffic) without the client having changed the limit.

https://tools.ietf.org/html/rfc7541#section-4.2

> A change in the maximum size of the dynamic table is signaled via a
> dynamic table size update (see Section 6.3). This dynamic table size
> update MUST occur at the beginning of the first header block
> following the change to the dynamic table size. In HTTP/2, this
> follows a settings acknowledgment (see Section 6.5.3 of [HTTP2]).

This also answers the ACK ordering question. ACK first.
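
As an illustration of the exchange above, here is a minimal Python sketch of how an encoder could pick its actual table size and emit the dynamic table size update from RFC 7541 section 6.3 (bit pattern 001 followed by an integer with a 5-bit prefix, coded as in section 5.1). The function names are hypothetical and this is not Kestrel's code; it only shows the wire format and the "never exceed the client's limit" rule.

```python
def encode_integer(value, prefix_bits, first_byte_flags=0):
    """Prefix-coded integer as described in RFC 7541 section 5.1."""
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        # The value fits in the prefix bits of the first byte.
        return bytes([first_byte_flags | value])
    out = bytearray([first_byte_flags | max_prefix])
    value -= max_prefix
    while value >= 128:
        # Emit 7 bits at a time, setting the continuation bit.
        out.append((value % 128) | 0x80)
        value //= 128
    out.append(value)
    return bytes(out)


def choose_table_size(client_limit, server_preferred):
    """Hypothetical policy: the encoder may use less than the peer's limit, never more."""
    return min(client_limit, server_preferred)


def encode_table_size_update(new_size):
    """Dynamic table size update: pattern 001 plus a 5-bit prefix integer."""
    return encode_integer(new_size, 5, 0x20)


# The client raises its limit to 64 KiB, but this server only wants 4 KiB.
size = choose_table_size(client_limit=65536, server_preferred=4096)
print(size, encode_table_size_update(size).hex())
```

Per the section 4.2 text quoted above, these bytes would have to be written at the start of the first header block after the size change, which in HTTP/2 comes after the settings acknowledgment.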

@halter73
Member

@scalablecory Is this something HttpClient might use?

@scalablecory
Contributor

> @scalablecory Is this something HttpClient might use?

We prototyped this and weren't happy with the performance and implementation complexity tradeoffs. One scenario we looked at was the gRPC client.

Perhaps response headers are a different story, though. Or, maybe this implementation will be faster than what we came up with and it's worth revisiting. @JamesNK if you want to take a shot at it and can demonstrate good benefit, I'd be happy to look at a PR.

@JamesNK
Member Author

JamesNK commented Mar 25, 2020

Do you still have the client prototype code? I know you looked at Huffman but I wasn't aware you investigated static/dynamic compression.

I'm surprised you found static compression not worth it. It was a pretty quick win to implement in Kestrel.

@scalablecory
Contributor

scalablecory commented Mar 25, 2020

> Do you still have the client prototype code? I know you looked at Huffman but I wasn't aware you investigated static/dynamic compression.

I think it's at https://github.com/aik-jahoda/corefx/tree/jahoda/dynamicheaders

> I'm surprised you found static compression not worth it. It was a pretty quick win to implement in Kestrel.

We haven't looked at pure static compression; I think that could be an easy win. We looked at it combined with dynamic compression. CC @aik-jahoda

We have a concept of "known headers" already; it would not be hard to augment it with known values (we kind of have these already, but only use them for string interning, not for the protocol).

@JamesNK
Member Author

JamesNK commented Mar 25, 2020

Ah, you tried to reuse the decoder dynamic table. HPack implementations I looked at used a separate data structure for the encoder dynamic table and the decoder dynamic table. They store the same thing, but the way you search them and the output you want from them is different.

Encoder dynamic tables typically combine a hash table (keyed on header name) for fast lookups with a doubly linked list to track age and perform FIFO evictions.

The encoder dynamic table I've written does that, and is also based around strings rather than bytes because string values are what we build up in the headers collection, and what we want to compare against when matching.
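
The structure described above can be sketched in Python as follows. This is an illustrative sketch only, not the PR's C# implementation: the class and method names are hypothetical, a `collections.deque` stands in for the doubly linked list, and header names and values are assumed to be ASCII so that string length equals octet length. The 32-octet per-entry overhead and the eviction rule come from RFC 7541 sections 4.1 and 4.4.

```python
from collections import deque

STATIC_TABLE_LENGTH = 61  # entries in the static table (RFC 7541 Appendix A)
ENTRY_OVERHEAD = 32       # per-entry overhead in octets (RFC 7541 section 4.1)


class EncoderDynamicTable:
    def __init__(self, max_size):
        self.max_size = max_size
        self.size = 0
        self._entries = deque()  # (seq, name, value), oldest on the left
        self._by_name = {}       # name -> deque of (seq, value), oldest on the left
        self._next_seq = 0

    def _evict_oldest(self):
        _, name, value = self._entries.popleft()
        bucket = self._by_name[name]
        bucket.popleft()  # global FIFO order implies per-name FIFO order
        if not bucket:
            del self._by_name[name]
        self.size -= len(name) + len(value) + ENTRY_OVERHEAD

    def add(self, name, value):
        entry_size = len(name) + len(value) + ENTRY_OVERHEAD
        while self._entries and self.size + entry_size > self.max_size:
            self._evict_oldest()
        if entry_size > self.max_size:
            return  # oversized entry: the table ends up empty and nothing is added
        self._entries.append((self._next_seq, name, value))
        self._by_name.setdefault(name, deque()).append((self._next_seq, value))
        self._next_seq += 1
        self.size += entry_size

    def _index(self, seq):
        # The newest entry gets the first index after the static table.
        return STATIC_TABLE_LENGTH + (self._next_seq - seq)

    def find(self, name, value):
        """Return (index, value_matched); index is None when the name is unknown."""
        bucket = self._by_name.get(name)
        if not bucket:
            return None, False
        for seq, candidate in reversed(bucket):  # prefer the newest match
            if candidate == value:
                return self._index(seq), True
        return self._index(bucket[-1][0]), False


table = EncoderDynamicTable(max_size=100)
table.add("x-a", "1")
table.add("x-b", "2")
print(table.find("x-a", "1"))  # full match on the older entry
print(table.find("x-b", "9"))  # name-only match on the newer entry
```

A full match lets the encoder emit an indexed header field; a name-only match lets it reference the name by index and send the value as a literal.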

If you're interested in reusing this code then I think I may be able to remove the dependencies on ASP.NET Core types. At the moment this PR uses ASP.NET's KnownHeaderType throughout the encoder, but I have an idea for removing that dependency.

@JamesNK JamesNK force-pushed the jamesnk/dynamic-hpack branch from 448450d to 72dc315 Compare March 25, 2020 04:45
@JamesNK
Member Author

JamesNK commented Mar 25, 2020

Do we want to validate SETTINGS_MAX_HEADER_LIST_SIZE on the server?

https://tools.ietf.org/html/rfc7540#section-6.5.2

Currently Kestrel doesn't do anything with the SETTINGS_MAX_HEADER_LIST_SIZE sent by the client. Kestrel could validate that the headers are under the setting's size and throw an exception if they exceed it.

I'd like to leave it for the future.

@Tratcher
Member

https://tools.ietf.org/html/rfc7540#section-10.5.1

Enforcement on our end is optional. We might start with a debug log. Sending the headers is the clearest way to indicate the actual error (headers too large) to the client; there's no other standard error available that would convey that.
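
For reference, RFC 7540 section 6.5.2 defines the header list size as the uncompressed length of each name and value in octets plus 32 octets of overhead per field. A minimal Python sketch of the "debug log only" approach suggested above could look like this; the function names and logger name are hypothetical, and nothing here reflects what Kestrel actually does.

```python
import logging

logger = logging.getLogger("hpack.sketch")  # hypothetical logger name
ENTRY_OVERHEAD = 32  # per-field overhead from RFC 7540 section 6.5.2


def header_list_size(headers):
    """Uncompressed size of a list of (name, value) string pairs, in octets."""
    return sum(len(name.encode()) + len(value.encode()) + ENTRY_OVERHEAD
               for name, value in headers)


def within_peer_limit(headers, peer_max_header_list_size):
    """Return False and log at debug level if the peer's advisory limit is exceeded.

    The caller still sends the headers either way; None means the peer sent no limit.
    """
    size = header_list_size(headers)
    if peer_max_header_list_size is not None and size > peer_max_header_list_size:
        logger.debug("Header list is %d octets; peer limit is %d",
                     size, peer_max_header_list_size)
        return False
    return True


response_headers = [(":status", "200"), ("content-type", "text/plain")]
print(header_list_size(response_headers))
```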

@JamesNK
Member Author

JamesNK commented Mar 25, 2020

Being optional is why I guessed we didn't support it. I'm not eager to add it right now so I'm going to leave it alone.

@JamesNK
Member Author

JamesNK commented Mar 25, 2020

Chrome's headers and HttpClient's dynamic table content after visiting different websites below.

Note that some headers aren't in the dynamic table because they've matched the static table, e.g. status, content-encoding
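
As a small illustration of why a fully matched header is so cheap on the wire: an indexed header field (RFC 7541 section 6.1) is a 1 bit followed by the table index in a 7-bit prefix, so any index below 127 fits in a single byte. In the sketch below the helper name is hypothetical; static index 8 is `:status: 200` in RFC 7541 Appendix A, and 62 is the first dynamic table index.

```python
def encode_indexed(index):
    """Indexed header field for indices that fit in the 7-bit prefix."""
    if not 1 <= index < 127:
        raise ValueError("this sketch only handles single-byte indices")
    return bytes([0x80 | index])


print(encode_indexed(8).hex())   # static table entry ":status: 200"
print(encode_indexed(62).hex())  # newest dynamic table entry
```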


https://blog.cloudflare.com/tools-for-debugging-testing-and-using-http-2/ - Includes all headers

[Two screenshots: Chrome's headers and HttpClient's dynamic table content]


https://dotnet.microsoft.com/ - Includes all headers (originally from Kestrel; is a reverse proxy (HttpSys?) adding HPack?)

[Two screenshots: Chrome's headers and HttpClient's dynamic table content]


Content after visiting https://www.google.com/ - Doesn't include Set-Cookie

[Two screenshots: Chrome's headers and HttpClient's dynamic table content]

@JamesNK JamesNK force-pushed the jamesnk/dynamic-hpack branch from ba2fc87 to b05955b Compare March 26, 2020 01:00
@JamesNK
Member Author

JamesNK commented Apr 1, 2020

@halter73 @Tratcher IMO this is done. Please review

@JamesNK JamesNK force-pushed the jamesnk/dynamic-hpack branch from 07bd63c to 2517f0e Compare April 1, 2020 20:52
@JamesNK JamesNK merged commit 0e4bcf6 into master Apr 1, 2020
@JamesNK JamesNK deleted the jamesnk/dynamic-hpack branch April 1, 2020 23:22
@pranavkm pranavkm removed the api-ready-for-review API is ready for formal API review - https://github.com/dotnet/apireviews label Mar 22, 2021
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 6, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Dec 8, 2023
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Development

Successfully merging this pull request may close these issues.

Implement HPack dynamic compression
8 participants