crypto/tls: TLS should dynamically tune record size for better HTTP performance #14376
Comments
Great! I have TODOs for investigating this but I didn't get it done in time for Go 1.6. I'm glad you're already on it. I'll be able to start reviewing & helping next week. (in India and mostly occupied for the next couple of days)
I overlooked this issue and left the following comment on CL 19591.
Unless I completely misunderstood, I think we're talking about different problems. This issue is about framing at the TLS level -- a poor choice of TLS-level framing can force a slow client to download large TLS records, which increases time-to-first-decrypted-byte. We can't solve this issue at the TCP level because TCP has no visibility into TLS record boundaries, so TCP_NOTSENT_LOWAT won't help. I'm not familiar with CoDel, but it also looks too low-level to solve this issue.
Thanks for the explanation. If I understand correctly, you are trying to trade the overhead of splitting records for better latency, to make mobile or high-latency TLS speakers happy, and not trying to utilize hierarchical buffers along a path composed of high-latency links. If so, automatic record size adjustment/clamping based on the TCP MSS (or path/link IP MTU) might make low-latency TLS speakers frown unless we provide a control knob.
Are you proposing adding a new knob to tls.Config, DisableDynamicRecordSizing? I'd be fine with that. In theory the dynamic sizing parameters could also be knobs -- RecordSizeIdleTimeout (default = 1s) and RecordSizeBoostThreshold (default = 1MB) -- though I don't see a strong use case.
I do think it's reasonable to provide these knobs. E.g. if you're building a download server that serves large payloads that are not stream-processed, then you can disable dynamic sizing and always output 16KB records to reduce framing + CPU overhead.
I was thinking that DisableDynamicRecordSizing=true would always use the largest possible records (16KB). This should cover the download server case. Do you know of a use case where you want to fine-tune either the 1s idle timeout or the 1MB boost threshold? AFAICT, Google servers have used these constants for multiple years without change, suggesting they are almost always good enough.
Ah, I guess that works. I know some other servers do expose explicit control over these, but I can't point to concrete examples of where and why the defaults are being overridden in production.
Yup, I think a new API is necessary for adapting to various circumstances. I'd leave the API design, including the best default behavior, to others for Go 1.7.
CL https://golang.org/cl/19591 mentions this issue.
Currently, if a client of crypto/tls (e.g., net/http, http2) calls tls.Conn.Write with a 33KB buffer, that ends up writing three TLS records: 16KB, 16KB, and 1KB. Slow clients (such as 2G phones) must download the first 16KB record before they can decrypt the first byte. To improve latency, it's better to send smaller TLS records. However, sending smaller records adds overhead (more overhead bytes and more crypto calls), which slightly hurts throughput. A simple heuristic, implemented in this change, is to send small records for new connections, then boost to large records after the first 1MB has been written on the connection. Fixes golang#14376 Change-Id: Ice0f6279325be6775aa55351809f88e07dd700cd Reviewed-on: https://go-review.googlesource.com/19591 TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Tom Bergan <[email protected]> Reviewed-by: Adam Langley <[email protected]>
Currently, the crypto/tls package sends all data records using a max payload size of 16KB, which is the maximum allowed by the TLS spec. This is optimal for throughput-sensitive applications, but not for latency-sensitive applications, such as web browsing, because the entire 16KB record must be downloaded and decrypted before any of the plaintext can be processed by the application layer. On slow 2G connections this can easily add 500-1000ms of latency to web page load time, since the browser cannot start fetching subresources until that first 16KB record has been fully decrypted.
A simple heuristic that works well in practice is to use small records for the first ~1MB of data, then switch to 16KB records for subsequent data, then switch back to smaller records after the connection goes idle. Ideally, the smaller records should fit exactly within one TCP packet.
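The heuristic above can be sketched as a record-size chooser. The constant names and the MSS estimate below are illustrative assumptions, not the actual crypto/tls internals; a real implementation would also reset the byte counter after the idle timeout:

```go
package main

import "fmt"

// Illustrative constants mirroring the heuristic described above.
const (
	maxPlaintext             = 16384   // TLS maximum record payload (16KB)
	recordSizeBoostThreshold = 1 << 20 // ~1MB: switch to large records after this
	tcpPayloadEstimate       = 1400    // rough payload that fits in one TCP packet
)

// nextRecordSize returns the payload size for the next TLS record, given
// how many plaintext bytes have been sent since the connection was opened
// (or since it last went idle, once the idle timeout resets the counter).
func nextRecordSize(bytesSent int64) int {
	if bytesSent >= recordSizeBoostThreshold {
		// Throughput mode: the client is clearly doing a bulk transfer,
		// so minimize framing and crypto overhead.
		return maxPlaintext
	}
	// Latency mode: keep each record small enough that, ideally, it fits
	// in a single TCP packet and can be decrypted as soon as it arrives.
	return tcpPayloadEstimate
}

func main() {
	fmt.Println(nextRecordSize(0))       // fresh connection: small record
	fmt.Println(nextRecordSize(2 << 20)) // after 2MB sent: full-size record
}
```

The trade-off is exactly the one debated in the comments above: small records cost extra per-record bytes and crypto calls, which is why the heuristic boosts to 16KB once the connection looks like a bulk transfer.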
For more background reading, see:
https://www.igvita.com/2013/10/24/optimizing-tls-record-size-and-buffering-latency/
http://chimera.labs.oreilly.com/books/1230000000545/ch04.html#TLS_RECORD_SIZE
https://issues.apache.org/jira/browse/TS-2503
I have an implementation sketch here:
https://go-review.googlesource.com/#/c/19591/
I can demonstrate the latency improvement in a simple test environment:
Without this change, I notice a ~500ms delay between the time that Chrome receives the first byte of the HTTP response and the time Chrome sends the first request for a subresource. With this change, that difference drops to ~10ms.
If the general approach looks good, I can mail the change for review. +@bradfitz and +@agl for TLS expertise.