-
Notifications
You must be signed in to change notification settings - Fork 132
Implement SSL_MODE_AUTO_RETRY #1333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement SSL_MODE_AUTO_RETRY #1333
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1333 +/- ##
==========================================
- Coverage 76.85% 76.84% -0.02%
==========================================
Files 425 425
Lines 71498 71502 +4
==========================================
- Hits 54948 54943 -5
- Misses 16550 16559 +9 ☔ View full report in Codecov by Sentry. |
9607aca
to
06f9cc4
Compare
82eae99
to
6065eed
Compare
This seemed relevant to the test failures I'm running into with Bind9, so I tried applying the change to see if it would resolve my issues. It results in the tests from Bind9 hanging forever though (the tests used to directly fail) :(. |
0ddfd60
to
6dd64f7
Compare
e335d43
to
3531382
Compare
3531382
to
5cbb984
Compare
ret = SSL_read(client.get(), buf, sizeof(buf)); | ||
EXPECT_EQ(ret, (int) sizeof(buf)); | ||
EXPECT_EQ(SSL_get_error(client.get(), ret), SSL_ERROR_NONE); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you tried to read another byte here what do you get?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we get SSL_ERROR_WANT_READ
from SSL_get_error
, but this time with a ret val of -1 (as opposed to 0 retval on ssl_test.cc L10502). SSL_ERROR_WANT_READ
is retryable, that doesn't seem right....
moving PR back to draft pending further investigation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I confirmed that even before this change, trying to read additional bytes after exhausting the rbio
resulted in SSL_ERROR_WANT_READ
. so, it looks like this change doesn't affect that specific behavior.
b5772a5
to
d3c39a9
Compare
d3c39a9
to
98736bd
Compare
6b4891e
to
6f4c696
Compare
@samuel40791765 -- from what I can tell, bind9 does not set SSL_MODE_AUTO_RETRY, so this change should not affect that integration. |
This change implements `SSL_MODE_AUTO_RETRY`, which was previously ignored. When this option (off by default) is set, `SSL_get_error`s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes as `SSL_ERROR_SYSCALL`, using it as a sort-of alias for EOF in the underlying transport. Internal calls to `SSL_get_error` (such as [those in `bio_ssl.cc`](https://github.com/aws/aws-lc/blob/main/ssl/bio_ssl.cc#L29)) then observe the `SSL_ERROR_SYSCALL` and fail to process potential retries. OpenSSL implements this mode differently, only checking its value in its [main read loop](https://github.com/openssl/openssl/blob/398011848468c7e8e481b295f7904afc30934217/ssl/record/rec_layer_s3.c#L992). Neither OpenSSL [1.1.1][6] nor [OpenSSL 3.x][10] return `SSL_ERROR_SYSCALL` early for empty reads, instead allowing them to be processed for retryable state. BoringSSL [appears to have diverged][7] from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their [previous implementation][18] closely resembles [that of OpenSSL 1.0.2][13], guarding retryable error processing with `i < 0` checks. `SSL_get_error` in [OpenSSL 1.1.1][12] and [OpenSSL 3.0][11], however, have no such guards. This change resembles the latter two. Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (`SSL_read`, `SSL_write`, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, their `size_t`-clean counterparts (`SSL_read_ex`, `SSL_write_ex`, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various `_ex` functions to `SSL_get_error` loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families. Another solution to the `_ex` functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls to `SSL_read` or `SSL_write`. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling. - [OpenSSL 1.0.2 documentation for `SSL_get_error`][3] - [OpenSSL 1.1.1 documentation for `SSL_get_error`][4] - [OpenSSL 3.0 documentation for `SSL_get_error`][5] - [OpenSSL 1.0.2 implementation for `SSL_get_error`][13] - [OpenSSL 1.1.1 implementation for `SSL_get_error`][12] - [OpenSSL 3.0 implementation for `SSL_get_error`][11] - [OpenSSL master (3.2) implementation for `SSL_get_error`][10] - [OpenSSL 1.1.1 commit dropping `ret < 0` guards for hint processing][6] - [Unresolved OpenSSL issue for distinguishing EOF from failure in `BIO_read_ex`][1] - [BoringSSL commit introducing `SSL_ERROR_SYSCALL` for `ret == 0` case][7] - [BoringSSL Issue 503: Stop preserving arbitrary BIO error return values][2] - [BoringSSL Issue 24: Sanitize SSL and BIO return codes.][15] - [OpenSSL 3.0 commit returning `SSL_ERROR_SSL`, add reason to err stack on EOF][14] - [OpenSSL 3.0 commit introducing `SSL_OP_IGNORE_UNEXPECTED_EOF` option][8] - [CPython commit from @davidben removing CPython's use of `SSL_OP_IGNORE_UNEXPECTED_EOF`][9] - [CPython commit switching from `SSL_read` and `SSL_write` to `SSL_read_ex` and `SSL_write_ex`][16] - [OpenSSL 1.1.1 documentation for `SSL_MODE_AUTO_RETRY`][17] [1]: openssl/openssl#8208 [2]: https://bugs.chromium.org/p/boringssl/issues/detail?id=503 [3]: https://www.openssl.org/docs/man1.0.2/man3/SSL_get_error.html [4]: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html [5]: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html [6]: openssl/openssl@8051ab2 [7]: google/boringssl@9a38e92 [8]: openssl/openssl@09b90e0 [9]: python/cpython#95495 [10]: https://github.com/openssl/openssl/blob/master/ssl/ssl_lib.c#L4601 [11]: https://github.com/openssl/openssl/blob/openssl-3.0/ssl/ssl_lib.c#L3839 [12]: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/ssl/ssl_lib.c#L3617 [13]: https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/ssl_lib.c#L2709 [14]: openssl/openssl@d924dbf [15]: https://bugs.chromium.org/p/boringssl/issues/detail?id=24 [16]: python/cpython@89d1550 [17]: https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_mode.html [18]: google/boringssl@fcf2583#diff-38e1eecb6013c2fe8bf194e5020ce1afd6e31c371d7849e041464ad822fe745bL2271
# Notes This change implements `SSL_MODE_AUTO_RETRY`, which was previously ignored. When this option (off by default) is set, `SSL_get_error`s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes as `SSL_ERROR_SYSCALL`, using it as a sort-of alias for EOF in the underlying transport. Internal calls to `SSL_get_error` (such as [those in `bio_ssl.cc`](https://github.com/aws/aws-lc/blob/main/ssl/bio_ssl.cc#L29)) then observe the `SSL_ERROR_SYSCALL` and fail to process potential retries. OpenSSL implements this mode differently, only checking its value in its [main read loop](https://github.com/openssl/openssl/blob/398011848468c7e8e481b295f7904afc30934217/ssl/record/rec_layer_s3.c#L992). Neither OpenSSL [1.1.1][6] nor [OpenSSL 3.x][10] return `SSL_ERROR_SYSCALL` early for empty reads, instead allowing them to be processed for retryable state. BoringSSL [appears to have diverged][7] from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their [previous implementation][18] closely resembles [that of OpenSSL 1.0.2][13], guarding retryable error processing with `i < 0` checks. `SSL_get_error` in [OpenSSL 1.1.1][12] and [OpenSSL 3.0][11], however, have no such guards. This change resembles the latter two. Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (`SSL_read`, `SSL_write`, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, their `size_t`-clean counterparts (`SSL_read_ex`, `SSL_write_ex`, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various `_ex` functions to `SSL_get_error` loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families. Another solution to the `_ex` functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls to `SSL_read` or `SSL_write`. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling. # Relevant Links - [OpenSSL 1.0.2 documentation for `SSL_get_error`][3] - [OpenSSL 1.1.1 documentation for `SSL_get_error`][4] - [OpenSSL 3.0 documentation for `SSL_get_error`][5] - [OpenSSL 1.0.2 implementation for `SSL_get_error`][13] - [OpenSSL 1.1.1 implementation for `SSL_get_error`][12] - [OpenSSL 3.0 implementation for `SSL_get_error`][11] - [OpenSSL master (3.2) implementation for `SSL_get_error`][10] - [OpenSSL 1.1.1 commit dropping `ret < 0` guards for hint processing][6] - [Unresolved OpenSSL issue for distinguishing EOF from failure in `BIO_read_ex`][1] - [BoringSSL commit introducing `SSL_ERROR_SYSCALL` for `ret == 0` case][7] - [BoringSSL Issue 503: Stop preserving arbitrary BIO error return values][2] - [BoringSSL Issue 24: Sanitize SSL and BIO return codes.][15] - [OpenSSL 3.0 commit returning `SSL_ERROR_SSL`, add reason to err stack on EOF][14] - [OpenSSL 3.0 commit introducing `SSL_OP_IGNORE_UNEXPECTED_EOF` option][8] - [CPython commit from @davidben removing CPython's use of `SSL_OP_IGNORE_UNEXPECTED_EOF`][9] - [CPython commit switching from `SSL_read` and `SSL_write` to `SSL_read_ex` and `SSL_write_ex`][16] - [OpenSSL 1.1.1 documentation for `SSL_MODE_AUTO_RETRY`][17] [1]: openssl/openssl#8208 [2]: https://bugs.chromium.org/p/boringssl/issues/detail?id=503 [3]: https://www.openssl.org/docs/man1.0.2/man3/SSL_get_error.html [4]: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html [5]: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html [6]: openssl/openssl@8051ab2 [7]: google/boringssl@9a38e92 [8]: openssl/openssl@09b90e0 [9]: python/cpython#95495 [10]: https://github.com/openssl/openssl/blob/master/ssl/ssl_lib.c#L4601 [11]: https://github.com/openssl/openssl/blob/openssl-3.0/ssl/ssl_lib.c#L3839 [12]: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/ssl/ssl_lib.c#L3617 [13]: https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/ssl_lib.c#L2709 [14]: openssl/openssl@d924dbf [15]: https://bugs.chromium.org/p/boringssl/issues/detail?id=24 [16]: python/cpython@89d1550 [17]: https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_mode.html [18]: google/boringssl@fcf2583#diff-38e1eecb6013c2fe8bf194e5020ce1afd6e31c371d7849e041464ad822fe745bL2271
1. This adds an integration CI dimension for Bind9 2. Resolved "cmocka unit tests" for Bind9 * Additional <openssl/asn1.h> import in <openssl/objects.h>: Bind depends on some ASN1 functions, but does not directly import the corresponding header. OpenSSL imports the asn1 header file in objects.h (which Bind is pulling these symbols from), so I've added the header file reference to objects.h. * SSL_get_error error anticipation fixing: There were several failures discovered to be related this, thanks to research done in Implement SSL_MODE_AUTO_RETRY #1333. The issue was pinned down the check implemented in google/boringssl@9a38e92. This check used to exist before the final return of SSL_get_error in OpenSSL. Upstream moved this earlier in the function with google/boringssl@fcf2583. However, much of the functions guards for i < 0 checks have been removed since OpenSSL 1.1.1, so the early logic no longer applies. This check has evolved into SSL_ERROR_ZERO_RETURN in our code. Moving the check further down helps us gain better parity with OpenSSL 1.1.1. Doing so passes the bind test failures for proxystream_test, tls_test, and doh_test. This also happens to help our integration with CPython, so I've reconfigured that patch. We actually already use SSL_AUTO_RETRY by default in AWS-LC. The recent change mentioned in the point above surrounding the flag (208327e) was just to make some of the errors consistent in CPython when the flag was used. I've reverted the special behavior surrounding it since it should no longer be needed. * Assertion for SSL_set_shutdown: The assertion was added in 63006a9, where it’s stated that we didn’t want SSL_set_shutdown messing up the state machine. This assertion is causing failures in tlsdns_test for Bind9, so it appears that we'll have to remove this to gain better OpenSSL parity. 3. Patch file needed for Bind seems to be slight bug in their build configuration. This was from a fairly recent commit. We can look to contribute this sometime soon.
Issues:
Addresses CryptoAlg-2136
Description of changes:
Notes
This change implements
SSL_MODE_AUTO_RETRY
, which was previously ignored. When this option (off by default) is set,SSL_get_error
s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes asSSL_ERROR_SYSCALL
, using it as a sort-of alias for EOF in the underlying transport. Internal calls toSSL_get_error
(such as those inbio_ssl.cc
) then observe theSSL_ERROR_SYSCALL
and fail to process potential retries.OpenSSL implements this mode differently, only checking its value in its main read loop. Neither OpenSSL 1.1.1 nor OpenSSL 3.x return
SSL_ERROR_SYSCALL
early for empty reads, instead allowing them to be processed for retryable state. BoringSSL appears to have diverged from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their previous implementation closely resembles that of OpenSSL 1.0.2, guarding retryable error processing withi < 0
checks.SSL_get_error
in OpenSSL 1.1.1 and OpenSSL 3.0, however, have no such guards. This change resembles the latter two.Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (
SSL_read
,SSL_write
, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, theirsize_t
-clean counterparts (SSL_read_ex
,SSL_write_ex
, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various_ex
functions toSSL_get_error
loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families.Another solution to the
_ex
functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls toSSL_read
orSSL_write
. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling.Relevant Links
SSL_get_error
SSL_get_error
SSL_get_error
SSL_get_error
SSL_get_error
SSL_get_error
SSL_get_error
ret < 0
guards for hint processingBIO_read_ex
SSL_ERROR_SYSCALL
forret == 0
caseSSL_ERROR_SSL
, add reason to err stack on EOFSSL_OP_IGNORE_UNEXPECTED_EOF
optionSSL_OP_IGNORE_UNEXPECTED_EOF
SSL_read
andSSL_write
toSSL_read_ex
andSSL_write_ex
SSL_MODE_AUTO_RETRY
Call-outs:
Testing:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.