Setting of SSL_CERT_FILE in newer images causes issues with PInvoke C extensions #1973

Open
@cretz

Describe the bug

As part of #1661, the AWS .NET 8 image forcefully sets the SSL_CERT_FILE env var to a no-op file, because it was assumed that only the .NET code/runtime would use such a file. However, some .NET apps have C/PInvoke extensions that do not expect this to be set to an empty file.

In our case (https://github.com/temporalio/sdk-dotnet), we have a Rust extension that uses https://github.com/hyperium/tonic which uses https://github.com/rustls/rustls-native-certs which uses https://github.com/alexcrichton/openssl-probe. When this was set to /tmp/noop in #1661 this didn't get used because the file didn't exist, but once it became a real file in #1663, now every Rust library using TLS this way (most I assume) does not have a CA cert bundle and therefore fails to validate server certs during TLS connections. And really any TLS library properly respecting SSL_CERT_FILE will have this problem.

Is there any way to solve whatever was trying to be solved in #1661 without setting an override variable that affects all TLS libraries?

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Expect .NET extensions to be able to communicate over TLS same as if they were running on any other normal image. And expect OpenSSL environment variable overrides to not be present so default behavior still occurs. This was the case I believe in the .NET 6 runtime.

Current Behavior

Libraries that (properly) support SSL_CERT_FILE env var overrides now fail where they used to succeed (e.g. I believe in the .NET 6 runtime).

Reproduction Steps

I admit I have not built a standalone reproduction, though I can if I must (it takes a while to set up an entire project with a Rust PInvoke extension). Here's an admittedly unfair/obvious reproduction:

if (Environment.GetEnvironmentVariable("SSL_CERT_FILE") != null)
{
    throw new InvalidOperationException("SSL cert file default overridden implicitly");
}

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

Amazon.Lambda.Core 2.5.0 (default with code template)

Targeted .NET Platform

.NET 8

Operating System and version

AmazonLinux 2023

Activity

Labels added: bug, needs-triage (on Feb 7, 2025)

ashishdhingra (Contributor) commented on Feb 10, 2025

@cretz Thanks for opening the issue. As you correctly mentioned, since #1663 bootstrap-al2023.sh sets SSL_CERT_FILE to /var/runtime/empty-certificates.crt if it is not explicitly set. Could your Rust extension check the SSL_CERT_FILE environment variable and, if it is set to /var/runtime/empty-certificates.crt, reset it to empty or to a value that suits your scenario?

CC @normj for inputs if some alternative approach could be followed.
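
For readers weighing the suggestion above, here is a minimal Rust sketch of such a check. It is an illustration under stated assumptions, not code from sdk-dotnet or from the Lambda runtime: the helper names `effective_cert_file` and `cert_file_from_env` are hypothetical, and the only detail taken from this thread is the placeholder path. The sketch deliberately does not modify the process environment; it only decides whether the value should be treated as a real override.

```rust
use std::env;

/// Placeholder path that the Lambda bootstrap script assigns to
/// SSL_CERT_FILE, as quoted in this thread.
const LAMBDA_EMPTY_CERT_FILE: &str = "/var/runtime/empty-certificates.crt";

/// Hypothetical helper: returns the cert file the extension should honor.
/// The Lambda placeholder is treated as "not set", so the caller can fall
/// back to its normal certificate discovery; any other value is passed
/// through unchanged.
fn effective_cert_file(raw: Option<&str>) -> Option<&str> {
    match raw {
        Some(path) if path == LAMBDA_EMPTY_CERT_FILE => None,
        other => other,
    }
}

/// Hypothetical helper: reads SSL_CERT_FILE from the environment and
/// applies the rule above. A missing or non-UTF-8 value yields None.
fn cert_file_from_env() -> Option<String> {
    let raw = env::var("SSL_CERT_FILE").ok();
    effective_cert_file(raw.as_deref()).map(str::to_owned)
}
```

An extension could call `cert_file_from_env()` once before building its TLS configuration and only load a custom bundle when it returns `Some(..)`. How the result is wired into a given TLS stack depends on that stack and is not shown here.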

Labels added: p2, response-requested; label removed: needs-triage (on Feb 10, 2025)

normj (Member) commented on Feb 12, 2025

@cretz Without this change there were some pretty significant cold start regressions when moving to .NET 8 and AL2023. When we create the next managed runtime, .NET 10, we should be able to take this change out, since the PR we upstreamed to the .NET runtime was merged and shipped as part of .NET 9. Until then, if SSL_CERT_FILE pointing to an empty file is causing problems, your best workaround is to set SSL_CERT_FILE to /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem on your Lambda function. You will then have the double load of certs that causes the cold start regression, assuming you are making HTTPS calls from .NET.

cretz (Author) commented on Feb 12, 2025

Definitely don't want to re-introduce the performance issue this solves. It's just unfortunate that the only way to fix it is to set an environment variable that is commonly used by all TLS implementations, instead of some kind of fix specific to the .NET stdlib one.

However, if there is no way to fix the .NET-TLS-specific performance issue without setting a general TLS setting, then I understand there is not much that can be done. It may still be worth leaving the issue open for others who may hit similar issues.

normj (Member) commented on Feb 12, 2025

I agree, I wish we could have done something that was only inside the .NET side that we control in Lambda. The issue was deep in the private layers of the .NET base libraries, and the only knob to adjust the behavior was the SSL environment variables.

Labels removed: response-requested, potential-regression (on Feb 13, 2025)