Description
Image used:
amazon/aws-lambda-python:3.7
Dockerfile:
FROM amazon/aws-lambda-python:3.7
RUN yum -y install gcc
COPY ./requirements.txt ./requirements.txt
RUN pip3 install -r requirements.txt
COPY ./lambda_function.py ./lambda_function.py
RUN python -c 'import sys; print(sys.path)'
RUN pip3 freeze | grep idna
CMD [ "lambda_function.lambda_handler" ]
requirements.txt:
snowflake-connector-python==2.3.10
snowflake-sqlalchemy==1.2.4
lambda_function.py:
import pkg_resources
import sys
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

def lambda_handler(event, context):
    print(sys.path)
    print(pkg_resources.working_set.by_key['idna'])
    engine = create_engine(
        URL(account='account', user='user',
            database='"DATABASE"', warehouse='warehouse')
    )
    return {'success': 200}
What I would expect:
Dependencies are installed correctly, and lambda_handler imports them and executes properly on Lambda.
What is the case:
Log from Lambda:
[ERROR] ContextualVersionConflict: (idna 3.1 (/var/runtime), Requirement.parse('idna<3,>=2.5'), {'requests', 'snowflake-connector-python'})
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 17, in lambda_handler
    database='"DATABASE"', warehouse='warehouse')
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 520, in create_engine
    return strategy.create(*args, **kwargs)
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 61, in create
    entrypoint = u._get_entrypoint()
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/url.py", line 172, in _get_entrypoint
    cls = registry.load(name)
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 275, in load
    return impl.load()
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2461, in load
    self.require(*args, **kwargs)
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2484, in require
    items = working_set.resolve(reqs, env, installer, extras=self.extras)
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 792, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
snowflake-connector-python imports a different version of its dependency than the one pip installed during the Docker build. It then fails because idna 3.1 does not satisfy its requirement: idna<3,>=2.5.
Why it tries to import a different version is suggested by the logging I have added in the Dockerfile and the lambda_function.
Dockerfile: RUN python -c 'import sys; print(sys.path)'
---> Running in 5c925a9e311a
['', '/var/lang/lib/python37.zip', '/var/lang/lib/python3.7', '/var/lang/lib/python3.7/lib-dynload', '/var/lang/lib/python3.7/site-packages']
lambda_function.lambda_handler: print(sys.path)
['/var/task', '/opt/python/lib/python3.7/site-packages', '/opt/python', '/var/runtime', '/var/lang/lib/python37.zip', '/var/lang/lib/python3.7', '/var/lang/lib/python3.7/lib-dynload', '/var/lang/lib/python3.7/site-packages', '/opt/python/lib/python3.7/site-packages', '/opt/python']
Dockerfile: RUN pip3 freeze | grep idna
idna==2.10
lambda_function.lambda_handler: print(pkg_resources.working_set.by_key['idna'])
idna 3.1
What happens is that the Lambda Runtime places the /var/runtime directory ahead of /var/lang/lib/python3.7 and populates the pkg_resources.WorkingSet with the distributions installed there (mostly boto3 and its dependencies). This is carried over to the lambda handler, which is then executed with the "overridden" libraries. Judging by sys.path at the moment the handler executes, I presume it has been manually modified so that the user-provided libraries in /var/task come first instead of the runtime path. Why is /opt/python/lib/python3.7/site-packages second if it is not the pip site-packages directory?
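To check which sys.path entry supplies each copy of a package, a stdlib-only sketch along these lines could be run inside the handler. Note that find_distribution_copies is a hypothetical helper written for this report, not part of the Lambda runtime or pkg_resources, and it assumes the metadata directories follow the usual name-version.dist-info / .egg-info naming.

```python
import os
import re
import sys


def _normalize(name):
    # Treat "-", "_" and "." as equivalent when comparing project names.
    return re.sub(r"[-_.]+", "_", name).lower()


def find_distribution_copies(name, paths=None):
    """Return (path_entry, version) for each copy of `name`, in search order."""
    paths = sys.path if paths is None else paths
    wanted = _normalize(name)
    found, seen = [], set()
    for entry in paths:
        # Skip duplicates and anything that is not a directory (zip files, '').
        if entry in seen or not os.path.isdir(entry):
            continue
        seen.add(entry)
        for item in sorted(os.listdir(entry)):
            stem, ext = os.path.splitext(item)
            if ext not in (".dist-info", ".egg-info"):
                continue
            # e.g. "idna-3.1" or "idna-2.10-py3.7" -> name, version, extras
            parts = stem.split("-")
            if len(parts) >= 2 and _normalize(parts[0]) == wanted:
                found.append((entry, parts[1]))
    return found
```

Calling print(find_distribution_copies('idna')) in the handler lists every directory on sys.path that carries an idna distribution, in the order Python searches them, which makes it easier to see whether a copy under /var/runtime comes before the pip-installed one.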
The outcome is really confusing - using Docker I expect to be able to handle my runtime and (at least) my dependencies. I definitely expect the Lambda Runtime to be transparent and its dependencies not to impact my workload. Especially if the bug is this opaque and undocumented.
I was able to work around it with the following hack:
import pkg_resources
import sys
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

def lambda_handler(event, context):
    # Put the pip site-packages directory ahead of /var/runtime.
    entry = '/var/lang/lib/python3.7/site-packages'
    sys.path = [entry] + sys.path
    # Replace the distributions registered from /var/runtime with the pip-installed ones.
    for dist in pkg_resources.find_distributions(entry, True):
        pkg_resources.working_set.add(dist, entry, False, replace=True)
    [...]
This may be dangerous if the manually added libraries in turn conflict with downstream code in the runtime. I think, however, that something of this kind could be implemented in the runtime itself so that the handler uses only the environment's libraries.
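One caveat with the hack as written: it prepends the entry on every invocation, so sys.path would presumably keep growing on warm starts. A minimal sketch of an idempotent variant follows; prioritize_path is a hypothetical helper name, and the directory is the one taken from the logs in this report. It only covers the sys.path part, so the pkg_resources.working_set refresh from the hack would still be needed.

```python
import sys


def prioritize_path(entry, path):
    """Return a copy of `path` with `entry` first and later duplicates removed."""
    return [entry] + [p for p in path if p != entry]


# Assumption: this is the pip site-packages directory seen in the logs above.
SITE_PACKAGES = "/var/lang/lib/python3.7/site-packages"

# Update sys.path in place so that repeated calls do not add duplicates.
sys.path[:] = prioritize_path(SITE_PACKAGES, sys.path)
```

Running this at module level, before the third-party imports, would apply it once per cold start rather than once per invocation.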