Skip to content

Cannot import batch with pydantic already imported, without email-validator #998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
huonw opened this issue Feb 4, 2022 · 6 comments
Closed
Assignees
Labels
feature-request feature request v2

Comments

@huonw
Copy link
Contributor

huonw commented Feb 4, 2022

What were you trying to accomplish?

We're trying to use the batch utility to process SQS records. We're using Pydantic for various things, including the messages pushed into SQS. If we've imported pydantic before importing aws_lambda_powertools.utilities.batch we hit an exception due to email-validator not being installed.

We're not using pydantic's EmailStr, and so haven't installed email-validator and would prefer not to if possible. (NB. email-validator itself is small (< 100K), but it seems to require dnspython which is almost 2MB.)

Expected Behavior

Using batch classes shouldn't require email-validator to be installed, even if Pydantic has already been imported.

Current Behavior

Traceback (most recent call last):
  File "pydantic/networks.py", line 409, in pydantic.networks.import_email_validator
ModuleNotFoundError: No module named 'email_validator'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../site-packages/aws_lambda_powertools/utilities/batch/__init__.py", line 7, in <module>
    from aws_lambda_powertools.utilities.batch.base import (
  File ".../site-packages/aws_lambda_powertools/utilities/batch/base.py", line 36, in <module>
    from aws_lambda_powertools.utilities.parser.models import DynamoDBStreamRecordModel
  File ".../site-packages/aws_lambda_powertools/utilities/parser/__init__.py", line 3, in <module>
    from . import envelopes
  File ".../site-packages/aws_lambda_powertools/utilities/parser/envelopes/__init__.py", line 1, in <module>
    from .apigw import ApiGatewayEnvelope
  File ".../site-packages/aws_lambda_powertools/utilities/parser/envelopes/apigw.py", line 4, in <module>
    from ..models import APIGatewayProxyEventModel
  File ".../site-packages/aws_lambda_powertools/utilities/parser/models/__init__.py", line 32, in <module>
    from .ses import (
  File ".../site-packages/aws_lambda_powertools/utilities/parser/models/ses.py", line 21, in <module>
    class SesReceipt(BaseModel):
  File "pydantic/main.py", line 204, in pydantic.main.ModelMetaclass.__new__
  File "pydantic/fields.py", line 488, in pydantic.fields.ModelField.infer
  File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 728, in pydantic.fields.ModelField._type_analysis
  File "pydantic/fields.py", line 778, in pydantic.fields.ModelField._create_sub_type
  File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 539, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 801, in pydantic.fields.ModelField.populate_validators
  File "pydantic/networks.py", line 422, in __get_validators__
  File "pydantic/networks.py", line 411, in pydantic.networks.import_email_validator
ImportError: email-validator is not installed, run `pip install pydantic[email]`

Possible Solution

  1. A minimal quick fix would be to expand the dynamic "has_pydantic" check to look for email_validator too, although this worsens the user/typing experience for those who are using pydantic without email validation
  2. A targeted fix might be to break up the modules to reduce the flow-on consequences of a single import statement, for instance, batch could import DynamoDBStreamRecordModel (etc) in a way that doesn't trigger also import envelopes, or envelopes/apigw.py could import APIGatewayProxyEventModel in a way that doesn't import ses
  3. A more general fix might be to adjust ses (etc):
  • make it conditional on email-validator: if email-validator doesn't exist, then the models don't function, in some way that has a useful error message
  • (silently) fall back to str instead of EmailStr, if email-validator isn't installed

(Either of the last two would potentially also resolve the related issue that it is impossible to use aws_lambda_powertools.utilities.parser without email-validator, even if avoiding the SES models. See python -c 'import aws_lambda_powertools.utilities.parser'.)

Steps to Reproduce (for bugs)

pip install aws-lambda-powertools==1.24.2 aws-xray-sdk==2.9.0 boto3==1.20.48 botocore==1.23.48 fastjsonschema==2.15.3 future==0.18.2 jmespath==0.10.0 pydantic==1.9.0 python-dateutil==2.8.2 s3transfer==0.5.1 six==1.16.0 typing_extensions==4.0.1 urllib3==1.26.8 wrapt==1.13.3
python -c 'import pydantic; from aws_lambda_powertools.utilities import batch'

Note: removing the import pydantic; works fine, since the check is just based on whether pydantic has been imported, not whether it is importable.

https://github.com/awslabs/aws-lambda-powertools-python/blob/168ec482502fb4a1a22038e3a2041459926e6591/aws_lambda_powertools/utilities/batch/base.py#L31

Environment

  • Powertools version used: 1.24.2 (full freeze output in pip command above)
  • Packaging format (Layers, PyPi): N/A
  • AWS Lambda function runtime: N/A
  • Debugging logs N/A
@huonw huonw added bug Something isn't working triage Pending triage from maintainers labels Feb 4, 2022
@heitorlessa
Copy link
Contributor

How critical is this for you?

Can we finish other papercuts to make our upcoming release then work on this right after?

Because we assume customers will install our "pydantic" extra package which includes email-validator, we didn't foresee the latter lib impact since Pydantic is a big outlier already.

@ran-isenberg Question: How critical is it to have email-validator vs EmailStr given it's a SES event?

I'd strive for option 2 tho I can't recall if this can be a non-breaking change. If it is a breaking change, then we could
go for 3 with the silent fallback to EmailStr as that solves it conditionally too.

As always, thank you for helping us improve @huonw !!

@ran-isenberg
Copy link
Contributor

I think it's great to have the email validation but it's not a must. I mean, if you are using a parser, these are the areas it excels.
I dont see a big issue with adding the email validator (done it myslef), after all pydantic is not so small anyways, but that's just me.
One option is to extend the SesModel and change emailstr to str or we can provide a non emailstr version of the model too. but that would mean another envelope too.

Another option is to check if emailstr can be imported, and if not define it as str maybe?

@huonw
Copy link
Contributor Author

huonw commented Feb 4, 2022

Not critical: totally happy for you to prioritise as you see fit. As you say, this is "only" 2MB on top of the large pydantic package (and we happen to use a bunch of other large ones too).

@heitorlessa heitorlessa added feature-request feature request and removed bug Something isn't working triage Pending triage from maintainers labels Feb 7, 2022
@heitorlessa heitorlessa added good first issue Good for newcomers help wanted Could use a second pair of eyes/hands p3 area/parser labels Mar 7, 2022
mgochoa pushed a commit to mgochoa/aws-lambda-powertools-python that referenced this issue Jun 23, 2022
mgochoa pushed a commit to mgochoa/aws-lambda-powertools-python that referenced this issue Jun 23, 2022
@github-actions github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Jul 13, 2022
@heitorlessa heitorlessa removed the pending-release Fix or implementation already in dev waiting to be released label Jul 18, 2022
@heitorlessa heitorlessa removed the p3 label Aug 2, 2022
@heitorlessa
Copy link
Contributor

@huonw we're prepping for V2 this week and removed email-validator dependency, as part of other package size optimizations we've made -- For PyPi, we're now down to ~200K, and for Layers with all dependencies + compiled Pydantic down to ~2M.

Thank you for the prolonged patience here!

@heitorlessa heitorlessa added v2 and removed help wanted Could use a second pair of eyes/hands good first issue Good for newcomers labels Oct 18, 2022
@rubenfonseca rubenfonseca self-assigned this Oct 19, 2022
@heitorlessa
Copy link
Contributor

Closing as we're wrap to launch V2.

@github-actions
Copy link
Contributor

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request feature request v2
Projects
None yet
Development

No branches or pull requests

4 participants