Bug: Mongo ObjectIds not serializing out of the box #3892

Closed
@aitchnyu

Description

Expected Behaviour

My app passes a custom serializer to APIGatewayRestResolver and sets enable_validation=True. A dict containing a Mongo ObjectId should have been serialized.

from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from bson import json_util
...
def custom_serializer(obj) -> str:
    ...
    """Your custom serializer function ApiGatewayResolver will use"""
    return json_util.dumps(obj)

app = APIGatewayRestResolver(serializer=custom_serializer, enable_validation=True)
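For readers without bson installed, the shape of such a serializer (any object in, JSON string out) can be sketched with the standard library alone. This is only an illustration: FakeObjectId and stdlib_serializer are hypothetical names, and the output format differs from what json_util.dumps produces.

```python
import json
from typing import Any


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId, used only for this sketch."""

    def __init__(self, hex_id: str) -> None:
        self._hex_id = hex_id

    def __str__(self) -> str:
        return self._hex_id


def _default(value: Any) -> Any:
    # json.dumps calls this hook for any value it cannot encode natively.
    if isinstance(value, FakeObjectId):
        return str(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


def stdlib_serializer(obj: Any) -> str:
    """Same contract as custom_serializer above: takes an object, returns a JSON string."""
    return json.dumps(obj, default=_default)


body = stdlib_serializer({"_id": FakeObjectId("666f6f2d6261722d71757578")})
```

The point is only that the resolver-level serializer already knows how to handle the extra type, so the expectation is that validation would not get in its way.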

Current Behaviour

If we use enable_validation, the OpenAPI module uses its own encoders, which cannot handle ObjectIds.

Traceback (most recent call last):
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 265, in _dump_other
data = dict(obj)
TypeError: 'ObjectId' object is not iterable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 269, in _dump_other
data = vars(obj)
TypeError: vars() argument must have __dict__ attribute
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/var/task/aws_lambda_powertools/event_handler/api_gateway.py", line 1985, in _call_route
route(router_middlewares=self._router_middlewares, app=self, route_arguments=route_arguments),
File "/var/task/aws_lambda_powertools/event_handler/api_gateway.py", line 400, in __call__
return self._middleware_stack(app)
File "/var/task/aws_lambda_powertools/event_handler/api_gateway.py", line 1291, in __call__
return self.current_middleware(app, self.next_middleware)
File "/var/task/aws_lambda_powertools/event_handler/middlewares/base.py", line 121, in __call__
return self.handler(app, next_middleware)
File "/var/task/aws_lambda_powertools/event_handler/middlewares/openapi_validation.py", line 121, in handler
return self._handle_response(route=route, response=response)
File "/var/task/aws_lambda_powertools/event_handler/middlewares/openapi_validation.py", line 128, in _handle_response
response.body = self._serialize_response(
File "/var/task/aws_lambda_powertools/event_handler/middlewares/openapi_validation.py", line 187, in _serialize_response
return jsonable_encoder(response_content)
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 108, in jsonable_encoder
return _dump_dict(
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 212, in _dump_dict
encoded_value = jsonable_encoder(
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 138, in jsonable_encoder
return _dump_other(
File "/var/task/aws_lambda_powertools/event_handler/openapi/encoders.py", line 272, in _dump_other
raise ValueError(errors) from e
ValueError: [TypeError("'ObjectId' object is not iterable"), TypeError('vars() argument must have __dict__ attribute')]
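The two TypeErrors in the traceback can be reproduced without Powertools or bson. The sketch below assumes the fallback tries dict(obj) and then vars(obj), as the traceback suggests, and uses a hypothetical slotted class (FakeObjectId) as a stand-in for ObjectId; dump_other is an illustrative name, not the library function.

```python
class FakeObjectId:
    """Hypothetical stand-in: not iterable and, because of __slots__, has no __dict__."""

    __slots__ = ("_hex_id",)

    def __init__(self, hex_id: str) -> None:
        self._hex_id = hex_id


def dump_other(obj):
    """Try dict(obj), then vars(obj); raise ValueError with both errors if neither works."""
    errors = []
    try:
        return dict(obj)
    except Exception as e:
        errors.append(e)
    try:
        return vars(obj)
    except Exception as e:
        errors.append(e)
        raise ValueError(errors) from e


failure = None
try:
    dump_other(FakeObjectId("666f6f2d6261722d71757578"))
except ValueError as exc:
    # Keep a reference so the collected errors can be inspected afterwards.
    failure = exc
```

Any opaque value type with no registered encoder would presumably fail the same way, not only ObjectId.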

Code snippet

# Any object which contains a value of type Mongo ObjectId

ObjectId('666f6f2d6261722d71757578')

Possible Solution

My workaround is to monkey patch the OpenAPI encoder. Ideally, the OpenAPI module should reuse the JSON serializer passed to APIGatewayRestResolver, or allow plugging in our own function.

from bson import ObjectId
from aws_lambda_powertools.event_handler.openapi.encoders import ENCODERS_BY_TYPE
ENCODERS_BY_TYPE[ObjectId] = lambda o: str(o)
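Why registering a type helps can be shown with a simplified model of a type-keyed encoder registry. Everything here is hypothetical (ENCODERS, encode, FakeObjectId); it mimics the lookup pattern only and is not the library's code.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a registry like ENCODERS_BY_TYPE: maps a type to its encoder.
ENCODERS: Dict[type, Callable[[Any], Any]] = {}


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId."""

    def __init__(self, hex_id: str) -> None:
        self._hex_id = hex_id

    def __str__(self) -> str:
        return self._hex_id


def encode(obj: Any) -> Any:
    """Recursively convert obj into JSON-compatible values, consulting the registry last."""
    if isinstance(obj, (str, int, float, type(None))):
        return obj
    if isinstance(obj, dict):
        return {key: encode(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode(item) for item in obj]
    encoder = ENCODERS.get(type(obj))
    if encoder is None:
        raise ValueError(f"No encoder registered for {type(obj).__name__}")
    return encoder(obj)


# Registering the type is the whole workaround: str() becomes its encoder.
ENCODERS[FakeObjectId] = str
result = encode({"_id": FakeObjectId("666f6f2d6261722d71757578"), "tags": ["a", "b"]})
```

The drawback of this approach is that it mutates module-level state in the library, which is why a supported hook would be preferable.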

Steps to Reproduce

Serialize an object that contains a Mongo ObjectId

ObjectId('666f6f2d6261722d71757578')

Powertools for AWS Lambda (Python) version

latest

AWS Lambda function runtime

3.8

Packaging format used

PyPi

Debugging logs

No response

Activity

added the labels bug (Something isn't working) and triage (Pending triage from maintainers) on Mar 6, 2024
changed the title from "Bug: TITLE" to "Bug: Mongo ObjectIds not serializing out of the box" on Mar 6, 2024
rubenfonseca (Contributor) commented on Mar 7, 2024

Looking at this now

rubenfonseca (Contributor) commented on Mar 7, 2024

Thank you for reporting this. Do you think it makes sense to include the custom serializer as part of the OpenAPI module? We could try to do that too.

self-assigned this on Mar 7, 2024
moved this from Triage to Working on it in Powertools for AWS Lambda (Python) on Mar 7, 2024
aitchnyu (Author) commented on Mar 7, 2024

Yeah, I would like that.

rubenfonseca (Contributor) commented on Mar 7, 2024

@aitchnyu I've written a simple PR that should enable you to use the custom serializer as intended. Can you check it to see if it looks good to you? I would appreciate your feedback :)

aitchnyu (Author) commented on Mar 7, 2024

Maybe I'm too tired to brain this, but

  1. If I wanted to override BaseModel serialization, this would still use jsonable_encoder instead of my serializer, right?
  2. Since this is a recursive function, shouldn't the recursive calls also get the serializer?

def jsonable_encoder(  # noqa: PLR0911
    obj: Any,
    include: Optional[IncEx] = None,
    exclude: Optional[IncEx] = None,
    by_alias: bool = True,
    exclude_unset: bool = False,
    exclude_defaults: bool = False,
    exclude_none: bool = False,
    custom_serializer: Optional[Callable[[Any], str]] = None,
) -> Any:
    ...
    if include is not None and not isinstance(include, (set, dict)):
        include = set(include)
    if exclude is not None and not isinstance(exclude, (set, dict)):
        exclude = set(exclude)
    # Pydantic models
    if isinstance(obj, BaseModel):
        return _dump_base_model(
            obj=obj,
            include=include,
            exclude=exclude,
            by_alias=by_alias,
            exclude_unset=exclude_unset,
            exclude_none=exclude_none,
            exclude_defaults=exclude_defaults,
        )
    # Dataclasses
    if dataclasses.is_dataclass(obj):
        obj_dict = dataclasses.asdict(obj)
        return jsonable_encoder(
            obj_dict,
            include=include,
            exclude=exclude,
            by_alias=by_alias,
            exclude_unset=exclude_unset,
            exclude_defaults=exclude_defaults,
            exclude_none=exclude_none,
        )
    # Enums
    if isinstance(obj, Enum):
        return obj.value
    # Paths
    if isinstance(obj, PurePath):
        return str(obj)
    # Scalars
    if isinstance(obj, (str, int, float, type(None))):
        return obj
    # Dictionaries
    if isinstance(obj, dict):
        return _dump_dict(
            obj=obj,
            include=include,
            exclude=exclude,
            by_alias=by_alias,
            exclude_none=exclude_none,
            exclude_unset=exclude_unset,
        )
    # Sequences
    if isinstance(obj, (list, set, frozenset, GeneratorType, tuple, deque)):
        return _dump_sequence(
            obj=obj,
            include=include,
            exclude=exclude,
            by_alias=by_alias,
            exclude_none=exclude_none,
            exclude_defaults=exclude_defaults,
            exclude_unset=exclude_unset,
        )
    # Other types
    if type(obj) in ENCODERS_BY_TYPE:
        return ENCODERS_BY_TYPE[type(obj)](obj)
    for encoder, classes_tuple in encoders_by_class_tuples.items():
        if isinstance(obj, classes_tuple):
            return encoder(obj)
    # Use custom serializer if present
    if custom_serializer:
        return custom_serializer(obj)
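
To make the second question concrete, here is a minimal sketch of a recursive encoder that forwards the serializer on every recursive call. The names (encode, Opaque) are hypothetical and the function is a simplification for illustration, not the code from the PR.

```python
from typing import Any, Callable, Optional


class Opaque:
    """Hypothetical value type that the encoder has no built-in rule for."""

    def __init__(self, label: str) -> None:
        self.label = label

    def __str__(self) -> str:
        return self.label


def encode(obj: Any, custom_serializer: Optional[Callable[[Any], str]] = None) -> Any:
    """Recursively encode obj, passing custom_serializer down to nested values."""
    if isinstance(obj, (str, int, float, type(None))):
        return obj
    if isinstance(obj, dict):
        # Forwarding the serializer here is what lets nested values reach it.
        return {k: encode(v, custom_serializer=custom_serializer) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode(item, custom_serializer=custom_serializer) for item in obj]
    if custom_serializer is not None:
        return custom_serializer(obj)
    raise ValueError(f"Cannot encode {type(obj).__name__}")


nested = {"items": [{"_id": Opaque("abc")}]}
encoded = encode(nested, custom_serializer=str)
```

If the dict and list branches dropped the custom_serializer argument, the nested Opaque value would hit the final raise even though a serializer was supplied at the top level.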


Metadata

Labels: bug (Something isn't working)
Type: No type
Project status: Shipped
Milestone: No milestone
Relationships: None yet
Participants: @rubenfonseca, @aitchnyu

Bug: Mongo ObjectIds not serializing out of the box · Issue #3892 · aws-powertools/powertools-lambda-python