-
Notifications
You must be signed in to change notification settings - Fork 5
Lambda Powertools Idempotency Support for SQS Batch Processor #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hey @kimberlyamandalu, thank you so much for flagging this. We've been meaning to create or update Idempotency utility to become a decorator for any standalone function, not just Lambda handler. Adding this to backlog as we totally should do this and appreciate any help on implementing it cc @cakepietoast who did the original implementation for ideas. |
Just had a look at the source code and it'd indeed need some major internal changes to support any function with any signature. It'd be relatively quick to support any function with a single parameter (like Batch Processor Handler), but if we want to do it right, we should support any function not just the processor handler. @cakepietoast when you get some bandwidth, could you look into it as you're way more familiar with the implementation? |
@heitorlessa Thank you very much for adding this to your backlog. @cakepietoast thanks for looking into it! |
Yes! That can be easily replaced with fn.__name__ since is a sync function
- I think it was an oversight from our part.
…On Mon, 16 Aug 2021 at 14:32, kimberlyamandalu ***@***.***> wrote:
@heitorlessa <https://github.com/heitorlessa> Thank you very much for
adding this to your backlog.
We are looking into the code as well. Will definitely share any findings
or suggestions for how to implement. At the moment, what we've noticed is
that, the idempotency utility is quite dependent on the Lambda Context
object, but it mainly uses it to add the Lambda function name to the
primary key value in the DynamoDB table, which makes sense.
@cakepietoast <https://github.com/cakepietoast> thanks for looking into
it!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#32 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZPQBATI7S6SS3UI6LUGOTT5EAO3ANCNFSM5CCEWOCQ>
.
|
Thanks for moving this so quickly to Working on it! One thought on removing the need for Lambda Context object just to access the Lambda Function Name for building the Dynamo Key is to leverage the default Lambda Environment Variable for the Lambda Function Name - |
As I was getting to know better the idempotency logic I realized if I remove it it'll break people's unit tests. Relying on I'll create a new decorator instead, though I'm pondering on two solutions and would appreciate your help deciding it: A. Single argument only: Accept any function with a single argument where the argument is the actual payload we'll hash and optionally extract idempotency tokens ( B. Function parameter signature: Accept any function with an arbitrary number of parameters, but must have a keyword named argument named C. For example:
# WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(record: Dict): ...
# WILL NOT WORK
@another_idempotent_name_i_need_to_figure_out
def record_handler(record: Dict, my_other_arg: str = ""): ...
# WILL NOT WORK
@another_idempotent_name_i_need_to_figure_out
def record_handler(record: Dict, my_other_arg: str): ...
# WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(*, payload: dict): ...
# WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(*, payload: dict, **kwargs): ...
# MIGHT WORK, DEPENDS ON HOW IT'S CALLED
@another_idempotent_name_i_need_to_figure_out
def record_handler(payload: dict, **kwargs): ...
# WORKS
# record_handler(payload={}, foo="bar")
# WILL NOT WORK
# record_handler({}, foo="bar") |
I personally prefer Option B # WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(*, payload: dict, **kwargs): ... I think the flexibility this will offer allows the idempotent utility for non lambda handlers to be easily extensible, if needed. Let me know if this makes sense or if I have terribly misunderstood your point. Thank you!! |
Oh I see, you want a third option in this case then @kimberlyamandalu like:
We could do that too, I didn't factor that in because of the JMESPath feature. This feature allows you to select a portion of the incoming payload and extract what you want as Idempotency tokens. I'll get back to the whiteboard, write a RFC and then ping it back here - meanwhile, I'll have another look at creative solutions to simply integrate the Batch utility as-is too Thanks a lot for the help |
hey @heitorlessa, To clarify, I picked one of the choices you had listed for Option B (the second one that works). Wanted to make sure we were on the same page =) Currently, the JMESPath feature is included in the Idempotency Config object that is passed to the idempotent function. We do want to have this feature retained. I think it is extremely useful. # WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(*, payload: dict): ...
##### SELECTED THIS OPTION #####
# WORKS
@another_idempotent_name_i_need_to_figure_out
def record_handler(*, payload: dict, **kwargs): ...
# MIGHT WORK, DEPENDS ON HOW IT'S CALLED
@another_idempotent_name_i_need_to_figure_out
def record_handler(payload: dict, **kwargs): ...
# WORKS
# record_handler(payload={}, foo="bar")
# WILL NOT WORK
# record_handler({}, foo="bar") |
Hey @kimberlyamandalu I've updated Slack but updating it here too. We've finally come to an acceptable design to keep all features. We had to make three compromises for this to work:
For the Batch utility, you don't have to worry about 3 as we are doing this on your behalf. Assuming you want to make sure you only process a SQS record that has a unique message ID once, this is how it'll look like in the next release: from aws_lambda_powertools.utilities.batch import sqs_batch_processor
from aws_lambda_powertools.utilities.idempotency import idempotent_function, DynamoDBPersistenceLayer, IdempotencyConfig
dynamodb = DynamoDBPersistenceLayer(table_name="idem")
config = IdempotencyConfig(
event_key_jmespath="messageId",
use_local_cache=True,
)
@idempotent_function(data_keyword_argument="record", persistence_layer=dynamodb, config=config)
def record_handler(record):
return do_something_with(record["body"])
@sqs_batch_processor(record_handler=record_handler)
def lambda_handler(event, context):
return {"statusCode": 200}
Here's a dummy example to demonstrate the constraint 3. from aws_lambda_powertools.utilities.idempotency import idempotent_function, DynamoDBPersistenceLayer, IdempotencyConfig
import uuid
records = {
"messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
"receiptHandle": "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a...",
"body": "Test message.",
"attributes": {
"ApproximateReceiveCount": "1",
"SentTimestamp": "1545082649183",
"SenderId": "AIDAIENQZJOLO23YVJ4VO",
"ApproximateFirstReceiveTimestamp": "1545082649185"
},
"messageAttributes": {
"testAttr": {
"stringValue": "100",
"binaryValue": "base64Str",
"dataType": "Number"
}
},
"md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
"eventSource": "aws:sqs",
"eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
"awsRegion": "us-east-2"
}
persistence_layer = DynamoDBPersistenceLayer(table_name="idem")
config = IdempotencyConfig(
event_key_jmespath="messageId",
use_local_cache=True,
)
@idempotent_function(persistence_store=persistence_layer, data_keyword_argument="data", config=config)
def handler(any_arg, data, and_kwarg_too=False):
"""Dummy function with an arbitrary number of args and named args"""
return {"some_id": f"{str(uuid.uuid4())}"}
ret = handler("blah", data=records, and_kwarg_too=True)
print(ret) |
Looks perfect - thanks again for picking it up and moving on it so fast! Should be seamless for us to move to this after the next release! |
Merged! Coming in today's release |
We use the Powertools SQS Batch Processor utility with our Lambda function. The documentation suggests implementing idempotency into the function logic. We were trying to leverage the Powertools Idempotent feature. There seems to be some limitations with this approach as the Idempotency feature only checks against the entire SQS batch of records instead of each individual record in the batch. So kind of defeats the purpose of the SQS Batch Processor.
Is there any way we can utilize powertools idempotency utility in our lambda with the SQS batch processor utility having the idempotency utility check at an individual record rather than the full batch.
Reading through the source code, it seems like it assumes that it will wrap the lambda handler only so it seems impossible?
Thanks.
The text was updated successfully, but these errors were encountered: