This issue has been moved to a discussion.

Can powertools ensure the idempotence of all kinds of functions #801

Closed
@Nsupyq

Description

@Nsupyq

When reading the documentation about idempotency, I wondered whether Powertools can make every kind of function idempotent.

For example, if a function increments the value of a variable in DynamoDB or another database, I think it cannot be idempotent unless writing the function's return value and incrementing the variable are completed atomically.

I suggest that the documentation describe in detail which kinds of functions can be made idempotent with Powertools.
Please let me know if my understanding is correct.

Activity

to-mc self-assigned this on Nov 4, 2021

to-mc commented on Nov 5, 2021

@to-mc
Contributor

If you wrap any function with the @idempotent_function decorator, the entire function behaves idempotently: if it is called twice with the same arguments, its body is executed only once. This is true regardless of the contents of the function, because the utility doesn't alter the decorated function in any way. As long as the repeated calls happen within the idempotency expiry period and use the same arguments, the code in the body of the function runs only on the first call. Subsequent calls receive the same response, which the idempotency utility retrieves from its data store instead of executing the function again.
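To make the "execute once, then replay the stored response" behaviour concrete, here is a minimal sketch. It is a toy model and not the Powertools implementation: `toy_idempotent`, the in-memory `store`, and `record_order` are hypothetical names, and expiry and locking are left out.

```python
import functools
import hashlib
import json


def toy_idempotent(store):
    """Toy stand-in for an idempotency decorator (not the Powertools code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*, data):
            # Derive the idempotency key from the payload only.
            payload = json.dumps(data, sort_keys=True).encode()
            key = hashlib.sha256(payload).hexdigest()
            if key in store:
                return store[key]     # replay the saved response, skip the body
            result = func(data=data)  # first call: run the body once
            store[key] = result
            return result
        return wrapper
    return decorator


calls = []
store = {}


@toy_idempotent(store)
def record_order(*, data):
    calls.append(data)  # a side effect we only want to happen once
    return {"order": data["id"], "attempt": len(calls)}


first = record_order(data={"id": 7})
second = record_order(data={"id": 7})  # same payload: body is not run again
```

In this sketch both calls return the same dictionary and `calls` holds a single entry, which mirrors the behaviour described above for a call that has already completed.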

The idempotency of the specific operations contained within the function (like updating a counter) should not be relevant here. Take the example below:

@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" (non-atomic read-modify-write) update to a DynamoDB counter;
    # the three helpers below are placeholders
    counter = get_counter_value_from_dynamodb(Key=data)
    counter = increment_counter_value(counter)
    set_counter_to_new_value_in_dynamodb(Key=data, Value=counter)
    #############################################

    return {"data": counter}

In this case, the idempotency utility will not allow any of the code in this function to be run more than once within the idempotency expiry period. If you call the function a second (and third, fourth, and so on...) time after the first execution has completed, the idempotency utility will deliver the same response as the first execution, without running the function code again. Assuming there's a separate counter for each value of data in the example, there should be no possibility that the same counter is updated more than once at the same time.

Disclaimer: the example is very much a contrived one; there are better ways to do this with DynamoDB alone that don't require the idempotency utility.
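One DynamoDB-only approach, sketched here as an assumption about what "better ways" could mean, is an atomic counter: an UpdateItem call with an ADD expression, so the increment happens server-side in a single request. The table name, key attribute (`counter_id`) and counter attribute (`value`) are hypothetical. The helper only builds the request parameters, so it runs without AWS access.

```python
def build_atomic_increment(table_name, counter_id, amount=1):
    """Build keyword arguments for a low-level DynamoDB UpdateItem call.

    Table, key, and attribute names are hypothetical placeholders.
    """
    return {
        "TableName": table_name,
        "Key": {"counter_id": {"S": counter_id}},
        # ADD increments the number server-side, so there is no separate
        # read step that another writer could race against.
        "UpdateExpression": "ADD #value :amount",
        "ExpressionAttributeNames": {"#value": "value"},
        "ExpressionAttributeValues": {":amount": {"N": str(amount)}},
        "ReturnValues": "UPDATED_NEW",
    }


params = build_atomic_increment("counters", "bar")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").update_item(**params)
```

Note that an atomic increment removes the race between concurrent writers, but it is not idempotent on its own: sending the same request twice still increments twice.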

Does that answer your question?

Nsupyq

Nsupyq commented on Nov 5, 2021

@Nsupyq
Author

Thank you for your answer @cakepietoast!
But I still have a question.

If you call the function a second (and third, fourth, and so on...) time after the first execution has completed, the idempotency util will deliver the same response as the first execution, without running the function code again.

It seems that Powertools only considers retries that happen after a function has completed. I am wondering what happens if the function fails after set_counter_to_new_value_in_dynamodb(Key=data) and before return {"data": counter}. According to the documentation, in this case Powertools will not write the return value into DynamoDB, so the function can be executed a second time. But the counter was already increased in the first, failed execution, so it will be increased twice on retry. I am wondering whether Powertools can properly handle such a retry.

to-mc

to-mc commented on Nov 10, 2021

@to-mc
Contributor

I am wondering what will happen if the function fails after set_counter_to_new_value_in_dynamodb(Key=data) and before return {"data": counter}. According to the document, in this case, the powertool will not write the return value into DynamoDB and the function can be executed for the second time.

This is correct, though you do have control over this as a user of the library. If you don't want the function to be retried in its entirety, you can catch any exceptions and return a valid response from your Lambda function instead of allowing the Exception to bubble up. Example:

@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" update to dynamodb counter (placeholder helpers)
    counter = get_counter_value_from_dynamodb(Key=data)
    counter = increment_counter_value(counter)
    set_counter_to_new_value_in_dynamodb(Key=data, Value=counter)

    try:
        some_other_call_that_raises_an_exception()
    except Exception as err:
        # returning normally means the response is stored and no retry re-runs the body
        logger.error(err)
        return {"data": None, "error": str(err)}

    return {"data": counter}

This is mentioned in the handling exceptions section of the docs.

Having said that, it is a good idea to make your idempotent functions as small as you possibly can, and to keep any code that doesn't need to be idempotent outside the function. To continue with my (increasingly contrived) example from above:

def lambda_handler(event, context):
    do_some_stuff()
    # data must be passed as a keyword argument so the decorator can find it
    result = dummy("one", "two", data={"foo": "bar", "baz": "qux"})
    some_other_call_that_raises_an_exception()
    return result


@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" update to dynamodb counter (placeholder helpers)
    counter = get_counter_value_from_dynamodb(Key=data["foo"])
    counter = increment_counter_value(counter)
    set_counter_to_new_value_in_dynamodb(Key=data["foo"], Value=counter)
    return {"data": counter}

In this case, the code that can cause an exception, but is unrelated to the code that needs to be idempotent, is outside of the idempotent function. Now, when an exception is raised, it is raised outside the context of the idempotent function and does not cause the idempotency record to be deleted. I can see that the exception handling part of the documentation needs updating to reflect this. It was written before we implemented the idempotent_function decorator and doesn't account for it. I'll make these changes in PR #808 to clarify.
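The difference between the two layouts can be simulated with a toy decorator. This is a sketch under assumptions, not Powertools itself: `toy_idempotent`, `run_with_retry` and `flaky_step` are hypothetical helpers, the store is an in-memory dict, and "deleting the record on an exception" is modelled by simply never saving one.

```python
import functools


def toy_idempotent(store):
    """Toy model: save the result on success, keep no record on an exception."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*, data):
            key = repr(sorted(data.items()))
            if key in store:
                return store[key]
            result = func(data=data)  # an exception here leaves no record behind
            store[key] = result
            return result
        return wrapper
    return decorator


def run_with_retry(handler, event, attempts=2):
    """Call the handler again after a failure, as an automatic retry would."""
    for _ in range(attempts):
        try:
            return handler(event)
        except RuntimeError:
            continue
    return None


def flaky_step(state):
    """Fail on the first call only, to simulate a transient error."""
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("transient failure")


# Variant A: the failing call sits inside the idempotent function.
store_a, counter_a, state_a = {}, {"value": 0}, {"calls": 0}


@toy_idempotent(store_a)
def update_inside(*, data):
    counter_a["value"] += 1
    flaky_step(state_a)  # raises after the counter was already updated
    return {"data": counter_a["value"]}


# Variant B: the failing call sits outside the idempotent function.
store_b, counter_b, state_b = {}, {"value": 0}, {"calls": 0}


@toy_idempotent(store_b)
def update_only(*, data):
    counter_b["value"] += 1
    return {"data": counter_b["value"]}


def handler_b(event):
    result = update_only(data=event)
    flaky_step(state_b)  # raises, but the stored result is untouched
    return result


result_a = run_with_retry(lambda event: update_inside(data=event), {"foo": "bar"})
result_b = run_with_retry(handler_b, {"foo": "bar"})
```

In this simulation variant A increments its counter on both attempts, while variant B increments once and replays the stored response on the retry.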

Nsupyq

Nsupyq commented on Nov 11, 2021

@Nsupyq
Author

Thank you for your detailed explanation @cakepietoast !
I still have a question. I think that a runtime exception is not the only thing that can trigger a failure and retry. Other events, such as a system crash or a hardware fault, can also cause the function to fail, and the function cannot catch those.

def dummy(arg_one, arg_two, data: dict, **kwargs):
    counter = get_counter_value_from_dynamodb(Key=data["foo"])
    counter = increment_counter_value(counter)
    set_counter_to_new_value_in_dynamodb(Key=data["foo"], Value=counter)
    return {"data": counter}

For example, when the machine running the function crashes after executing set_counter_to_new_value_in_dynamodb and before writing the function result into DynamoDB, the function will be retried and will increase the counter again.

I am wondering how Powertools addresses this kind of failure.

to-mc

to-mc commented on Nov 11, 2021

@to-mc
Contributor

It is important to remember that Powertools is "just" a library that executes within the scope of your Lambda function. It "wraps" your decorated Python function, injecting its idempotency logic before and after your decorated Python function is executed. In the case of underlying hardware failure during execution of your decorated Python function, no more code execution can happen, including any Powertools/idempotency logic.

Specifically in the scenario you describe, when your Python function is executed, the following will happen:

  1. An INPROGRESS idempotency record is written to the persistent store to acquire a lock before any of your function code is allowed to begin executing.
  2. Your function begins executing, and successfully executes set_counter_to_new_value_in_dynamodb.
  3. The underlying hardware crashes before the function successfully returns.
  4. No more code is executed, so the idempotency logic cannot delete/update the INPROGRESS record to release the lock.
  5. On subsequent executions, it looks as if your Python function is still in progress. The idempotency utility fails to acquire the lock and raises an IdempotencyAlreadyInProgressError rather than executing your function code.
  6. The idempotency record expires after the period you configured, and the function becomes retryable again.

As a side note: you can replace step 3. above with "the Lambda Function times out" as the behaviour there is the same.
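The six steps above can be walked through with a small simulation. This is a toy model under stated assumptions, not the Powertools implementation: `call_idempotent`, `AlreadyInProgress` and `SimulatedCrash` are hypothetical names, the store is a dict, and time is passed in explicitly as `now` so that expiry can be simulated.

```python
class AlreadyInProgress(Exception):
    """Toy counterpart of an 'already in progress' error."""


class SimulatedCrash(BaseException):
    """Stands in for a hardware crash or timeout: no cleanup code runs."""


def call_idempotent(store, key, func, now, ttl):
    """Toy model of the locking steps; `now` is a caller-supplied clock value."""
    record = store.get(key)
    if record is not None and record["expires_at"] > now:
        if record["status"] == "INPROGRESS":
            raise AlreadyInProgress(key)  # step 5: the lock is still held
        return record["result"]           # completed earlier: replay
    # Step 1: take the lock before any user code runs.
    store[key] = {"status": "INPROGRESS", "expires_at": now + ttl, "result": None}
    try:
        result = func()
    except Exception:
        del store[key]  # ordinary exception: release the lock for a retry
        raise
    # A SimulatedCrash skips the handler above and the update below,
    # so the record stays INPROGRESS until it expires (steps 3 and 4).
    store[key] = {"status": "COMPLETED", "expires_at": now + ttl, "result": result}
    return result


store, counter, events = {}, {"value": 0}, []


def increment(crash):
    counter["value"] += 1  # the side effect lands before the crash
    if crash:
        raise SimulatedCrash()
    return {"data": counter["value"]}


try:
    call_idempotent(store, "foo", lambda: increment(crash=True), now=0, ttl=60)
except SimulatedCrash:
    events.append("crashed")

try:
    call_idempotent(store, "foo", lambda: increment(crash=False), now=10, ttl=60)
except AlreadyInProgress:
    events.append("blocked")

# Step 6: after the record has expired, the function is retryable again.
result = call_idempotent(store, "foo", lambda: increment(crash=False), now=61, ttl=60)
events.append("retried")
```

In this toy model the retry after expiry runs the body again, so the counter ends at 2. The lock delays the retry, but any side effect that landed before the crash is repeated unless that operation is itself safe to repeat.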

Nsupyq

Nsupyq commented on Nov 11, 2021

@Nsupyq
Author

Ok, I see. Thank you very much!

locked and limited conversation to collaborators on Nov 12, 2021