Skip to content

Computation of idempotency hash key using unqualified name  #1330

Closed
@dispyfree

Description

@dispyfree

Expected Behaviour

The idempotency layer should be using the fully qualified name of the input function to compute the hash key.
Using unqualified names may result in two distinct functions and/or methods contending for the same set of idempotency items.

Current Behaviour

In IdempotencyHandler, line 77:

persistence_store.configure(config, self.function.__name__)

Possible Solution

persistence_store.configure(config, self.function.__qualname__)

Note that it should be checked whether __qualname__ is subject to Python module import peculiarities. In particular, whether import module and from module import funcName yield the same __qualname__.

Steps to Reproduce

class A:
    @idempotent_function
    def test(self, myParam):
        return myParam
    

class B(A):
    
    @idempotent_function
    def test(self, myParam):
        return myParam

AWS Lambda Powertools for Python version

latest

AWS Lambda function runtime

3.9

Packaging format used

PyPi

Debugging logs

No response

Activity

added
bugSomething isn't working
triagePending triage from maintainers
on Jul 21, 2022
heitorlessa

heitorlessa commented on Jul 21, 2022

@heitorlessa
Contributor

Thanks for flagging this @dispyfree - we indeed use qualified name elsewhere in the code but not in the idempotency. According to the PEP 3155, it excludes the module name so the result would be the same regardless of how it was imported.

Would you like to send a fix as a first contributor? Alternatively, we can also fix it ourselves and give you credit as always.

cc @rubenfonseca

Thank you!

self-assigned this
on Jul 21, 2022
heitorlessa

heitorlessa commented on Jul 21, 2022

@heitorlessa
Contributor

Assigning to me so I can push a fix and make a patch release tomorrow

dispyfree

dispyfree commented on Jul 21, 2022

@dispyfree
Author

That's an interesting bit @heitorlessa.
So using __qualname__ will fix it with respect to nesting inside of

  • functions
  • classes

By excluding the module, however, we can - in theory - still have two distinct functions contest for the same idempotency items. They just bear the same name, have the same parameters and sit in two modules.
Would it make sense to explicitly include __module__ as well? By definition, this is the module that the function was originally defined in, so it is not subject to module aliasing, monkey patching and other dark magic.

Even then I am not sure whether this will then cover all conceivable cases; but at least I can't think of another way to break this off the top of my head :)

Btw. avid users of AWS Powertools speaking here - we have fixed the scale-in problem in user code and I hope I'll be able to feed that back into the upstream project.

leandrodamascena

leandrodamascena commented on Jul 22, 2022

@leandrodamascena
Contributor

Hello @dispyfree ! Reading your last message, I think it makes a lot of sense to add __module__ in addition to __qualname__.

I reproduced here and if we replace __name__ with __qualname__ and don't add the function module name, we run the risk of having two functions with the same name, same parameters and that are defined in different modules. In this case, the hash will be calculated for the first caller only. So I think it's not just a theory, it's a fact.

I created 2 scenarios to simulate the two situations

1 - Replacing __name__ with __qualname__

Main code
image

Modtest
image

base.py
image

Unexpected result
Despite calling 3 different functions (with same name), the code just calculated and stored two in the DDB.
image

2 - Replacing name with __qualname__ +__module__

Main code and modtest code remain the same

base.py
image

Expected result
We have the 3 records for the 3 functions with the same name and different modules/classes.
image

I'm also not 100% sure if this covers all scenarios, but I don't see any other way this could give a problem setting the qualified name and module name.

@heitorlessa I also believe that this will generate a small load for customers who already have the idempotency records generated, because as the key will change, all records will need to be generated again. But I don't see a way to do this without any kind of "impact".

I hope this debug helps in some way to have a definition about this.

Thank you.

heitorlessa

heitorlessa commented on Jul 22, 2022

@heitorlessa
Contributor

Thank you both, that's super helpful. I agree in including the module as that will cover cases where people use the new typing.Protocol as well as traditional ABC interfaces.

@dispyfree would you like to do the honour and contribute the fix today?

We want to make a release today with this fix.

PS: @dispyfree I'm stoked you're an avid customer. I'd love to hear more from you one of these days ❤️

dispyfree

dispyfree commented on Jul 24, 2022

@dispyfree
Author

@heitorlessa I'd be pleased to contribute. Feel free to assign to me; I shall have a go at it in the coming days :)

heitorlessa

heitorlessa commented on Jul 25, 2022

@heitorlessa
Contributor

Awesome. I've had a discussion with @rubenfonseca on the impact this breaking change could have given the change of keys. We assessed that it might impact over a million functions, since it'll invalidate an hour worth of transactions (default) or more (custom TTL).

We came to the conclusion that there are two ways we could proceed:

  1. Feature toggle. We could introduce this fix behind a keyword argument that would be False by default. We document it in the user guide, add a warning to everyone about unqualified/qualified, and how to enable it. Upon v2, we could make it the default behavior.
  2. Introduce the fix in 2.0 (mid-Aug). We'll cut a v2 release following Python 3.6 Lambda runtime deprecation in Aug 17th, and given it's only ~3.5 weeks away this means it's feasible to introduce this by then. We want to minimize breaking changes as much as possible in v2 as we haven't had time to discuss plans with customers yet, but this is worth it.

If you agree, let's go with 1 for now, and we can release this sooner than v2. Otherwise, we can do with 2, and update the documentation with a warning beforehand.

Please do shout out to @rubenfonseca @leandrodamascena or myself if you get stuck with writing tests for Idempotency - it's the most complex to write and we want to move away from the stubber approach soon.

16 remaining items

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @rubenfonseca@dispyfree@heitorlessa@leandrodamascena

      Issue actions

        Computation of idempotency hash key using unqualified name · Issue #1330 · aws-powertools/powertools-lambda-python