Skip to content

Race condition in validators #5760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
6 tasks done
michael-k opened this issue Jan 22, 2018 · 11 comments
Closed
6 tasks done

Race condition in validators #5760

michael-k opened this issue Jan 22, 2018 · 11 comments

Comments

@michael-k
Copy link
Contributor

michael-k commented Jan 22, 2018

Checklist

  • I have verified that that issue exists against the master branch of Django REST framework.
  • I have searched for similar issues in both open and closed tickets and cannot find a duplicate.
  • This is not a usage question.
  • This cannot be dealt with as a third party library. (We prefer new functionality to be in the form of third party libraries where possible.)
  • I have reduced the issue to the simplest possible case.
  • I have included a failing test as a pull request. Test that validator's context is not used by another serializer #5761

Steps to reproduce

Preparation

Create these 2 models with the serializer and the view:

class Collection(Model):
    pass

class Document(Model):
    collection = models.ForeignKey(Collection, on_delete=models.CASCADE)
    uid = models.TextField()
    blob = postgres_fields.JSONField(default=dict)

    class Meta():
        unique_together = (('collection', 'uid'), )

class DocumentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Document
        fields = '__all__'
        validators = [
            UniqueTogetherValidator(
                queryset=Document.objects.all(),
                fields=('collection', 'uid')
            )
        ]

class DocumentView(viewsets.ModelViewSet):
    serializer_class = DocumentSerializer

Exploit

Create two Documents (can be in the same collection) and send update requests for both documents in parallel:

# drf-race-condition.py
import sys
from datetime import datetime, timedelta

import requests

document_ids = [1, 2]  # TODO: adjust
base_url = 'http://example.org/'  # TODO: adjust

def send_request(index):
    document_update_response = requests.patch(
        url=f'{base_url}documents/{document_ids[index]}/',
        headers=headers,
        json={'blob': ''},
    )
    if document_update_response.status_code == 200:
        print(index, document_update_response.status_code)

if __name__ == '__main__':
    index = int(sys.argv[1])
    end = datetime.now() + timedelta(seconds=5)
    while end > datetime.now():
        send_request(index=index)
#!/bin/bash

python3 drf-race-condition.py 0 &
python3 drf-race-condition.py 1 &

Expected behavior

The documents are updated without any side effects.

Actual behavior

We run into 500 status codes. (ie. IntegrityError: duplicate key value violates unique constraint "document_document_collection_id_1234_uniq"; DETAIL: Key (collection_id, uid)=(1, 'foo') already exists.)

It's even worse, when the documents are in different collections. Then one of them gets the other one's uid assigned.

The problem

Serializers are instances of rest_framework.fields.Field and the following code (inside of run_validators) is responsible for running validators:

for validator in self.validators:
if hasattr(validator, 'set_context'):
validator.set_context(self)
try:
validator(value)

validator.set_context() sets validator.instance:

def set_context(self, serializer):
"""
This hook is called by the serializer instance,
prior to the validation call being made.
"""
# Determine the existing instance, if this is an update operation.
self.instance = getattr(serializer, 'instance', None)

Then validator() uses this instance. validator is the same instance for every instance of the serializer: DocumentSerializer(instance=d).validators[0] is DocumentSerializer(instance=d2).validators[0]. If two serializers call validator.set_context(self) before any one of them calls validator(value), both use the same instance.

Changing uid

It's even worse, when the documents are in different collections. Then one of them gets the other one's uid assigned.

validator.filter_queryset() adds uid (not sure why collection did not behave as I expected) to attr, which reference the serializers validated_data. This way uid of the wrong instance ends up in serializer and serializer.save() uses it.

def filter_queryset(self, attrs, queryset):
"""
Filter the queryset to all instances matching the given attributes.
"""
# If this is an update, then any unprovided field should
# have it's value set based on the existing instance attribute.
if self.instance is not None:
for field_name in self.fields:
if field_name not in attrs:
attrs[field_name] = getattr(self.instance, field_name)

Is this only theoretical?

  • The IntegrityError happened nearly 3000 times in the last 8 weeks on our production plattform. Every time a customer sent multiple PATCH requests to our plattform.
  • The IntegrityError and the changing uid happened reliably on every execution of the script against our staging plattform.

Possible fixes

  • Pass serializer/serializer_field to __call__ and drop set_context.

    Django's validators do not expect a second argument. According to drf's documentation one “can use any of Django's existing validators“.

    This can be solved by using inspect (which has differents APIs for Python 2 and 3) or by catching the TypeError (which could come from within the call). => Both solutions have their downsides.

    This would be a breaking change!

  • Clone the validator before calling set_context.

michael-k added a commit to michael-k/django-rest-framework that referenced this issue Jan 22, 2018
michael-k added a commit to michael-k/django-rest-framework that referenced this issue Jan 22, 2018
@xordoquy
Copy link
Collaborator

Nice catch and very nice work.
Most of the time, this will be handled by the fact that field's are deepcopied, except that indeed, root's serializers don't got through it.

@carltongibson carltongibson added this to the 3.8 Release milestone Jan 22, 2018
@dennypenta
Copy link

Sorry for the noise, but recently I thought that problem of race condition is that validation was working on while it didn't really make a query. So it looks like this, correct me pls if I will wrong.

Step 1 - retrieve the value with the unique constraint for a checking.
Step 2 - insert record - this we are thinking that we checked it.

But between step 1 and step 2 another record has been inserted.

@xordoquy
Copy link
Collaborator

xordoquy commented Feb 7, 2018

@dennypenta your use case is different from the OP's.
Yours may happen and is hard to prevent although not impossible.

@michael-k
Copy link
Contributor Author

@dennypenta Here's the issue related to your case: #3876

The issue was closed with this note:

If anyone is actually hitting this in production it'd be useful to know, but either way around I don't see it as likely that we'll do anything other than allowing the 5xx response by default - if you need something else, write some view code, and catch the exceptional case.

If this happens only for one of your endpoints, you can catch the IntegrityError in the serializer or view and return 4xx instead of 5xx.

If you want to solve it in a generic way, you can write a custom exception handler. Here's an example:

from django.db.utils import IntegrityError
from rest_framework.status import HTTP_400_BAD_REQUEST
from rest_framework.views import exception_handler


def my_exception_handler(exc, context):
    if isinstance(exc, IntegrityError) and \
            'duplicate key value violates unique constraint' in str(exc):
        return Response(
            'Unique constraint violated',
            status=HTTP_400_BAD_REQUEST,
        )

    return exception_handler(exc, context)

@michael-k
Copy link
Contributor Author

Is there anything I can do to move my PR #5762 forward?

@carltongibson
Copy link
Collaborator

carltongibson commented Feb 22, 2018

@michael-k: Not yet. It's milestoned for 3.8, which means we'll resolve it one way or the other before then. Most likely it's fine and we'll take it as-is — it looks great at first glance — but I just need to sit down with it to review it properly.

@anx-ckreuzberger
Copy link
Contributor

Good job catching this issue! I guess we need some more eyes and opinions to look over the changes in PR #5762 ?

@carltongibson carltongibson removed this from the 3.8 Release milestone Apr 3, 2018
michael-k added a commit to michael-k/django-rest-framework that referenced this issue Sep 10, 2018
@tomchristie
Copy link
Member

Presumably ATOMIC_REQUESTS = True would resolve this issue, right?

@michael-k
Copy link
Contributor Author

I've tried ATOMIC_REQUESTS = True with rest-framework 3.9.0 and Django 2.1, but it did not solve the issue.

@bluetech
Copy link
Contributor

@michael-k You are running DRF in multiple threads in the same process, right? And the bug happens when Python decides to release the GIL just between the set_context and the call?

This is a really bad bug, potentially causing silent data corruption. @michael-k's PR #6172 fixes it, and makes the code cleaner and safer in the process by removing the unneeded set_context mutability. I hope it gets merged for the next release.

@carltongibson carltongibson added this to the 3.10 Release milestone May 17, 2019
michael-k added a commit to michael-k/django-rest-framework that referenced this issue May 18, 2019
@rpkilby rpkilby removed this from the 3.10 Release milestone Jul 11, 2019
@rpkilby
Copy link
Member

rpkilby commented Jul 11, 2019

Removing from the milestone, as we're tracking through the PR.

pchiquet pushed a commit to pchiquet/django-rest-framework that referenced this issue Nov 17, 2020
* Do not persist the context in validators

Fixes encode#5760

* Drop set_context() in favour of 'requires_context = True'
sigvef pushed a commit to sigvef/django-rest-framework that referenced this issue Dec 3, 2022
* Do not persist the context in validators

Fixes encode#5760

* Drop set_context() in favour of 'requires_context = True'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants