Skip to content

Avoid Django's lazy() when creating validators because it is too slow #6644

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

bluetech
Copy link
Contributor

@bluetech bluetech commented May 2, 2019

As a fix for issue #3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Avoid lazy and use a lambda to defer the evaluation instead. This uses
the fact that a Django validator is just a simple callable.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3400123 function calls (3400068 primitive calls) in 2.824 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.259    0.000    0.341    0.000 fields.py:294(__init__)
    25000    0.242    0.000    0.674    0.000 field_mapping.py:66(get_field_kwargs)
     5000    0.176    0.000    2.209    0.000 serializers.py:990(get_fields)
    25000    0.133    0.000    0.514    0.000 fields.py:737(__init__)
295003/295001    0.106    0.000    0.115    0.000 {built-in method builtins.getattr}

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2010003 function calls (1935003 primitive calls) in 1.456 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.194    0.000    0.263    0.000 fields.py:294(__init__)
80000/5000    0.151    0.000    0.923    0.000 copy.py:132(deepcopy)
    25000    0.128    0.000    0.470    0.000 fields.py:737(__init__)

As a fix for issue encode#3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Avoid lazy and use a lambda to defer the evaluation instead. This uses
the fact that a Django validator is just a simple callable.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3400123 function calls (3400068 primitive calls) in 2.824 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.259    0.000    0.341    0.000 fields.py:294(__init__)
    25000    0.242    0.000    0.674    0.000 field_mapping.py:66(get_field_kwargs)
     5000    0.176    0.000    2.209    0.000 serializers.py:990(get_fields)
    25000    0.133    0.000    0.514    0.000 fields.py:737(__init__)
295003/295001    0.106    0.000    0.115    0.000 {built-in method builtins.getattr}

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2010003 function calls (1935003 primitive calls) in 1.456 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.194    0.000    0.263    0.000 fields.py:294(__init__)
80000/5000    0.151    0.000    0.923    0.000 copy.py:132(deepcopy)
    25000    0.128    0.000    0.470    0.000 fields.py:737(__init__)
@bluetech
Copy link
Contributor Author

bluetech commented May 2, 2019

Here is the benchmark that I used: https://gist.github.com/bluetech/cdd3b3d6d9d8cfc7b1dfee3c6482679d

@@ -95,6 +95,20 @@ def pytest_configure(config):
settings.STATIC_ROOT = os.path.join(os.path.dirname(rest_framework.__file__), 'static-root')
settings.STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

# Test that serializer fields can be created before django is set-up.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this regression test in order to alert someone who looks at the weird lambdas and tries to remove them. I thought it better than to add a comment each time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be moved to tests.importable. Basically, it's a fake app that ensures parts of DRF can be imported before Django is fully setup. The tests just ensure that the various imports have occurred in the app __init__.

Copy link

@yangjian00 yangjian00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie
Copy link
Member

tomchristie commented May 20, 2019

Okay, so this looks like some really good work here. It's also a big change footprint, which means that it's risky and difficult for us to take a call on.

Can you outline exactly what you think the behavioral changes would be as a result of this change?

The lazy translations provide the ability to deliver appropriately translated versions to the use on a per-request basis. Does this implementation change anything there that might not currently be caught by our test suite? When exactly are the translations made with this change? Can you walk me through exactly what's going on here?...

self.validators.append(lambda value: MaxLengthValidator(
    self.max_length,
    message=self.error_messages['max_length'].format(max_length=self.max_length),
)(value))

And how it differs exactly from the existing case?

@bluetech
Copy link
Contributor Author

Okay, so this looks like some really good work here. It's also a big change footprint, which means that it's risky and difficult for us to take a call on.

Thanks for taking a look.

Can you outline exactly what you think the behavioral changes would be as a result of this change?

The main behavioral change is that the type of the validator in self.validators is no longer e.g. MaxLengthValidator but some anonymous lambda instead. If someone goes over some_field_instance.validators and does isinstance checks, this will break for the particular validators this PR is wrapping.

Besides that, I can't think of anything.

The lazy translations provide the ability to deliver appropriately translated versions to the use on a per-request basis. Does this implementation change anything there that might not currently be caught by our test suite? When exactly are the translations made with this change?

In the lazy version, the string is evaluated when the error is formatted (get_error_detail).

In this version, it is evaluated when the validator is called.

They both happen from the same stack frame (run_validators) so in terms of the request context, it's the same.

Can you walk me through exactly what's going on here?...

The requirement is to delay the evaluation of the error message passed to the validators.

The previous version achieved this by making the format function itself lazy, and used a little hack (the CustomValidatorMessage class) to ensure it is only evaluated in get_error_detail.

This version achieves this in the classical way, by wrapping in a lambda. This works cleanly (no need for a "force") because a Django validator is just a single-argument callable, so validator and lambda x: validator(x) are functionally equivalent, other than the type (as discussed above) and possible surprises like PR #6172 (which is compatible with this change).


Finally I should also mention that an alternative fix is available which mostly keeps the current approach:

Expand
diff --git a/rest_framework/fields.py b/rest_framework/fields.py
index aecfa330..b253c754 100644
--- a/rest_framework/fields.py
+++ b/rest_framework/fields.py
@@ -23,7 +23,7 @@ from django.utils.dateparse import (
 from django.utils.duration import duration_string
 from django.utils.encoding import is_protected_type, smart_text
 from django.utils.formats import localize_input, sanitize_separators
-from django.utils.functional import lazy
+from django.utils.text import format_lazy
 from django.utils.ipv6 import clean_ipv6_address
 from django.utils.timezone import utc
 from django.utils.translation import gettext_lazy as _
@@ -744,12 +744,11 @@ class CharField(Field):
         self.min_length = kwargs.pop('min_length', None)
         super().__init__(**kwargs)
         if self.max_length is not None:
-            message = lazy(self.error_messages['max_length'].format, str)(max_length=self.max_length)
+            message = format_lazy(self.error_messages['max_length'], max_length=self.max_length)
             self.validators.append(
                 MaxLengthValidator(self.max_length, message=message))
         if self.min_length is not None:
-            message = lazy(
-                self.error_messages['min_length'].format, str)(min_length=self.min_length)
+            message = format_lazy(self.error_messages['min_length'], min_length=self.min_length)
             self.validators.append(
                 MinLengthValidator(self.min_length, message=message))
 
@@ -910,13 +909,11 @@ class IntegerField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format, str)(max_value=self.max_value)
+            message = format_lazy(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format, str)(min_value=self.min_value)
+            message = format_lazy(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -948,15 +945,11 @@ class FloatField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = format_lazy(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format,
-                str)(min_value=self.min_value)
+            message = format_lazy(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1007,14 +1000,11 @@ class DecimalField(Field):
         super().__init__(**kwargs)
 
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = format_lazy(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format, str)(min_value=self.min_value)
+            message = format_lazy(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1352,15 +1342,11 @@ class DurationField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = format_lazy(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format,
-                str)(min_value=self.min_value)
+            message = format_lazy(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1881,8 +1867,7 @@ class ModelField(Field):
         max_length = kwargs.pop('max_length', None)
         super().__init__(**kwargs)
         if max_length is not None:
-            message = lazy(
-                self.error_messages['max_length'].format, str)(max_length=self.max_length)
+            message = format_lazy(self.error_messages['max_length'], max_length=self.max_length)
             self.validators.append(
                 MaxLengthValidator(self.max_length, message=message))

but

So I still prefer this fix.

@tomchristie
Copy link
Member

Okay, so I think we probably want to look into a PR that makes more minimal adjustments in order to tackle this.

Eg. could we implement a format_lazy ourselves which just returns an instance with a __str__ method that gets called once it's actually needed.

@bluetech
Copy link
Contributor Author

I don't think it's possible to implement just __str__, the code is free to do anything with the message, so in order not to make it very fickle, a full str proxy is needed, and then it's better to use Django's format_lazy in my opinion.

I can't think of something better than the two approaches I suggested, but maybe someone else can.

@tomchristie
Copy link
Member

I don't think it's possible to implement just str, the code is free to do anything with the message

True, but it's only ever accessed by a validator instance. Hard to see that it'd ever needs to do anything with it other than call str() on it?

@bluetech
Copy link
Contributor Author

Is this about what you had in mind? (also incorporated @rpkilby suggestion to move the test to tests/importable)

diff --git a/rest_framework/compat.py b/rest_framework/compat.py
index 1feaf9fa..4498c9aa 100644
--- a/rest_framework/compat.py
+++ b/rest_framework/compat.py
@@ -5,7 +5,6 @@ versions of Django/Python, and compatibility wrappers around optional packages.
 import sys
 
 from django.conf import settings
-from django.core import validators
 from django.views.generic import View
 
 try:
@@ -245,34 +244,5 @@ LONG_SEPARATORS = (', ', ': ')
 INDENT_SEPARATORS = (',', ': ')
 
 
-class CustomValidatorMessage:
-    """
-    We need to avoid evaluation of `lazy` translated `message` in `django.core.validators.BaseValidator.__init__`.
-    https://github.com/django/django/blob/75ed5900321d170debef4ac452b8b3cf8a1c2384/django/core/validators.py#L297
-
-    Ref: https://github.com/encode/django-rest-framework/pull/5452
-    """
-
-    def __init__(self, *args, **kwargs):
-        self.message = kwargs.pop('message', self.message)
-        super().__init__(*args, **kwargs)
-
-
-class MinValueValidator(CustomValidatorMessage, validators.MinValueValidator):
-    pass
-
-
-class MaxValueValidator(CustomValidatorMessage, validators.MaxValueValidator):
-    pass
-
-
-class MinLengthValidator(CustomValidatorMessage, validators.MinLengthValidator):
-    pass
-
-
-class MaxLengthValidator(CustomValidatorMessage, validators.MaxLengthValidator):
-    pass
-
-
 # Version Constants.
 PY36 = sys.version_info >= (3, 6)
diff --git a/rest_framework/fields.py b/rest_framework/fields.py
index 179dd25c..860ae03a 100644
--- a/rest_framework/fields.py
+++ b/rest_framework/fields.py
@@ -12,7 +12,8 @@ from django.conf import settings
 from django.core.exceptions import ObjectDoesNotExist
 from django.core.exceptions import ValidationError as DjangoValidationError
 from django.core.validators import (
-    EmailValidator, RegexValidator, URLValidator, ip_address_validators
+    EmailValidator, MaxLengthValidator, MaxValueValidator, MinLengthValidator,
+    MinValueValidator, RegexValidator, URLValidator, ip_address_validators
 )
 from django.forms import FilePathField as DjangoFilePathField
 from django.forms import ImageField as DjangoImageField
@@ -23,20 +24,17 @@ from django.utils.dateparse import (
 from django.utils.duration import duration_string
 from django.utils.encoding import is_protected_type, smart_text
 from django.utils.formats import localize_input, sanitize_separators
-from django.utils.functional import lazy
 from django.utils.ipv6 import clean_ipv6_address
 from django.utils.timezone import utc
 from django.utils.translation import gettext_lazy as _
 from pytz.exceptions import InvalidTimeError
 
 from rest_framework import ISO_8601
-from rest_framework.compat import (
-    MaxLengthValidator, MaxValueValidator, MinLengthValidator,
-    MinValueValidator, ProhibitNullCharactersValidator
-)
+from rest_framework.compat import ProhibitNullCharactersValidator
 from rest_framework.exceptions import ErrorDetail, ValidationError
 from rest_framework.settings import api_settings
 from rest_framework.utils import html, humanize_datetime, json, representation
+from rest_framework.utils.formatting import lazy_format
 
 
 class empty:
@@ -749,12 +747,11 @@ class CharField(Field):
         self.min_length = kwargs.pop('min_length', None)
         super().__init__(**kwargs)
         if self.max_length is not None:
-            message = lazy(self.error_messages['max_length'].format, str)(max_length=self.max_length)
+            message = lazy_format(self.error_messages['max_length'], max_length=self.max_length)
             self.validators.append(
                 MaxLengthValidator(self.max_length, message=message))
         if self.min_length is not None:
-            message = lazy(
-                self.error_messages['min_length'].format, str)(min_length=self.min_length)
+            message = lazy_format(self.error_messages['min_length'], min_length=self.min_length)
             self.validators.append(
                 MinLengthValidator(self.min_length, message=message))
 
@@ -915,13 +912,11 @@ class IntegerField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format, str)(max_value=self.max_value)
+            message = lazy_format(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format, str)(min_value=self.min_value)
+            message = lazy_format(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -953,15 +948,11 @@ class FloatField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = lazy_format(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format,
-                str)(min_value=self.min_value)
+            message = lazy_format(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1012,14 +1003,11 @@ class DecimalField(Field):
         super().__init__(**kwargs)
 
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = lazy_format(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format, str)(min_value=self.min_value)
+            message = lazy_format(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1357,15 +1345,11 @@ class DurationField(Field):
         self.min_value = kwargs.pop('min_value', None)
         super().__init__(**kwargs)
         if self.max_value is not None:
-            message = lazy(
-                self.error_messages['max_value'].format,
-                str)(max_value=self.max_value)
+            message = lazy_format(self.error_messages['max_value'], max_value=self.max_value)
             self.validators.append(
                 MaxValueValidator(self.max_value, message=message))
         if self.min_value is not None:
-            message = lazy(
-                self.error_messages['min_value'].format,
-                str)(min_value=self.min_value)
+            message = lazy_format(self.error_messages['min_value'], min_value=self.min_value)
             self.validators.append(
                 MinValueValidator(self.min_value, message=message))
 
@@ -1610,10 +1594,10 @@ class ListField(Field):
         super().__init__(*args, **kwargs)
         self.child.bind(field_name='', parent=self)
         if self.max_length is not None:
-            message = self.error_messages['max_length'].format(max_length=self.max_length)
+            message = lazy_format(self.error_messages['max_length'], max_length=self.max_length)
             self.validators.append(MaxLengthValidator(self.max_length, message=message))
         if self.min_length is not None:
-            message = self.error_messages['min_length'].format(min_length=self.min_length)
+            message = lazy_format(self.error_messages['min_length'], min_length=self.min_length)
             self.validators.append(MinLengthValidator(self.min_length, message=message))
 
     def get_value(self, dictionary):
@@ -1886,8 +1870,7 @@ class ModelField(Field):
         max_length = kwargs.pop('max_length', None)
         super().__init__(**kwargs)
         if max_length is not None:
-            message = lazy(
-                self.error_messages['max_length'].format, str)(max_length=self.max_length)
+            message = lazy_format(self.error_messages['max_length'], max_length=self.max_length)
             self.validators.append(
                 MaxLengthValidator(self.max_length, message=message))
 
diff --git a/rest_framework/utils/formatting.py b/rest_framework/utils/formatting.py
index 4e003f61..be4483f8 100644
--- a/rest_framework/utils/formatting.py
+++ b/rest_framework/utils/formatting.py
@@ -65,3 +65,30 @@ def markup_description(description):
         description = escape(description).replace('\n', '<br />')
         description = '<p>' + description + '</p>'
     return mark_safe(description)
+
+
+class lazy_format:
+    """
+    Delay formatting until it's actually needed.
+
+    Useful when the format string or one of the arguments is lazy.
+
+    Not using Django's lazy because it is too slow.
+    """
+    __slots__ = ('format_string', 'args', 'kwargs', 'result')
+
+    def __init__(self, format_string, *args, **kwargs):
+        self.result = None
+        self.format_string = format_string
+        self.args = args
+        self.kwargs = kwargs
+
+    def __str__(self):
+        if self.result is not None:
+            self.format_string, self.args, self.kwargs = None, None, None
+            return self.result
+        self.result = self.format_string.format(*self.args, **self.kwargs)
+        return self.result
+
+    def __mod__(self, value):
+        return str(self) % value
diff --git a/tests/importable/__init__.py b/tests/importable/__init__.py
index b3659902..ca7554ae 100644
--- a/tests/importable/__init__.py
+++ b/tests/importable/__init__.py
@@ -1 +1,15 @@
 from rest_framework import compat  # noqa
+
+
+# Test that serializer fields can be created before django is set-up.
+# See issue #3354.
+from rest_framework.serializers import (
+    CharField, DecimalField, DurationField, FloatField, IntegerField,
+    ListField,
+)
+CharField(min_length=1, max_length=2)
+IntegerField(min_value=1, max_value=2)
+FloatField(min_value=1, max_value=2)
+DecimalField(max_digits=10, decimal_places=1, min_value=1, max_value=2)
+DurationField(min_value=1, max_value=2)
+ListField(min_length=1, max_length=2)

I still prefer the lambda version, but if this is OK I can submit it in a new PR which replaces this one. It passes all the tests locally. It's slower than the lambda version in the benchmark by 10% but that's not too bad.

@rpkilby
Copy link
Member

rpkilby commented May 23, 2019

Hi @bluetech. I've created/merged #6708, which adds the regression test and the fix for ListField.

I still prefer the lambda version

The downside to the lambda version is that it's somewhat confusing. It's an anonymous function that returns a Validator instance that lazily formats its error message via a closure (right, because the validator is itself a callable?). In contrast, lazy_format is pretty straightforward to understand, and can be more easily tested in isolation.

@rpkilby rpkilby added this to the 3.10 Release milestone May 23, 2019
@rpkilby
Copy link
Member

rpkilby commented May 23, 2019

e.g.,

message = Mock()  # or is it MagicMock?
formatted = lazy_format(message, val='foo')

assert message.format.assert_not_called()

str(formatted)
str(formatted)
assert message.format.assert_called_once_with(val='foo')

@bluetech
Copy link
Contributor Author

Thanks @rpkilby, I opened #6709 with the lazy_format approach rebased on master and added a test for it.

@rpkilby rpkilby removed this from the 3.10 Release milestone May 23, 2019
@rpkilby
Copy link
Member

rpkilby commented May 23, 2019

Thanks for the update. I'm going to go ahead and close this, as I think the other approach is preferred.

@rpkilby rpkilby closed this May 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants