Skip to content

Don't call .format() on lazy-translated strings during Field.__init__ #3354

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tomchristie opened this issue Sep 1, 2015 · 5 comments · Fixed by #5452
Closed

Don't call .format() on lazy-translated strings during Field.__init__ #3354

tomchristie opened this issue Sep 1, 2015 · 5 comments · Fixed by #5452
Labels
Milestone

Comments

@tomchristie
Copy link
Member

This from IRC...

[17:04:05] <__zer01>     after adding the `min_value` kwarg to a serializer.DecimalField I get a `django.core.exceptions.AppRegistryNotReady: The translation infrastructure cannot be initialized before the apps registry is ready.` error when running tests, any ideas?
[17:07:57] <__zer01>     here's the end of the stack trace http://pastebin.com/B7ECdL6w

Which is:

File "/home/eugenio/spartan/rest/serializers.py", line 80, in MySerializer min_value=1.01)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/rest_framework/fields.py", line 906, in __init__
message = self.error_messages['min_value'].format(min_value=self.min_value)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/functional.py", line 136, in __wrapper__ res = func(*self.__args, **self.__kw)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/__init__.py", line 84, in ugettext
return _trans.ugettext(message)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/trans_real.py", line 321, in gettext
return do_translate(message, 'gettext')
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/trans_real.py", line 304, in do_translate
_default = _default or translation(settings.LANGUAGE_CODE)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/trans_real.py", line 206, in translation
_translations[language] = DjangoTranslation(language)
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/trans_real.py", line 116, in __init__
self._add_installed_apps_translations()
File "/home/eugenio/Envs/spartan/lib/python3.4/site-packages/django/utils/translation/trans_real.py", line 164, in _add_installed_apps_translations
"The translation infrastructure cannot be initialized before the "
django.core.exceptions.AppRegistryNotReady: The translation infrastructure cannot be initialized before the apps registry is ready. Check that you don't make non-lazy gettext calls at import time.

And...

[17:08:41]   yuriheupa ([email protected]) joined the channel.
[17:09:26]   yuriheupa ([email protected]) left IRC. (Remote host closed the connection)
[17:11:54] <tomchristie>     Hrm
[17:13:14] <__zer01>     I'm running djangorestframework==3.2.3 and Django==1.8.4
[17:13:28] <tomchristie>     I think that may need raising as an issue
[17:14:13] <tomchristie>     The error is being raised because we're calling .format on a lazy-translated error message
[17:14:34] <tomchristie>     Due to it being passed a min_value which is included in the message
[17:15:00] <tomchristie>     Prob is, that causes the translation machinary to attempt to run, and produce the translated output
[17:15:12] <tomchristie>     But that can't happen yet because the app isn't setup and running
[17:15:23] <tomchristie>     Looks like Django internally handles this a little diff, eg
[17:15:34] <tomchristie>     This… https://github.com/django/django/blob/master/django/forms/fields.py#L258
[17:15:43] <tomchristie>     vs
[17:16:09] <tomchristie>     This: https://github.com/tomchristie/django-rest-framework/blob/master/rest_framework/fields.py#L820-L821
[17:16:23] <tomchristie>     Notive that we're calling .format when the field is instantiated
[17:16:31] <tomchristie>     But same is not happening in Django
[17:16:37] <tomchristie>     You may be able to resolve
[17:16:53] <tomchristie>     by ensuring that you're not importing the serializers prior to the app setup
[17:16:54]   marc_v92 (~marc_v92@unaffiliated/marc-v92/x-6202358) joined the channel.
[17:17:07] <tomchristie>     (Eg if you import them in your settings I guess that's when it blows up)
@tomchristie tomchristie added the Bug label Sep 1, 2015
@nryoung
Copy link
Contributor

nryoung commented Sep 11, 2015

Really all that needs to be changed here is that these particular lines

message = self.error_messages['max_value'].format(max_value=self.max_value)

need to be removed?

Doing this and running the tests does not yield any test failures, but then again I am not sure where else message is used.

@nryoung
Copy link
Contributor

nryoung commented Sep 15, 2015

ping @tomchristie or @kevin-brown for the above. If this is the right approach and I have already made the changes and could push them out. Let me know when you get a chance 😄

@xordoquy
Copy link
Collaborator

@nryoung we may lack a test for that particular string. It looks like the omitting the format will result in the max_value not being correctly filled whenever an error occurs.

@nryoung
Copy link
Contributor

nryoung commented Sep 15, 2015

@nryoung we may lack a test for that particular string. It looks like the omitting the format will result in the max_value not being correctly filled whenever an error occurs.

This does seem to be the case. It seems like we could use the message class attribute of Django's MaxValueValidator since it is a simple inheritance of this class: https://github.com/django/django/blob/master/django/core/validators.py#L318

@nryoung
Copy link
Contributor

nryoung commented Sep 16, 2015

I have been doing some research on this.

For Django versions >= 1.8 message is optional in the __init__ method:

but, message is also defaulted as a class attribute here:

For Django versions < 1.8 message is not expected:

thus, that is why it is popped off here:

but, message is also defaulted here:

I believe, that removing the message assignments here and other similar ones:

and not passing message in to the instantiation of the validator, it should default to the Django class attribute:

I will make a PR for what I outline above. As always, let me know if I am wrong here.

nryoung added a commit to nryoung/django-rest-framework that referenced this issue Sep 16, 2015
nryoung added a commit to nryoung/django-rest-framework that referenced this issue Sep 16, 2015
nryoung added a commit to nryoung/django-rest-framework that referenced this issue Sep 20, 2015
 encode#3354

- Update tests to reflect new error messages provided by Django field
  parent classes
@xordoquy xordoquy added this to the 3.2.5 Release milestone Sep 23, 2015
xordoquy pushed a commit to linovia/django-rest-framework that referenced this issue Sep 23, 2015
 encode#3354

- Update tests to reflect new error messages provided by Django field
  parent classes
@xordoquy xordoquy modified the milestones: 3.3.2 Release, 3.3.3 Release Dec 14, 2015
tomaszgy added a commit to open-craft/edx-platform that referenced this issue Jan 29, 2018
to remove problem from encode/django-rest-framework#3354.

Signed-off-by: Tomasz Gargas <[email protected]>
tomaszgy added a commit to edx/edx-solutions-edx-platform that referenced this issue Feb 5, 2018
to branch with fixes to encode/django-rest-framework#3354.

Signed-off-by: Tomasz Gargas <[email protected]>
bluetech added a commit to bluetech/django-rest-framework that referenced this issue May 2, 2019
As a fix for issue encode#3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Avoid lazy and use a lambda to defer the evaluation instead. This uses
the fact that a Django validator is just a simple callable.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3400123 function calls (3400068 primitive calls) in 2.824 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.259    0.000    0.341    0.000 fields.py:294(__init__)
    25000    0.242    0.000    0.674    0.000 field_mapping.py:66(get_field_kwargs)
     5000    0.176    0.000    2.209    0.000 serializers.py:990(get_fields)
    25000    0.133    0.000    0.514    0.000 fields.py:737(__init__)
295003/295001    0.106    0.000    0.115    0.000 {built-in method builtins.getattr}

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2010003 function calls (1935003 primitive calls) in 1.456 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.194    0.000    0.263    0.000 fields.py:294(__init__)
80000/5000    0.151    0.000    0.923    0.000 copy.py:132(deepcopy)
    25000    0.128    0.000    0.470    0.000 fields.py:737(__init__)
bluetech added a commit to bluetech/django-rest-framework that referenced this issue May 23, 2019
As a fix for issue encode#3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Use a custom lazy_format() object instead which does the formatting
lazily with less overhead.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3265096 function calls (3240059 primitive calls) in 2.645 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.237    0.000    0.315    0.000 fields.py:295(__init__)
    25000    0.218    0.000    0.639    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.214    0.000    0.665    0.000 fields.py:743(__init__)
     5000    0.156    0.000    2.086    0.000 serializers.py:988(get_fields)
    25000    0.107    0.000    0.210    0.000 functional.py:234(wrapper)

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2150003 function calls (2050003 primitive calls) in 1.609 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.224    0.000    0.293    0.000 fields.py:295(__init__)
    25000    0.181    0.000    0.607    0.000 fields.py:743(__init__)
80000/5000    0.151    0.000    1.074    0.000 copy.py:132(deepcopy)
    25000    0.102    0.000    0.819    0.000 fields.py:607(__deepcopy__)
bluetech added a commit to bluetech/django-rest-framework that referenced this issue May 23, 2019
As a fix for issue encode#3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Use a custom lazy_format() object instead which does the formatting
lazily with less overhead.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3265096 function calls (3240059 primitive calls) in 2.645 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.237    0.000    0.315    0.000 fields.py:295(__init__)
    25000    0.218    0.000    0.639    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.214    0.000    0.665    0.000 fields.py:743(__init__)
     5000    0.156    0.000    2.086    0.000 serializers.py:988(get_fields)
    25000    0.107    0.000    0.210    0.000 functional.py:234(wrapper)

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2150003 function calls (2050003 primitive calls) in 1.609 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.224    0.000    0.293    0.000 fields.py:295(__init__)
    25000    0.181    0.000    0.607    0.000 fields.py:743(__init__)
80000/5000    0.151    0.000    1.074    0.000 copy.py:132(deepcopy)
    25000    0.102    0.000    0.819    0.000 fields.py:607(__deepcopy__)
bluetech added a commit to bluetech/django-rest-framework that referenced this issue May 24, 2019
As a fix for issue encode#3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.

Use a custom lazy_format() object instead which does the formatting
lazily with less overhead.

Using the benchmark attached to the PR (snipped to tottime>100ms):

Before, model serializer:

         9225123 function calls (9200068 primitive calls) in 8.337 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.299    0.000    3.534    0.000 functional.py:125(__prepare_class__)
  2415001    0.954    0.000    0.954    0.000 {built-in method builtins.hasattr}
  1350000    0.901    0.000    0.901    0.000 functional.py:145(__promise__)
  1550001    0.521    0.000    0.521    0.000 {built-in method builtins.setattr}
    25000    0.494    0.000    0.735    0.000 {built-in method builtins.__build_class__}
    30000    0.298    0.000    0.385    0.000 fields.py:297(__init__)
    25000    0.289    0.000    5.895    0.000 fields.py:740(__init__)
    25000    0.264    0.000    0.802    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.241    0.000    0.241    0.000 functional.py:100(__proxy__)
670003/670001    0.203    0.000    0.211    0.000 {built-in method builtins.getattr}
     5000    0.189    0.000    7.722    0.002 serializers.py:990(get_fields)
    25000    0.186    0.000    0.400    0.000 functools.py:186(total_ordering)
    25000    0.158    0.000    0.299    0.000 functional.py:234(wrapper)
     5000    0.129    0.000    0.136    0.000 serializers.py:1066(get_field_names)
    25000    0.104    0.000    1.002    0.000 serializers.py:1195(build_standard_field)

After, model serializer:

         3265096 function calls (3240059 primitive calls) in 2.645 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.237    0.000    0.315    0.000 fields.py:295(__init__)
    25000    0.218    0.000    0.639    0.000 field_mapping.py:66(get_field_kwargs)
    25000    0.214    0.000    0.665    0.000 fields.py:743(__init__)
     5000    0.156    0.000    2.086    0.000 serializers.py:988(get_fields)
    25000    0.107    0.000    0.210    0.000 functional.py:234(wrapper)

Before, regular serializer:

         8060003 function calls (7960003 primitive calls) in 7.123 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    25000    1.569    0.000    3.897    0.000 functional.py:125(__prepare_class__)
  1350000    1.013    0.000    1.013    0.000 functional.py:145(__promise__)
  2365000    0.925    0.000    0.925    0.000 {built-in method builtins.hasattr}
  1550000    0.512    0.000    0.512    0.000 {built-in method builtins.setattr}
    25000    0.378    0.000    0.550    0.000 {built-in method builtins.__build_class__}
    25000    0.307    0.000    5.946    0.000 fields.py:740(__init__)
    30000    0.277    0.000    0.360    0.000 fields.py:297(__init__)
80000/5000    0.202    0.000    6.526    0.001 copy.py:132(deepcopy)
    25000    0.172    0.000    0.172    0.000 functional.py:100(__proxy__)
   540000    0.152    0.000    0.152    0.000 {built-in method builtins.getattr}
    25000    0.119    0.000    6.199    0.000 fields.py:604(__deepcopy__)

After, regular serializer:

         2150003 function calls (2050003 primitive calls) in 1.609 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30000    0.224    0.000    0.293    0.000 fields.py:295(__init__)
    25000    0.181    0.000    0.607    0.000 fields.py:743(__init__)
80000/5000    0.151    0.000    1.074    0.000 copy.py:132(deepcopy)
    25000    0.102    0.000    0.819    0.000 fields.py:607(__deepcopy__)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment