-
-
Notifications
You must be signed in to change notification settings - Fork 6.9k
Improve performance of lazy validation message formatting #6709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
For future reference, same benchmark was used: https://gist.github.com/bluetech/cdd3b3d6d9d8cfc7b1dfee3c6482679d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this approach. Aside from the suggestion, I think this is good to go.
As a fix for issue encode#3354, commit 607e4ed made the evaluation of some validation error messages lazy. To achieve that, Django's django.utils.functional.lazy() function was used. However, that function is extremely heavy and slow, and slows down string validation significantly (lazy() is evaluated each time for each validator for each field which has one). We noticed this in our production system. Use a custom lazy_format() object instead which does the formatting lazily with less overhead. Using the benchmark attached to the PR (snipped to tottime>100ms): Before, model serializer: 9225123 function calls (9200068 primitive calls) in 8.337 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 25000 1.299 0.000 3.534 0.000 functional.py:125(__prepare_class__) 2415001 0.954 0.000 0.954 0.000 {built-in method builtins.hasattr} 1350000 0.901 0.000 0.901 0.000 functional.py:145(__promise__) 1550001 0.521 0.000 0.521 0.000 {built-in method builtins.setattr} 25000 0.494 0.000 0.735 0.000 {built-in method builtins.__build_class__} 30000 0.298 0.000 0.385 0.000 fields.py:297(__init__) 25000 0.289 0.000 5.895 0.000 fields.py:740(__init__) 25000 0.264 0.000 0.802 0.000 field_mapping.py:66(get_field_kwargs) 25000 0.241 0.000 0.241 0.000 functional.py:100(__proxy__) 670003/670001 0.203 0.000 0.211 0.000 {built-in method builtins.getattr} 5000 0.189 0.000 7.722 0.002 serializers.py:990(get_fields) 25000 0.186 0.000 0.400 0.000 functools.py:186(total_ordering) 25000 0.158 0.000 0.299 0.000 functional.py:234(wrapper) 5000 0.129 0.000 0.136 0.000 serializers.py:1066(get_field_names) 25000 0.104 0.000 1.002 0.000 serializers.py:1195(build_standard_field) After, model serializer: 3265096 function calls (3240059 primitive calls) in 2.645 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 30000 0.237 0.000 0.315 0.000 fields.py:295(__init__) 25000 0.218 0.000 0.639 0.000 field_mapping.py:66(get_field_kwargs) 25000 0.214 0.000 0.665 0.000 fields.py:743(__init__) 5000 0.156 0.000 2.086 0.000 serializers.py:988(get_fields) 25000 0.107 0.000 0.210 0.000 functional.py:234(wrapper) Before, regular serializer: 8060003 function calls (7960003 primitive calls) in 7.123 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 25000 1.569 0.000 3.897 0.000 functional.py:125(__prepare_class__) 1350000 1.013 0.000 1.013 0.000 functional.py:145(__promise__) 2365000 0.925 0.000 0.925 0.000 {built-in method builtins.hasattr} 1550000 0.512 0.000 0.512 0.000 {built-in method builtins.setattr} 25000 0.378 0.000 0.550 0.000 {built-in method builtins.__build_class__} 25000 0.307 0.000 5.946 0.000 fields.py:740(__init__) 30000 0.277 0.000 0.360 0.000 fields.py:297(__init__) 80000/5000 0.202 0.000 6.526 0.001 copy.py:132(deepcopy) 25000 0.172 0.000 0.172 0.000 functional.py:100(__proxy__) 540000 0.152 0.000 0.152 0.000 {built-in method builtins.getattr} 25000 0.119 0.000 6.199 0.000 fields.py:604(__deepcopy__) After, regular serializer: 2150003 function calls (2050003 primitive calls) in 1.609 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 30000 0.224 0.000 0.293 0.000 fields.py:295(__init__) 25000 0.181 0.000 0.607 0.000 fields.py:743(__init__) 80000/5000 0.151 0.000 1.074 0.000 copy.py:132(deepcopy) 25000 0.102 0.000 0.819 0.000 fields.py:607(__deepcopy__)
@rpkilby Yeah same. I'm going to leave the final okay with @carltongibson since he's managing the 3.10 release, but I'm happy. Nice one, @bluetech! |
Argh! Don't do that. 🙂 I'm quietly getting knocked sideways by life over here, so any help getting the 3.10 issues in is very welcome. I've got the schema docs to finish, which I'm on, but beyond that I keep seeing the number of things to pull in, vs the time I'm able to make for it, and I'm starting to panic a little. (Was going to say that in private, but TBH, given how OSS works, I think these things are better off said.) Quick look at this: it's seems fine, but I haven't looked deeply. Happy to trust @rpkilby's review. |
Just wanted to say that this fix improved the speed of some deeply nested serializers in our project by 60%! ❤️ Huge real world improvement in DRF 3.10. Thank you all! |
Backport to 3.9.x branch
Replaces #6644.
As a fix for issue #3354, commit 607e4ed made the
evaluation of some validation error messages lazy. To achieve that,
Django's django.utils.functional.lazy() function was used. However, that
function is extremely heavy and slow, and slows down string validation
significantly (lazy() is evaluated each time for each validator for each
field which has one). We noticed this in our production system.
Use a custom lazy_format() object instead which does the formatting
lazily with less overhead.
Using the benchmark attached to the PR (snipped to tottime>100ms):
Before, model serializer:
After, model serializer:
Before, regular serializer:
After, regular serializer: