gh-127750: Fix and optimize functools.singledispatchmethod() #130008
Conversation
Remove the broken `singledispatchmethod` caching introduced in gh-85160. Achieve the same performance using a different optimization.
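For context, `functools.singledispatchmethod` dispatches a method on the type of its first non-`self` argument; the descriptor's `__get__` is what this PR optimizes. A minimal usage example (standard library behavior, shown here only for orientation):

```python
from functools import singledispatchmethod

class Negator:
    @singledispatchmethod
    def neg(self, arg):
        raise NotImplementedError(f"Cannot negate {type(arg).__name__}")

    @neg.register
    def _(self, arg: int):
        return -arg

    @neg.register
    def _(self, arg: bool):
        return not arg

n = Negator()
print(n.neg(5))     # -5 (int implementation)
print(n.neg(True))  # False (bool implementation wins over int)
```

Every access to `n.neg` goes through the descriptor protocol, which is where the overhead discussed below comes from.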
Lib/functools.py (outdated)

```python
@property
def __isabstractmethod__(self):
    return getattr(self.func, '__isabstractmethod__', False)

def __module__(self):
```
---
This is giving me some problems in my interactive console. A minimal reproducer:

```python
from functools import singledispatchmethod
from IPython.lib.pretty import pretty

class A:
    def __init__(self, value):
        self.value = value

    @singledispatchmethod
    def dp(self, x):
        return id(self)

a = obj = A(4)
pretty(a.dp)  # fails
```

The failure is because `_singledispatchmethod_get.__module__` is now a property, not a string.
---
Yes, there is a problem with type attributes that we want to define for instances (`__name__`, `__qualname__`, `__doc__`). If we define a property, it conflicts with the type attribute.
---
This can be resolved by defining `__getattribute__` instead of `__getattr__`, but that adds such a large overhead that it outweighs the optimization gain.

So I added setting just two instance attributes, `__module__` and `__doc__`, in the constructor. This adds some overhead, but not as much as the original code. I hope a more satisfying solution will be found in the future, but this is a complex issue.
---
This fixes the bugs and the performance is good. But there are some more corner cases I am worried about. For example:

```python
from functools import *

class A:
    @singledispatchmethod
    def dp(self, x):
        return x

a = A()
print(repr(a.dp))
print(str(a.dp))
```

Results in:

```
<functools._singledispatchmethod_get object at 0x0000014DCB8A3B60>
<functools._singledispatchmethod_get object at 0x0000014DCB8A4E10>
```

On main (and on 3.12) it is:

```
<function A.dp at 0x00000182896CA200>
<function A.dp at 0x00000182896CA200>
```
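One reason this is hard to fix with attribute forwarding alone (a general Python behavior, not something stated in this thread): `repr()` and `str()` look up `__repr__`/`__str__` on the *type*, so a `__getattr__`-based wrapper never sees them. A minimal sketch with a hypothetical `Proxy` class:

```python
# Hypothetical proxy illustrating why repr() cannot be forwarded via
# __getattr__: special methods are looked up on the type, not the instance.

class Proxy:
    def __init__(self, func):
        self.func = func

    def __getattr__(self, name):
        # Forwards ordinary attribute access to the wrapped function.
        return getattr(self.func, name)

def f():
    pass

p = Proxy(f)
print(p.__name__)  # forwarded: 'f'
print(repr(p))     # NOT forwarded: default object repr of the Proxy
```

Fixing `repr()` would therefore require defining `__repr__` on the wrapper type itself, with the attendant maintenance cost for every special method.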
---

```python
if name not in {'__name__', '__qualname__', '__isabstractmethod__',
                '__annotations__', '__type_params__'}:
```
---
Note: this set is `functools.WRAPPER_ASSIGNMENTS` minus `__doc__` and `__module__`, which have special handling. Fine to leave it this way.
---
Plus `__isabstractmethod__`.

Yes, I am aware of this. Actually, I also wrote code and tests for …

---
@serhiy-storchaka I think this has caused a regression in Sphinx tests (sphinx-doc/sphinx#13360), though the onus may be on Sphinx to fix this -- noting for visibility though.

---
There is a bug here. It also affects pydoc, …

---
I have found the cause, but it is late here; I'll fix this tomorrow.

---
Thanks!

---
So caching was abandoned altogether? With this, the call of … Or am I missing something?

---
Yes, we abandoned caching because of several issues with it (we tried several options; they all have some issues). However, with this PR the performance is almost as good as (or sometimes better than) the caching approaches. The cost came only partly from creating a new instance; it was mostly from attribute lookups when updating attributes on the newly created instance. Do you have any examples where performance is still an issue?

---
I find it a bit hard to believe that the performance of this is good. I have no personal interest in this; I'm just thinking in general. I don't think this sort of application is very common, but here is an example:

```python
from functools import singledispatchmethod, singledispatch

@singledispatch
def flatten_js(obj, parent_key=None):
    yield obj if parent_key is None else (parent_key, obj)

@flatten_js.register
def _(obj: dict, parent_key=None):
    for k, v in obj.items():
        new_key = (k,) if parent_key is None else parent_key + (k,)
        yield from flatten_js(v, new_key)

@flatten_js.register
def _(obj: list, parent_key=None):
    for k, v in enumerate(obj):
        new_key = (k,) if parent_key is None else parent_key + (k,)
        yield from flatten_js(v, new_key)

class A:
    @singledispatchmethod
    def flatten_js(self, obj, parent_key=None):
        yield obj if parent_key is None else (parent_key, obj)

    @flatten_js.register
    def _(self, obj: dict, parent_key=None):
        for k, v in obj.items():
            new_key = (k,) if parent_key is None else parent_key + (k,)
            yield from self.flatten_js(v, new_key)

    @flatten_js.register
    def _(self, obj: list, parent_key=None):
        for k, v in enumerate(obj):
            new_key = (k,) if parent_key is None else parent_key + (k,)
            yield from self.flatten_js(v, new_key)

a = A()
```

```shell
./python.exe -m timeit -s $S "list(a.flatten_js({'a': [1, 2, 3, [4]]}))"  # 10 µs
./python.exe -m timeit -s $S "list(flatten_js({'a': [1, 2, 3, [4]]}))"    # 5 µs
```

So for applications of low complexity such as this, half of the time spent is overhead from constructing …
Yes, `singledispatchmethod` has an overhead, and caching does not solve it. If you have another solution, please open a new issue.
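One practical mitigation worth noting (my suggestion, not from the thread): since the per-call overhead comes from the descriptor's `__get__` running on every attribute access, callers in hot loops can hoist the bound dispatcher out of the loop:

```python
from functools import singledispatchmethod

class A:
    @singledispatchmethod
    def dp(self, x):
        return 'object'

    @dp.register
    def _(self, x: int):
        return 'int'

a = A()

# a.dp runs singledispatchmethod.__get__ once here, not once per call:
dp = a.dp
results = [dp(v) for v in (1, 'two', 3)]
print(results)  # ['int', 'object', 'int']
```

This keeps dispatch on the argument type while paying the descriptor cost only once.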
---

With this:

```python
import functools

class singledispatchmethod(functools.singledispatchmethod):
    def __set_name__(self, obj, name):
        self.attrname = name

    def __get__(self, obj, cls=None):
        cache = obj.__dict__
        try:
            return cache[self.attrname]
        except KeyError:
            def method(*args, **kwargs):
                return self.dispatcher.dispatch(args[0].__class__).__get__(obj, cls)(*args, **kwargs)
            cache[self.attrname] = method
            return method
```

```shell
./python.exe -m timeit -s $S "list(a.flatten_js({'a': [1, 2, 3, [4]]}))"  # 7 µs
```

So caching, at least for this case, results in 60% lower overhead.
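One reason instance-`__dict__` caching like the sketch above is fragile (a general Python constraint, not something stated in this thread): instances of classes that define `__slots__` have no `__dict__` to cache into, so the `__get__` above would fail for them. A minimal demonstration:

```python
# Instances of __slots__ classes have no per-instance __dict__,
# so caching a bound method there is impossible.
class Slotted:
    __slots__ = ('value',)

    def __init__(self, value):
        self.value = value

s = Slotted(1)
print(hasattr(s, '__dict__'))  # False
```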
---

In terms of performance, caching as per #128648 (comment) had benefits. I have nothing else. If anyone needs the performance benefit of caching, the approach of …