Skip to content

Mitigating performance regressions in code that explicitly use __dict__ #93350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
markshannon opened this issue May 30, 2022 · 7 comments
Closed
Labels
3.12 only security fixes performance Performance or resource usage

Comments

@markshannon
Copy link
Member

Python 3.11 creates dictionaries lazily.
In general this is a big win. The rare code that explicitly uses __dict__ slows down, but the speedups and memory saving for the vast majority of code make that worthwhile.

The problem is code that is written to use __dict__ to improve performance. For example, functools.cached_property lazily fills in the computed attribute. To do so, it needs to look into the instance dictionary, bypassing usual attribute lookup.

In order to make this perform well for 3.11 we need a way to look at the instance attributes, without creating the dict.
In other words, we need something equivalent to getattr(obj.__dict__, name, default) that does not create the dict.

If we were to add sys._get_instance_attr(obj, name, default) that is defined as being equivalent to

def _get_instance_attr(obj, name, default):
      return getattr(obj.__dict__, name, default)

then we could use that in functools.cached_property (and maybe other places) to prevent the performance regression in 3.11

This is arguably a new feature, and I want it in 3.11, so this might not be an acceptable change. @pablogsal?

@markshannon markshannon added the type-bug An unexpected behavior, bug, or error label May 30, 2022
@pablogsal
Copy link
Member

pablogsal commented May 30, 2022

Hummmmmm. Would this API be user facing? I can see that's underscore prefixed, but I wonder if we expect users or library authors to use it. If the answer is yes, then I think we should discuss it before in detail because it is a bit weird to have an API that circumvents the places when an optimization fails.

If it is just for internal usage and everyone thinks it is a good idea, I could consider it as long as it is not a lot of code and is very localized. Otherwise, it is a bit late in the game (today we need to release beta 2) for something big or even middle-sized.

@tiran
Copy link
Member

tiran commented May 30, 2022

IMHO it would make more sense to add the API to the operator / _operator modules. The _operator module is a built-in module that is always present.

@markshannon
Copy link
Member Author

It is user facing in the sense that it is visible and can be used but, much like sys._clear_type_cache(), I wouldn't it be used outside of a few specialized libraries.

I don't have much of an opinion on the name. The sys module seemed like the natural place for it, but I've no objection to the _operator module.

@AA-Turner AA-Turner added performance Performance or resource usage 3.11 only security fixes 3.12 only security fixes labels May 30, 2022
@stevendaprano
Copy link
Member

I think that it is naive to expect that it wouldn't be used except for "a few specialized libraries". As soon as people hear that this "improves performance", they will use it, leading underscore or not, whether it does or doesn't actually improve performance for them.

So might as well take the time to do it right and make it officially public, not rush it into production. (That won't necessarily preclude it getting into 3.11, the scope is not very big and I presume @markshannon already has an implementation, or soon will have.)

Would we want to update inspect.getattr_static to take advantage of this too?

I agree with @tiran that operator is a better place for it.

As for the name of the function itself, there are already many variations on the "get this" naming pattern. Perhaps too many. I prefer a name like "peekattr", it gives the sense of taking a quick peek which avoids running the full getattr machinery.

@markshannon
Copy link
Member Author

I don't think we need this for 3.11, after all.

Although, @cached_property does force the dictionary to be created, so does other code, and we already handle this case reasonably well.
Performance isn't as good as for objects without a materialized dict, but it's no worse than 3.10.
(LOAD_ATTR_WITH_HINT handles attribute loads in this case)

@markshannon markshannon removed 3.11 only security fixes type-bug An unexpected behavior, bug, or error labels May 31, 2022
@kumaraditya303
Copy link
Contributor

Perhaps, the pyperformance benchmark suite should have a benchmark which explicitly uses __dict__ as there are cases where it is used for performance and to measure regressions.

@markshannon
Copy link
Member Author

We have a different approach now. See #106485

@markshannon markshannon closed this as not planned Won't fix, can't repro, duplicate, stale Jul 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 only security fixes performance Performance or resource usage
Projects
None yet
Development

No branches or pull requests

6 participants