JIT: improve memory allocation #119730
Comments
This seems to be the Discourse discussion.
@terryjreedy that Discourse discussion is more relevant to issue #118467: @tonybaloney did an initial implementation to dump the JIT code of an executor, and that discussion proposes dumping the JIT code associated with micro-ops. This issue instead targets how the JIT allocates memory at runtime. At the moment every object is allocated on a fresh page, so there is a lot of padding for page alignment.
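To make the cost of one-page-per-object allocation concrete, here is a minimal sketch (not CPython's actual allocator) comparing the footprint of giving each JIT trace its own page versus packing traces back-to-back and padding only the final page. The 16 KiB page size and the trace sizes are illustrative assumptions.

```python
# Hypothetical sketch, not CPython's implementation: compare one page
# per trace versus packing traces and padding only the tail.
PAGE_SIZE = 16 * 1024  # 16 KiB pages, as on Apple Silicon macOS

def round_up_to_page(nbytes: int) -> int:
    """Round an allocation size up to the next page boundary."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE

def one_page_per_trace(trace_sizes):
    """Current scheme: every trace starts on a fresh page."""
    return sum(round_up_to_page(size) for size in trace_sizes)

def packed(trace_sizes):
    """Packed scheme: traces share pages; only the tail is padded."""
    return round_up_to_page(sum(trace_sizes))

# Example: ten small traces of ~1 KiB each (illustrative sizes only).
sizes = [1024] * 10
print(one_page_per_trace(sizes))  # 163840: ten full pages
print(packed(sizes))              # 16384: everything fits in one page
```

A real packed allocator would still need per-trace alignment (e.g. to 16 bytes) and per-page permission management, but the order-of-magnitude saving for small traces is the point here.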
Thanks for this great summary and issue! Yeah, I think this can progress in a few stages:
I can get the ball rolling on step one, and then we can iterate from there.
Hello, thanks for laying out an implementation plan. I was discussing this with a colleague and he raised a couple of observations about the last point. Thoughts?
Ah, neat, I didn't know Intel/AMD had hardware protection keys! Sounds like that's a good plan then. I agree that falling back to one trace per page on other platforms makes the most sense. |
How "recent" are we talking? We should be aware of the additional cost of two behaviors / code paths for this, especially in terms of testing. |
Feature or enhancement
Proposal:
Issue #116017 already explains the problem with the memory allocation strategy used by the JIT.
To provide more data points, I debugged this a little further: I added some debugging info to `_PyJIT_Compile` and then ran pyperformance. The debugging info tracks the memory allocated and the padding used to align it to the page size.
The function was called 1288249 times, and this is the ratio between the actual memory allocated and the padding caused by the 16 KiB page size on macOS:
71% of the allocated memory is wasted on padding, while only 29% is used by data. This indicates that the memory needed for these objects is usually much smaller than the page size.
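For reference, a padding/data ratio like the one above can be gathered with a simple accounting wrapper around each allocation request. This is a hedged sketch of the idea only: the counter names and trace sizes below are illustrative, not taken from `_PyJIT_Compile` or from real pyperformance data.

```python
# Hedged sketch of the measurement: round each JIT allocation request
# to the 16 KiB macOS page size and keep running totals of useful
# bytes versus alignment padding. Sizes below are made up.
PAGE_SIZE = 16 * 1024

used_bytes = 0
padding_bytes = 0

def record_allocation(nbytes: int) -> None:
    """Account one page-aligned allocation of `nbytes` useful bytes."""
    global used_bytes, padding_bytes
    allocated = (nbytes + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
    used_bytes += nbytes
    padding_bytes += allocated - nbytes

# Illustrative trace sizes in bytes; real sizes would come from the JIT.
for size in (3000, 4500, 2500, 6000):
    record_allocation(size)

total = used_bytes + padding_bytes
print(f"data: {used_bytes / total:.0%}, padding: {padding_bytes / total:.0%}")
```

Because every request here is well under one page, each one consumes a full 16 KiB, which is exactly the effect the measured 71% padding figure reflects.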
This is a brain dump from @brandtbucher to help out with the implementation:
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
This has been discussed with Brandt via email and in person at PyCon 2024.