I've decided to drop the plan for true function inlining in 3.13. The main problem is that debuggers work by modifying frame.f_locals,
which would break my current inlining plan.
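
As a rough illustration of the kind of frame introspection that inlining would have to keep working, here is a minimal sketch. It only reads the caller's locals (debuggers also write to them), and the function names `callee` and `caller_fn` are made up for the example. If `caller_fn`'s frame were inlined away, the interpreter would need to reconstruct it before this code could see it.

```python
import sys


def callee():
    # Debugger-style peek: sys._getframe(1) is the frame of whoever called us.
    # sys._getframe is a CPython implementation detail.
    caller = sys._getframe(1)
    # Copy the locals into a plain dict so the snapshot is stable.
    return caller.f_code.co_name, dict(caller.f_locals)


def caller_fn():
    x = 41
    y = x + 1
    # The callee expects a real, inspectable frame for caller_fn here.
    return callee()


name, snapshot = caller_fn()
print(name, snapshot)
```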
Instead, I have another proposal that gets most of the benefits of frame inlining without the reconstruction requirements. It involves changing the frame layout slightly.
The After layout has the following benefits:
- Reduced overhead via interleaved stack and locals. No copying of arguments at all for most tier 1 specialisations and simple function calls. The old frame's stack becomes the new frame's locals.
- Makes frame inlining easier in the future, as no copying during reconstruction is needed now.
- Makes the "hot" (commonly accessed) parts of the frame smaller, likely fitting better in cache.
- No complex reconstruction needed for `sys._getframe`.
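
The "old frame's stack becomes the new frame's locals" idea can be sketched with a toy model. This is not CPython's actual frame code: `ToyFrame` and `call` are hypothetical names, and a Python list stands in for the contiguous memory that frames would share. The point is only that arguments the caller has already pushed are reused in place as the callee's first locals.

```python
class ToyFrame:
    """Hypothetical frame: a window into one value stack shared by all frames."""

    def __init__(self, stack, base, nlocals):
        self.stack = stack      # the shared value stack (not a copy)
        self.base = base        # index of this frame's first local
        self.nlocals = nlocals  # number of local slots owned by this frame

    def local(self, i):
        return self.stack[self.base + i]


def call(stack, nargs, nlocals):
    """Enter a callee whose first `nargs` locals are the values the caller
    has just pushed. Nothing is copied; only extra slots are reserved."""
    base = len(stack) - nargs
    stack.extend([None] * (nlocals - nargs))
    return ToyFrame(stack, base, nlocals)


stack = []

# Caller frame with one local and no arguments.
caller = call(stack, nargs=0, nlocals=1)
stack[caller.base] = "caller_local"

# The caller evaluates two arguments onto its own stack...
stack.append(10)
stack.append(20)

# ...and those same slots become locals 0 and 1 of the callee.
callee = call(stack, nargs=2, nlocals=3)
```

In this model the callee's locals start exactly where the caller's pushed arguments sit, so entering the call only has to record a base index and reserve the remaining slots.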
There are some cons:
- Worse locality of data for cold fields in the frame. Things like globals, builtins, the code object, the code's attribute names, and constants are now in a separate cache line from the frame's localsplus. I don't think this is really a con, though (in fact, it is a plus). Those fields are rarely used once globals get promoted to constants and all constants become inlined loads. We already rarely use attribute names in tier 1, as specialisation doesn't use them. Making localsplus fit in a cache line is better.
- One extra local variable needed in the eval frame to hold a reference to the new `localsplus`, to avoid one more pointer indirection in load and store fast. Not sure if this will affect register pressure.