Skip to content

Conversation

sweeneyde
Copy link
Member

@sweeneyde sweeneyde added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.11 only security fixes labels Apr 26, 2022
@markshannon
Copy link
Member

I'm worried about keeping the adaptive form of the EXTENDED_ARG and its following opcode in sync. Adding instrumentation is likely to make this even more complex.

I think we should go back to the old approach of simply not quickening any opcode that is preceded by an EXTENDED_ARG.

According to the latest stats, EXTENDED_ARG is only 0.4% of the opcodes executed and only a few of those are candidates for specialization.

The new test looks good.

@sweeneyde
Copy link
Member Author

sweeneyde commented Apr 26, 2022

I think we should go back to the old approach of simply not quickening any opcode that is preceded by an EXTENDED_ARG.

It looks like FOR_ITER has an EXTENDED_ARG 7.7% of the time, and that's one motivation for this.

I'm not sure I understand what you mean by "in sync" -- tracing doesn't modify any opcodes in place, and the only difference in behavior between EXTENDED_ARG/EXTENDED_ARG_QUICK here is whether they do (the same-oparg equivalent of) NEXTOPARG() or TRACING_NEXTOPARG().

The in-sync-ness is guaranteed by the implementation of EXTENDED_ARG/EXTENDED_ARG_QUICK not checking for tracing before jumping straight into the next instruction, and adaptive instructions don't clobber the opcode because of NOTRACE_DISPATCH_SAME_OPARG.

Before this PR, when running EXTENDED_ARG+LOAD_GLOBAL_MODULE with tracing enabled, this happens:

* previous instruction does opcode |= 255
* DO_TRACING for EXTENDED_ARG
* TRACING_NEXTOPARG() takes us back to EXTENDED_ARG
* EXTENDED_ARG sets the big oparg
* EXTENDED_ARG jumps straight to LOAD_GLOBAL_MODULE => assertion failure.

With this PR, running EXTENDED_ARG_QUICK+LOAD_GLOBAL_MODULE with tracing enabled:

* previous instruction does opcode |= 255
* DO_TRACING for EXTENDED_ARG(_QUICK)
* TRACING_NEXTOPARG() takes us back to (deoptimized) EXTENDED_ARG
* EXTENDED_ARG sets the big oparg
* EXTENDED_ARG jumps straight to (deoptimized) LOAD_GLOBAL

This change should also make it impossible for superinstructions like LOAD_FAST__STORE_FAST with EXTENDED_ARG to be executed while tracing is enabled, which I think may fix some bugs as well -- the second instruction in the pair should be able to provoke tracing.

@markshannon
Copy link
Member

My concern was that we might optimize (or deoptimize) the EXTENDED_ARG, but not the following instruction (or vice versa).

But, from what you are saying that should be OK as EXTENDED_ARG is really the special form, only needed when tracing and EXTENDED_ARG_QUICK is the normal form. Is that correct?

Would it better to never quicken EXTENDED_ARG, but have the compiler emit EXTENDED_ARG_QUICK directly (so EXTENDED_ARG only exists for tracing).

@sweeneyde
Copy link
Member Author

My concern was that we might optimize (or deoptimize) the EXTENDED_ARG, but not the following instruction (or vice versa).

But, from what you are saying that should be OK as EXTENDED_ARG is really the special form, only needed when tracing and EXTENDED_ARG_QUICK is the normal form. Is that correct?

Yes, the current PR's EXTENDED_ARG primarily exists for tracing. It should always be safe to switch from EXTENDED_ARG_QUICK to EXTENDED_ARG because it's always safe (right?) to do as tracing does and execute _PyOpcode_Deopt[opcode] instead of opcode.

Would it better to never quicken EXTENDED_ARG, but have the compiler emit EXTENDED_ARG_QUICK directly (so EXTENDED_ARG only exists for tracing).

That would make sense -- I'll see how hard it is to implement EXTENDED_ARG/EXTENDED_ARG_FOR_TRACING instead, such that _PyOpcode_Deopt[EXTENDED_ARG] == EXTENDED_ARG_FOR_TRACING.

@sweeneyde
Copy link
Member Author

#91974 implements that idea. It slightly complicates the meaning of _PyOpcode_Deopt[] and which opcodes are included in dis.opmap and such, but it does make it so that EXTENDED_ARG_TRACE is only ever executed if tracing is happening.

I think I'd lean toward this PR (91945), but either is fine by me.

@markshannon
Copy link
Member

You can leave EXTENDED_ARG and EXTENDED_ARG_QUICK; no need for EXTENDED_ARG_FOR_TRACING or to change _PyOpcode_Deopt

I tried removing the case for EXTENDED_ARG from _PyCode_Quicken and emitting EXTENDED_ARG_QUICK instead of EXTENDED_ARG in write_instr and all the test pass.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 27, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 6591ba1 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 27, 2022
@markshannon markshannon merged commit 37c6db6 into python:main Apr 28, 2022
@markshannon
Copy link
Member

@sweeneyde Thanks for the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants