A proposal for variable-length instruction decoding #539

gvanrossum · 2023-01-15T22:59:03Z

This contains an idea on how to do EXTENDED_ARG for variable-length instructions. It was weirdly inspired by the blog post that @lpereira mentioned in #537.

3.12/vm.md

We let the code generator generate labels and a switch that jumps there.

Co-authored-by: Irit Katriel

gvanrossum · 2023-01-16T17:57:01Z

Sorry about the force pushes, I'm a gidiot. It should now be correct.

3.12/vm.md

brandtbucher · 2023-01-17T20:03:56Z

Just a thought I had while reading this: we could avoid the (slightly clunky, in my opinion) double-dispatch into the middle of instructions by using a decoding scheme like this:

/* Relies on `word` and `next_instr` being in scope at the call site. */
#define DECODE_OPARGS(OPARG_A, OPARG_B)    \
    do {                                   \
        assert(((OPARG_A) & 0xFF) == 0);   \
        assert(((OPARG_B) & 0xFF) == 0);   \
        word = *next_instr++;              \
        (OPARG_A) |= word.first_byte;      \
        (OPARG_B) |= word.second_byte;     \
    } while (0)

#define CLEAR_OPARGS() \
    do {               \
        oparg1 = 0;    \
        oparg2 = 0;    \
        oparg3 = 0;    \
        oparg4 = 0;    \
        oparg5 = 0;    \
    } while (0)

uint32_t oparg1, oparg2, oparg3, oparg4, oparg5;
CLEAR_OPARGS();
while (true) {
    _Py_CODEUNIT word = *next_instr++;
    uint8_t opcode = word.first_byte;
    assert((oparg1 & 0xFF) == 0);
    oparg1 |= word.second_byte;
    switch (opcode) {
        case OP_1:
            // ...
            CLEAR_OPARGS();
            break;
        case OP_3:
            DECODE_OPARGS(oparg2, oparg3);
            // ...
            CLEAR_OPARGS();
            break;
        case OP_5:
            DECODE_OPARGS(oparg2, oparg3);
            DECODE_OPARGS(oparg4, oparg5);
            // ...
            CLEAR_OPARGS();
            break;
        case EXTENDED_ARG_1:
            oparg1 <<= 8;
            break;
        case EXTENDED_ARG_3:
            DECODE_OPARGS(oparg2, oparg3);
            oparg1 <<= 8;
            oparg2 <<= 8;
            oparg3 <<= 8;
            break;
        case EXTENDED_ARG_5:
            DECODE_OPARGS(oparg2, oparg3);
            DECODE_OPARGS(oparg4, oparg5);
            oparg1 <<= 8;
            oparg2 <<= 8;
            oparg3 <<= 8;
            oparg4 <<= 8;
            oparg5 <<= 8;
            break;
    }
}

Basically, the aforementioned clunkiness comes from the assignments to opargs, which would clobber any extended values. Instead, this zeroes all opargs at the end of each normal instruction and reads them in at the start of the next instruction using |.

I'm not sure how the cost of zeroing and or-ing the opargs in the common case compares to the double-switches in the uncommon case, but I know that I find this quite a bit easier to reason about.
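To make the scheme concrete, here is a minimal, self-contained sketch of the zero-and-OR decoder reduced to three opargs. The `code_unit` and `result` types, the `OP_HALT` opcode, and the `run` function are assumptions added for illustration; they are not part of the PR or of CPython.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-byte code unit, mirroring word.first_byte / word.second_byte. */
typedef struct { uint8_t first_byte, second_byte; } code_unit;

/* Hypothetical opcodes; OP_HALT exists only so the sketch can terminate. */
enum { OP_HALT, OP_1, OP_3, EXTENDED_ARG_1, EXTENDED_ARG_3 };

/* Opargs seen by the most recently executed OP_1 or OP_3. */
typedef struct { uint32_t a1, a2, a3; } result;

static result run(const code_unit *next_instr) {
    uint32_t oparg1 = 0, oparg2 = 0, oparg3 = 0;
    result r = {0, 0, 0};
    for (;;) {
        code_unit word = *next_instr++;
        uint8_t opcode = word.first_byte;
        /* The low byte is free: either zeroed or shifted up by a prefix. */
        assert((oparg1 & 0xFF) == 0);
        oparg1 |= word.second_byte;
        switch (opcode) {
        case OP_HALT:
            return r;
        case OP_1:
            r = (result){oparg1, 0, 0};
            oparg1 = oparg2 = oparg3 = 0;   /* zero for the next instruction */
            break;
        case OP_3:
            word = *next_instr++;
            oparg2 |= word.first_byte;
            oparg3 |= word.second_byte;
            r = (result){oparg1, oparg2, oparg3};
            oparg1 = oparg2 = oparg3 = 0;
            break;
        case EXTENDED_ARG_1:
            oparg1 <<= 8;                   /* keep the value, make room below */
            break;
        case EXTENDED_ARG_3:
            word = *next_instr++;
            oparg2 |= word.first_byte;
            oparg3 |= word.second_byte;
            oparg1 <<= 8;
            oparg2 <<= 8;
            oparg3 <<= 8;
            break;
        default:
            assert(0 && "unknown opcode");
            return r;
        }
    }
}
```

With this sketch, an `EXTENDED_ARG_3` carrying (0x01, 0x02, 0x03) followed by an `OP_3` carrying (0x10, 0x20, 0x30) leaves `OP_3` with opargs 0x0110, 0x0220 and 0x0330, and `OP_3` is entered through the same path as it would be without a prefix.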

lpereira · 2023-01-17T20:12:02Z

@brandtbucher wrote:

I'm not sure how the cost of zeroing and or-ing the opargs in the common case compares to the double-switches in the uncommon case, but I know that I find this quite a bit easier to reason about.

Is this pattern of zero+or necessary? Could opargs be just assigned to the locals instead?

brandtbucher · 2023-01-17T20:17:49Z

@lpereira wrote:

Is this pattern of zero+or necessary? Could opargs be just assigned to the locals instead?

But how would you extend them? The idea here is that the entry to an instruction is the same whether the opargs have been extended or not.
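A tiny sketch of the difference, assuming an `EXTENDED_ARG` prefix has already left 0x1200 in the oparg. The helper names `enter_with_assign` and `enter_with_or` are hypothetical and exist only to contrast the two ways of entering an instruction.

```c
#include <stdint.h>

/* Entry by plain assignment: whatever the prefix left behind is lost. */
static uint32_t enter_with_assign(uint32_t pending, uint8_t low_byte) {
    (void)pending;              /* overwritten, so the extension is clobbered */
    return low_byte;
}

/* Entry by OR: the same code works with or without a prefix, as long as
 * the low byte of `pending` is zero on entry. */
static uint32_t enter_with_or(uint32_t pending, uint8_t low_byte) {
    return pending | low_byte;
}
```

With a pending value of 0x1200 and a low byte of 0x34, assignment yields 0x34 while OR yields 0x1234. With a pending value of 0 both yield 0x34, which is why the OR form needs the zeroing at the end of each normal instruction.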

Co-authored-by: Brandt Bucher

gvanrossum · 2023-01-17T21:53:47Z

I originally had a version where the opargs were zeroed out, but it seems pretty crazy to waste time writing zeros as part of every instruction, so I came up with the nested switch. At least the nested switch only exists for EXTENDED_ARG_{3,5} rather than in every DISPATCH() call as in the blog post that @lpereira mentioned.
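For comparison, here is one possible reading of the nested-switch approach, written as a sketch rather than as the code from 3.12/vm.md. It assumes a single `EXTENDED_ARG_3` prefix, and the `code_unit` and `result` types, `OP_HALT`, the `op_3_body` label, and `run_nested` are all hypothetical. The prefix handler decodes the following instruction itself and then jumps past that instruction's own decoding, so normal instructions use plain assignment and never write zeros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-byte code unit and opcodes, for illustration only. */
typedef struct { uint8_t first_byte, second_byte; } code_unit;
enum { OP_HALT, OP_3, EXTENDED_ARG_3 };
typedef struct { uint32_t a1, a2, a3; } result;

static result run_nested(const code_unit *next_instr) {
    uint32_t oparg1 = 0, oparg2 = 0, oparg3 = 0;
    result r = {0, 0, 0};
    for (;;) {
        code_unit word = *next_instr++;
        uint8_t opcode = word.first_byte;
        oparg1 = word.second_byte;          /* plain assignment, no zeroing */
        switch (opcode) {
        case OP_HALT:
            return r;
        case OP_3:
            word = *next_instr++;
            oparg2 = word.first_byte;
            oparg3 = word.second_byte;
        op_3_body:                          /* entry point for the prefix */
            r = (result){oparg1, oparg2, oparg3};
            break;
        case EXTENDED_ARG_3: {
            /* High bytes come from the prefix instruction. */
            word = *next_instr++;
            uint32_t hi1 = oparg1;
            uint32_t hi2 = word.first_byte;
            uint32_t hi3 = word.second_byte;
            /* Decode the instruction that follows and combine. */
            word = *next_instr++;
            opcode = word.first_byte;
            oparg1 = (hi1 << 8) | word.second_byte;
            word = *next_instr++;
            oparg2 = (hi2 << 8) | word.first_byte;
            oparg3 = (hi3 << 8) | word.second_byte;
            /* Second dispatch: jump into the middle of the instruction. */
            switch (opcode) {
            case OP_3:
                goto op_3_body;
            default:
                assert(0 && "prefix before an instruction without 3 opargs");
                return r;
            }
        }
        default:
            assert(0 && "unknown opcode");
            return r;
        }
    }
}
```

The trade-off discussed in this thread is visible here: the common path for `OP_3` does no extra work, while the uncommon prefixed path pays for a second switch and a jump to a label inside the instruction body.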

gvanrossum · 2023-05-23T15:59:02Z

Since we're not doing the register VM (yet) this is no longer important. Closing.

iritkatriel reviewed Jan 16, 2023

3.12/vm.md (outdated, resolved)

iritkatriel reviewed Jan 16, 2023

3.12/vm.md (outdated, resolved)

gvanrossum mentioned this pull request Jan 16, 2023

Variable length instructions #540

Open

gvanrossum force-pushed the vm branch from 4264b54 to bfc4328 on January 16, 2023 05:59

gvanrossum mentioned this pull request Jan 16, 2023

Start a chapter on 3.11 interpreter internals python/devguide#1028

Merged

gvanrossum and others added 4 commits January 16, 2023 09:44

Stuff about the VM

94e741b

Better solution for EXTENDED_ARG_3 etc.

deee88f

We let the code generator generate labels and a switch that jumps there.

Typo fix

2b683ff

Fix typo

1465a82

Co-authored-by: Irit Katriel

gvanrossum force-pushed the vm branch from bfc4328 to 1465a82 on January 16, 2023 17:44

gvanrossum requested review from brandtbucher and markshannon January 16, 2023 17:56

brandtbucher reviewed Jan 17, 2023

3.12/vm.md (outdated, resolved)

lpereira reviewed Jan 17, 2023

3.12/vm.md (resolved)

Fix typos in 3.12/vm.md

34ba39a

Co-authored-by: Brandt Bucher

This was referenced Jan 18, 2023

Instruction formats #530

Closed

Play around with variable-length instructions (DRAFT!) python/cpython#101160

Closed

gvanrossum closed this May 23, 2023

gvanrossum deleted the vm branch February 25, 2024 21:21
