Variable length instructions #540

Open
gvanrossum opened this issue Jan 16, 2023 · 2 comments
Labels
epic-registers Ideas related to a register VM

Comments

@gvanrossum
Collaborator

Using a simple notation for instruction format (python/cpython#100957, python/cpython#100895) we can describe instructions with any number of opargs:

  • IX: An instruction without oparg, using two bytes
  • IB: An instruction with one oparg, using two bytes
  • IBBX: An instruction with two opargs, using four bytes
  • IBBB: An instruction with three opargs, using four bytes
  • IBBBBX: An instruction with four opargs, using six bytes
  • Etc.
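As a rough illustration of the notation above, here is a minimal sketch that derives an instruction's size and oparg count from its format string. It assumes only the three letters used in this issue (I for the opcode byte, B for an oparg byte, X for a padding byte); the helper names `format_size_bytes` and `format_oparg_count` are hypothetical and not part of CPython's generator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: size in bytes of an instruction described by a
 * format string such as "IX", "IB", "IBBX" or "IBBBBX".
 * Assumes one byte per letter and a whole number of 2-byte words.
 * Returns 0 if the string does not fit that assumed shape. */
size_t format_size_bytes(const char *fmt)
{
    assert(fmt != NULL);
    size_t n = strlen(fmt);
    if (n == 0 || fmt[0] != 'I' || n % 2 != 0) {
        return 0;  /* must start with the opcode and fill whole words */
    }
    for (size_t i = 1; i < n; i++) {
        if (fmt[i] != 'B' && fmt[i] != 'X') {
            return 0;  /* only oparg bytes and padding may follow */
        }
    }
    return n;
}

/* Hypothetical helper: number of opargs, i.e. the number of 'B' letters. */
size_t format_oparg_count(const char *fmt)
{
    assert(fmt != NULL);
    size_t count = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p == 'B') {
            count++;
        }
    }
    return count;
}
```

Under these assumptions, "IBBBBX" yields 6 bytes and 4 opargs, matching the list above, while an odd-length string such as "IBB" is rejected because it would not fill a whole word.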

I propose a hybrid instruction format where the first word of the instruction (opcode and oparg1) is decoded by the "infrastructure", and subsequent words (oparg2, oparg3, oparg4, etc.) are decoded by the code generated specifically for that instruction. For example, if we had a BINARY_OP_R instruction taking 4 opargs, the opcode definition would look like

register instr(BINARY_OP_R, (left, right -- res, unused)) {
    ... // Implementation
}

and the generator would transform this into

TARGET(BINARY_OP_R) {
    word = *next_instr++;
    oparg2 = word.first_byte;
    oparg3 = word.second_byte;
    word = *next_instr++;
    oparg4 = word.first_byte;
    PyObject *left = REG(oparg1);
    PyObject *right = REG(oparg2);
    PyObject *res;
    ... // Implementation
    SET_REG(oparg3, res);
    DISPATCH();
}
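To make the hybrid decoding concrete, here is a self-contained toy interpreter, offered as a sketch rather than as CPython's actual loop. The shared part of the loop decodes the first word (opcode and oparg1), and each case fetches whatever extra words its own format needs. The opcodes `OP_HALT`, `OP_LOAD_IMM` and `OP_ADD_R`, the `word_t` type, the integer register file and the `run`/`run_demo` functions are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit code unit, using the field names from the generated code above. */
typedef struct {
    uint8_t first_byte;
    uint8_t second_byte;
} word_t;

/* Hypothetical opcodes, for this sketch only. */
enum { OP_HALT = 0, OP_LOAD_IMM = 1, OP_ADD_R = 2 };
enum { NUM_REGS = 4 };

long run(const word_t *code, long *regs)
{
    const word_t *next_instr = code;
    for (;;) {
        /* "Infrastructure" part: every instruction's first word is
         * decoded the same way, giving the opcode and oparg1. */
        word_t word = *next_instr++;
        uint8_t opcode = word.first_byte;
        uint8_t oparg1 = word.second_byte;
        uint8_t oparg2, oparg3;

        switch (opcode) {
        case OP_HALT:                 /* format IB: one word */
            assert(oparg1 < NUM_REGS);
            return regs[oparg1];
        case OP_LOAD_IMM:             /* format IBBX: one extra word */
            word = *next_instr++;
            oparg2 = word.first_byte; /* second_byte is padding */
            assert(oparg1 < NUM_REGS);
            regs[oparg1] = oparg2;
            break;
        case OP_ADD_R:                /* format IBBBBX: two extra words */
            word = *next_instr++;
            oparg2 = word.first_byte;
            oparg3 = word.second_byte;
            next_instr++;             /* third word: unused oparg4 + padding */
            assert(oparg1 < NUM_REGS && oparg2 < NUM_REGS && oparg3 < NUM_REGS);
            regs[oparg3] = regs[oparg1] + regs[oparg2];
            break;
        default:
            return -1;                /* unknown opcode */
        }
    }
}

/* Runs: r0 = 40; r1 = 2; r2 = r0 + r1; halt with r2. */
long run_demo(void)
{
    static const word_t code[] = {
        {OP_LOAD_IMM, 0}, {40, 0},
        {OP_LOAD_IMM, 1}, {2, 0},
        {OP_ADD_R, 0}, {1, 2}, {0, 0},
        {OP_HALT, 2},
    };
    long regs[NUM_REGS] = {0};
    return run(code, regs);
}
```

In this sketch the dispatch loop never needs to know how long an instruction is; only the per-instruction code advances `next_instr` past the extra words, which is the division of labour the proposal describes.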

There are complications for the 3-arg version of EXTENDED_ARG, for which I think I have a solution; see #539.

@arhadthedev

Replying to python/cpython#101160 (comment):

But what length should instructions have? 4 bytes still isn't enough for operators like BINARY_OP_R that need four arguments (and yes, we debated endlessly if we could do it with three -- the answer is, not easily). There's also cache sizes.

Initially I thought about 8-byte instructions since:

  • they are single-mov on 64-bit platforms;
  • most instructions are much shorter, so their frequent zero padding would be encoded with the shortest Huffman codes;
  • after decompression, the padding can host a few inline cache entries before spilling over into the usual PEP 659 16-bit entries.

However, I agree that cache pressure, together with swap file I/O, is a concern. So it's a trade-off that needs to be weighed.
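For comparison, a fixed 8-byte instruction could be sketched as follows. The byte layout here (opcode in byte 0, four opargs in bytes 1 to 4, bytes 5 to 7 left as zero padding) is purely an assumption for illustration, as are the names `encode8`, `decode8` and `padding_is_zero`; the comment above does not specify a layout.

```c
#include <stdint.h>

/* Decoded view of one hypothetical fixed-width 8-byte instruction. */
typedef struct {
    uint8_t opcode;
    uint8_t oparg[4];
} decoded_t;

/* Pack an opcode and four opargs into one 64-bit value.
 * Assumed layout: bits 0-7 opcode, bits 8-39 opargs, bits 40-63 zero. */
uint64_t encode8(uint8_t opcode, uint8_t a1, uint8_t a2, uint8_t a3, uint8_t a4)
{
    return (uint64_t)opcode
         | ((uint64_t)a1 << 8)
         | ((uint64_t)a2 << 16)
         | ((uint64_t)a3 << 24)
         | ((uint64_t)a4 << 32);
}

/* Unpack all fields from a single 64-bit load. */
decoded_t decode8(uint64_t instr)
{
    decoded_t d;
    d.opcode = (uint8_t)(instr & 0xFF);
    for (int i = 0; i < 4; i++) {
        d.oparg[i] = (uint8_t)((instr >> (8 * (i + 1))) & 0xFF);
    }
    return d;
}

/* The top three bytes are the padding that could later hold cache data. */
int padding_is_zero(uint64_t instr)
{
    return (instr >> 40) == 0;
}
```

Every instruction costs 8 bytes in this scheme regardless of how many opargs it uses, which is where the cache-pressure concern comes from.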

@jneb

jneb commented Jan 24, 2023

Hm, with that enormous size of instructions you could even consider putting two small instructions together in a "double dispatcher".

3 participants