Commit 74c2422

Update DSL docs for cases generator (#105753)
* Clarify things around goto error/ERROR_IF a bit
* Remove docs for super-instructions
* Add pseudo; fix heading markup
1 parent 1d857da commit 74c2422

1 file changed: +60 -44 lines changed

Tools/cases_generator/interpreter_definition.md

+60 -44
@@ -67,26 +67,24 @@ parts of instructions, we can reduce the potential for errors considerably.
 
 ## Specification
 
-This specification is at an early stage and is likely to change considerably.
+This specification is a work in progress.
+We update it as the need arises.
 
-Syntax
-------
+### Syntax
 
 Each op definition has a kind, a name, a stack and instruction stream effect,
 and a piece of C code describing its semantics::
 
 ```
 file:
-    (definition | family)+
+    (definition | family | pseudo)+
 
 definition:
     "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}"
     |
     "op" "(" NAME "," stack_effect ")" "{" C-code "}"
     |
     "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
-    |
-    "super" "(" NAME ")" "=" NAME ("+" NAME)* ";"
 
 stack_effect:
     "(" [inputs] "--" [outputs] ")"
@@ -122,16 +120,17 @@ and a piece of C code describing its semantics::
     object "[" C-expression "]"
 
 family:
-    "family" "(" NAME ")" = "{" NAME ("," NAME)+ "}" ";"
+    "family" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
+
+pseudo:
+    "pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
 ```
 
 The following definitions may occur:
 
 * `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
 * `op`: A part instruction from which macros can be constructed.
 * `macro`: A bytecode instruction constructed from ops and cache effects.
-* `super`: A super-instruction, such as `LOAD_FAST__LOAD_FAST`, constructed from
-  normal or macro instructions.
 
 `NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
 `foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
@@ -159,15 +158,21 @@ By convention cache effects (`stream`) must precede the input effects.
 
 The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream.
 
+### Special functions/macros
+
 The C code may include special functions that are understood by the tools as
 part of the DSL.
 
 Those functions include:
 
 * `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
-* `ERROR_IF(cond, label)`. Jump to error handler if `cond` is true.
+* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
 * `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
 
+Note that the use of `DECREF_INPUTS()` is optional -- manual calls
+to `Py_DECREF()` or other approaches are also acceptable
+(e.g. calling an API that "steals" a reference).
+
 Variables can either be defined in the input, output, or in the C code.
 Variables defined in the input may not be assigned in the C code.
 If an `ERROR_IF` occurs, all values will be removed from the stack;
@@ -187,17 +192,39 @@ These requirements result in the following constraints on the use of
 intermediate results.)
 3. No `DEOPT_IF` may follow an `ERROR_IF` in the same block.
 
-Semantics
----------
+(There is some wiggle room: these rules apply to dynamic code paths,
+not to static occurrences in the source code.)
+
+If code detects an error condition before the first `DECREF` of an input,
+two idioms are valid:
+
+- Use `goto error`.
+- Use a block containing the appropriate `DECREF` calls ending in
+  `ERROR_IF(true, error)`.
+
+An example of the latter would be:
+```cc
+res = PyObject_Add(left, right);
+if (res == NULL) {
+    DECREF_INPUTS();
+    ERROR_IF(true, error);
+}
+```
+
+### Semantics
 
 The underlying execution model is a stack machine.
 Operations pop values from the stack, and push values to the stack.
 They also can look at, and consume, values from the instruction stream.
 
-All members of a family must have the same stack and instruction stream effect.
+All members of a family
+(which represents a specializable instruction and its specializations)
+must have the same stack and instruction stream effect.
+
+The same is true for all members of a pseudo instruction
+(which is mapped by the bytecode compiler to one of its members).
 
-Examples
---------
+## Examples
 
 (Another source of examples can be found in the [tests](test_generator.py).)
 
@@ -237,27 +264,6 @@ This would generate:
 }
 ```
 
-### Super-instruction definition
-
-```C
-super ( LOAD_FAST__LOAD_FAST ) = LOAD_FAST + LOAD_FAST ;
-```
-This might get translated into the following:
-```C
-TARGET(LOAD_FAST__LOAD_FAST) {
-    PyObject *value;
-    value = frame->f_localsplus[oparg];
-    Py_INCREF(value);
-    PUSH(value);
-    NEXTOPARG();
-    next_instr++;
-    value = frame->f_localsplus[oparg];
-    Py_INCREF(value);
-    PUSH(value);
-    DISPATCH();
-}
-```
-
 ### Input stack effect and cache effect
 ```C
 op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
@@ -339,14 +345,26 @@ For explanations see "Generating the interpreter" below.)
 }
 ```
 
-### Define an instruction family
-These opcodes all share the same instruction format):
+### Defining an instruction family
+
+A _family_ represents a specializable instruction and its specializations.
+
+Example: These opcodes all share the same instruction format):
+```C
+family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
+```
+
+### Defining a pseudo instruction
+
+A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions.
+
+Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`:
 ```C
-family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT } ;
+pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };
 ```
 
-Generating the interpreter
-==========================
+
+## Generating the interpreter
 
 The generated C code for a single instruction includes a preamble and dispatch at the end
 which can be easily inserted. What is more complex is ensuring the correct stack effects
@@ -401,9 +419,7 @@ rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something
 }
 ```
 
-Other tools
-===========
+## Other tools
 
 From the instruction definitions we can generate the stack marking code used in `frame.set_lineno()`,
 and the tables for use by disassemblers.
-
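
Reader's note on the grammar change: the updated `family` rule and the new `pseudo` rule share one shape, a name in parentheses followed by a brace-enclosed list of at least two member names with an optional trailing comma. The sketch below is one way to read those two rules as a regular expression. It is an illustration only: `parse_group` and `_GROUP_RE` are hypothetical names, not part of `Tools/cases_generator`, and the sketch does not reject C or Python keywords as the specification requires for `NAME`.

```python
import re

# Hypothetical helper for illustration; the real cases generator uses its own parser.
# NAME is approximated as an ASCII C identifier (keywords are NOT filtered out here).
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"

# "family"/"pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
_GROUP_RE = re.compile(
    rf"\s*(family|pseudo)\s*\(\s*({_NAME})\s*\)\s*=\s*"
    rf"\{{\s*({_NAME}(?:\s*,\s*{_NAME})+)\s*,?\s*\}}\s*;\s*"
)


def parse_group(text):
    """Return (kind, name, members) for a single family or pseudo definition."""
    match = _GROUP_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a family/pseudo definition: {text!r}")
    kind, name, members = match.groups()
    # The trailing comma, if any, is consumed outside the members group.
    return kind, name, [part.strip() for part in members.split(",")]


if __name__ == "__main__":
    print(parse_group("pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };"))
    print(parse_group(
        "family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT, };"
    ))
```

Because the rule is written as `NAME ("," NAME)+`, a definition with a single member does not match, and the sketch raises `ValueError` for it.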