Enhancement: New Inline Assembly #5241
Comments
Compared to GCC style syntax this is much more verbose. So would it really make sense to do this? What people really like is MSVC style, where clobbers, register allocation etc are mostly inferred by the compiler. This requires lots of good defaults, but is great to work with: downside is a lot of loss of control, which could be regained with some clever added optional constraints instead. That said it might violate the Zig explicitness goal. So if the ideal isn't achievable, why pick a style that is completely new for Zig, rather than the de facto standard? |
Also, everything is still strings, so it's basically leaving everything stringly typed. Compare to the Rust asm that at least defines what looks as constants for things like I suspect people will just consider this a confusing, hobbled and verbose version of GCC inline asm. Compare this:

return asm {
    "eax": u32 = 0,
    "ebx" b: u32,
    "ecx" c: u32,
    "edx" d: u32,
    "?memory",
} "cpuid"
: .{ b, c, d };

To:

static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
    asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}

You don't like that? Here's another one using code = 1 without the memory:

int a = 0x1, b, c, d;
asm ( "cpuid" : "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "0" (a) );

So what is the benefit? |
For comparison, the Rust new asm: https://doc.rust-lang.org/beta/unstable-book/library-features/asm.html |
I like this syntax. A lot. But there is one issue here I think just looks a little weird to me
In here, I just want to grab the value of a register. I don't care about what the mov instruction looks like, and I believe that should be left to the compiler to figure out. Should putting empty asm and replacing |
Hmm, I hadn’t thought of that. My instinct is to allow this, but I’m not sure if this would lead to parsing ambiguity. If not, then sure. |
I have an alternative proposal that, I think, will be much clearer, and far different from GCC inline asm, or any other asm syntax I've seen. And it won't be just string hackery, either. I imagine this proposal will take a long time to actually implement, but it'll be much, much clearer, and very elegant, and fits Zig's Zen (whereas the current proposal doesn't). The general ideas are
To provide an example in action, here's the classic CPUID on x86, from Agner Fog's asmlib library, which uses a parameter as a return value (but for this example we just allocate it on the stack and use that). The original example is as follows:

cpuid_ex:
%IFDEF WINDOWS
; parameters: rcx = abcd, edx = a, r8d = c
push rbx
xchg rcx, r8
mov eax, edx
cpuid ; input eax, ecx. output eax, ebx, ecx, edx
mov [r8], eax
mov [r8+4], ebx
mov [r8+8], ecx
mov [r8+12], edx
pop rbx
%ENDIF
%IFDEF UNIX
; parameters: rdi = abcd, esi = a, edx = c
push rbx
mov eax, esi
mov ecx, edx
cpuid ; input eax, ecx. output eax, ebx, ecx, edx
mov [rdi], eax
mov [rdi+4], ebx
mov [rdi+8], ecx
mov [rdi+12], edx
pop rbx
%ENDIF
ret

We'll drop the prologue, and the example in this proposed syntax becomes:

fn cpuid(a: u32, c: u32) [4]u32 {
    var abcd: [4]u32 = undefined;
    asm(inputs: a, c; outputs: abcd; clobbers: eax, ebx, ecx, edx) {
        // Load the leaf and subleaf, as `mov eax, esi` / `mov ecx, edx` do in the original
        eax = a;
        ecx = c;
        cpuid();
        // These are memory-based movs
        abcd[0] = eax;
        abcd[1] = ebx;
        abcd[2] = ecx;
        abcd[3] = edx;
    }
    return abcd;
}

As another example, take a more complex one, loading the GDT (sorry if this isn't quite valid, I'm not the most skilled at this):

Original:

load_gdt:
push %rbp
mov %rsp, %rbp
sub $32, %rsp
mov 8(%rsp), %rax
lgdt (%rax)
pushq $0x08
lea reload_segment_regs(%rip), %rax
push %rax
lretq
reload_segment_regs:
mov $0x10, %ax
mov %ax, %ds
mov %ax, %es
mov %ax, %fs
mov %ax, %gs
mov %ax, %ss
mov %rbp, %rsp
pop %rbp
ret

In this syntax, this becomes:

fn load_gdt(gdt: usize) void {
    asm(inputs: gdt) {
        rax = gdt;
        lgdt(&rax);
        push(0x08);
        lea(:reload_segment_regs); // labels are always PIC/PIE unless `build.zig` explicitly indicates that the executable is not position independent
        push(rax);
        lret(); // long return
    reload_segment_regs:
        // Register-immediate load
        ax = 0x10;
        // register-register load and store
        ds = ax;
        es = ax;
        fs = ax;
        gs = ax;
        ss = ax;
    }
}

Like I said, this definitely needs refinement, and I think that this will take a long time to completely implement. However, I think that this is, most likely, the proposal that upholds Zig's zen and doesn't make inline assembly look like a complete and utter mess. This syntax has the benefit of giving the compiler a lot of information about what you're trying to do, so it could very well optimize your loads/stores into something using AVX or NEON if possible. What I'm unsure about are things like:

If you guys want to help refine this I'd appreciate it. I know that some of the syntactic elements that this introduces are unorthodox, and are quite different from Zig's normal syntax, but I did try to stay as close to Zig as possible while compromising on the fact that this was inline assembly and I didn't really have much of a choice.

For the RHS of an assignment statement, most valid expressions are allowed, barring multi-loads or stores; I was thinking that you could even call built-ins as well. When this happened, the load/store would be a multi-load/store, but would finish as a single load/store; for example, if you used

I understand that this syntax would result in "behind your back" instructions in certain instances. In the case of the aforementioned built-in function call idea, discarding that for now would be perfectly reasonable. The assignment statement thing was to eliminate the minutiae of a ton of |
@ethindp You might want to draw some inspiration from how C3 does it: https://c3-lang.org/asm/ It creates a very simple, regular grammar and infers clobbers. |
@lerno That's an interesting syntax, but IMO it's not as clear as mine (but mine is more complex since I'm trying to be as flexible as possible). |
@ethindp Yes, the focus is trying to be as cheap as possible to implement for various variants of asm. |
Copied over from #215. Inspiration is via them. Thanks also to @MasterQ32 and @kubkon for help extending it to support stack machine architectures. See #7561 for standalone assembler improvements.
New Inline Assembly
asm volatile? {bindings}? body? : post_expression?
TL;DR: Benefits over Status Quo
Stack Machines
This syntax has first-class support for stack machine architectures such as WebAssembly, the JVM, and @MasterQ32's SPU Mk. II. It accomplishes this with a novel batch-push and -pop mechanism for marshaling between Zig and the stack. Because there is significant difference between register and stack machine architectures, a new `.paradigm()` method is defined on `builtin.Arch`, which returns an enum with the variants `.register` and `.stack`. (NOTE: supporting stack machines with LLVM is a very hard problem -- maybe defer to stage 2?)
Meta
At least one of body or post expression must be present. The expression inherits block/statement status from the post expression if present, and defaults to statement if not.
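As a rough illustration of the grammar line and the rule above, here is a small Python model (Python rather than Zig, and not part of the proposal). The class `AsmExpr` and its field names are hypothetical; the sketch only checks that a body, a post expression, or both are present.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AsmExpr:
    """Hypothetical model of `asm volatile? {bindings}? body? : post_expression?`."""
    volatile: bool = False
    bindings: List[str] = field(default_factory=list)  # raw binding text, unparsed
    body: Optional[str] = None             # assembly text, if any
    post_expression: Optional[str] = None  # Zig expression text, if any

    def validate(self) -> None:
        # Body and post expression are each optional,
        # but they may not both be missing.
        if self.body is None and self.post_expression is None:
            raise ValueError("asm needs a body, a post expression, or both")


# Body only: a bare instruction with no result.
AsmExpr(body="nop").validate()

# Post expression only: reading a register through a binding.
AsmExpr(bindings=['"eax" a: u32'], post_expression="a").validate()
```

An `AsmExpr()` with neither part raises `ValueError`, which stands in for the compile error the proposal would presumably report.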
Volatile
This block has side effects, and may not be optimised away if its value is not used. Implied by a return type of `void` or `noreturn`, or a mutable symbol binding -- so, in practice, very rarely used.
Bindings
There are three types of bindings: operand, symbol, and clobber. All of them use specially formatted comptime strings to interface with assembly, as in status quo. This decision was made as integrating the required functionality into Zig itself would have required either breaking several guidelines or introducing special constructs with no other use cases.
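One way to picture the three binding kinds, together with the rule from the Volatile section, is the Python sketch below. All class, field, and function names are hypothetical, and the set of wildcards that imply `const` is copied from the wildcard list later in this proposal. It is a model of the stated rules only, not of how a compiler would represent bindings.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Operand:
    location: str                 # e.g. "eax", or a wildcard such as "?reg"
    name: Optional[str] = None
    type: Optional[str] = None
    value: Optional[str] = None


@dataclass
class Symbol:
    kind: str                     # wildcard naming the symbol's type, e.g. "?glob"
    symbol: str
    const: bool = False


@dataclass
class Clobber:
    location: str                 # e.g. "ebx" or "?memory"


Binding = Union[Operand, Symbol, Clobber]

# Symbol wildcards that the proposal says imply `const`.
IMPLIED_CONST = {"?argm", "?comp", "?func"}


def implied_volatile(return_type: str, bindings: List[Binding]) -> bool:
    """True when the stated rule makes an explicit `volatile` redundant."""
    if return_type in ("void", "noreturn"):
        return True
    # A symbol binding that is neither marked nor implied `const` is mutable.
    return any(
        isinstance(b, Symbol) and not (b.const or b.kind in IMPLIED_CONST)
        for b in bindings
    )
```

For example, `implied_volatile("u32", [Symbol("?glob", "counter")])` is true because the global is bound mutably, while the same binding with `const=True` leaves the block free to be optimised away if its value is unused.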
Operand
An operand binding has the form `"operand" name: type = value`. Within the block, `?(name)` then refers to `operand` compatible with Zig type `type`, initially with value `value`, which may be a register (integer, float, or vector), a datum literal (only integer in every ISA I'm aware of), a stack top (array with size a multiple of stack alignment), or a processor condition code (boolean). `type` must be coercible to all of `name`'s uses in the block, taking into account sign- or zero-extension and lane width/count if applicable, and may be omitted if the type of `value` is known -- in addition, `value` may be omitted if initialisation is not needed, and `name` may be omitted if only initialisation is needed. The type of the binding must be derivable -- that is, at least one of `type` or `value` must be present (this also means that operand and symbol bindings are syntactically distinct). Stack pushes and pops must be declared separately -- see below. Condition codes may not be initialised (`type` must be present and must be `bool`). `operand` may be a wildcard, as described below.
Symbol
A symbol binding has the form `"type" const? symbol`, where `symbol` is a program symbol in scope. `type` is a wildcard indicating the type of `symbol`, which could be a variable or a function. Within the block, `?(symbol)` then refers to the assembly program entity corresponding to the Zig program construct (which need not be an exported symbol -- it may be an internal label, a simple address, or even the referenced data itself on stack machines). A `const` annotation indicates an immutable binding -- this may be safety-checked by comparing the value at the associated address before and after the block. (NOTE: In some assemblies, many label operations are actually macros, which expand to multiple instructions and relocations -- we'd need some way of propagating this information through the compilation pipeline from codegen to linking.)
Clobber
A clobber is simply `"location"`, which may be a literal or a wildcard.
Wildcards
Wildcards indicate that a binding has special properties, and give the compiler freedom to fill in some details. Wildcards start with `?` and run the length of the binding string. A literal `?` is escaped with another one, for symmetry with in-block syntax. Wildcards may be followed by architecture-dependent `:option`s to place restrictions on their resolution -- for instance, `?reg:abcd` for a legacy x86 register on x86_64, or `?int:lo12` for a 12-bit integer immediate on RISC-V. Options may change the type of a binding -- for instance, `"?tmp:all" callconv(.fast)` is a clobber that binds all callee-saved registers under the fast calling convention.
The following wildcards are defined:
Operand
`?reg`
Arbitrary register. Register machine architectures only. `value` may be an integer, a float, or an int/float vector, of any architecturally-supported width and length.
`?tmp`
Arbitrary caller-saved register under current calling convention. See above. May be annotated with `callconv` to specify a different calling convention.
`?sav`
Arbitrary callee-saved register under current calling convention. See above.
`?lit`
Literal. `value` must be comptime-known, and may be any architecturally-supported literal type.
`?psh`
Array. `value` must be provided. Length * element size must be a multiple of platform stack alignment; elements must be size-compatible with stack cells if applicable. Pushed onto the stack at block entry, leftmost element topmost. Only one allowed per block. This is the only way of marshaling non-symbol values into assembly on stack machines.
`?pop`
Uninitialised array (`value` must not be provided). See above. Popped from the stack on block exit, topmost element leftmost. This is the only way of marshaling non-symbol values out of assembly on stack machines.
`?stg`
Additional stack growth, i.e. growth not already accounted for by `?psh` or function calls, in bytes. `name`, `type` omitted. `value` must be comptime-known. (NOTE: This does not imply that the stack pointer has a different value before and after the block -- in fact, unless it is listed as a clobber, this is not allowed.)
Symbol
`?locl`
Local variable. Stack machine only.
`?argm`
Argument of current function. Stack machine only. Implies `const`.
`?glob`
Global variable.
`?thdl`
Thread-local variable.
`?comp`
Comptime-known variable/constant. Substitution semantics of a literal. Implies `const`.
`?func`
Function. Registers `symbol` in this block's call graph. Implies `const`.
Clobber
`?memory`
Unspecified memory.
`?status`
Processor status flags.
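The wildcard rules above can be sketched as a small parser for binding strings. This is Python, not Zig, and only a model: `Location`, `WILDCARDS`, and `parse_location` are hypothetical names, treating several `:option`s as colon-separated is an assumption, and the sketch does not model options that change a binding's kind (such as `?tmp:all`).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Wildcard names from the list above, grouped by binding kind.
WILDCARDS = {
    "operand": {"reg", "tmp", "sav", "lit", "psh", "pop", "stg"},
    "symbol": {"locl", "argm", "glob", "thdl", "comp", "func"},
    "clobber": {"memory", "status"},
}


@dataclass
class Location:
    text: str                    # literal text, or the wildcard name without "?"
    wildcard: bool = False
    kind: Optional[str] = None   # "operand", "symbol" or "clobber" for wildcards
    options: List[str] = field(default_factory=list)


def parse_location(binding: str) -> Location:
    # A leading "??" is an escaped literal "?", not a wildcard.
    if binding.startswith("?") and not binding.startswith("??"):
        name, *options = binding[1:].split(":")
        for kind, names in WILDCARDS.items():
            if name in names:
                return Location(name, True, kind, options)
        raise ValueError(f"unknown wildcard: ?{name}")
    # Plain literal such as "eax"; undo the "??" escape.
    return Location(binding.replace("??", "?"))
```

With this model, `parse_location("?reg:abcd")` yields an operand wildcard `reg` carrying the option `abcd`, while `parse_location("eax")` is a plain literal location.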
Body
The assembly code itself, as a comptime string. For symbol scoping purposes, treated as a separate file, i.e. declared symbols do not leak to the rest of the program and elsewhere-defined symbols are not visible except through bindings. May be omitted if only values of registers are desired.
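Inside the body, bound names are written as `?(name)`, optionally with a `:option`, and `??` stands for a literal `?`. A rough Python sketch of that substitution step is given here as an illustration only. `resolve_body`, `bound`, and `apply_option` are hypothetical names, and in the proposal this resolution would happen inside the compiler, not through string processing.

```python
import re
from typing import Callable, Dict, Optional

# Matches the escape "??", or a reference ?(name) / ?(name:option).
_REF = re.compile(r"\?\?|\?\((\w+)(?::(\w+))?\)")


def resolve_body(
    body: str,
    bound: Dict[str, str],
    apply_option: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Replace ?(name) references with the locations chosen for the bindings."""

    def substitute(match):
        if match.group(0) == "??":
            return "?"
        name, option = match.group(1), match.group(2)
        if name not in bound:
            # The proposal makes an unbound name a compile error.
            raise KeyError(f"unbound name in asm body: {name}")
        if option is None:
            return bound[name]
        if apply_option is None:
            raise NotImplementedError("options are architecture-dependent")
        return apply_option(bound[name], option)

    return _REF.sub(substitute, body)


# Hypothetical bindings that resolved `dst` to eax and `src` to ebx.
print(resolve_body("mov ?(dst), ?(src)", {"dst": "eax", "src": "ebx"}))
```

Options are passed to a caller-supplied function because their meaning depends on the target architecture; the sketch does not guess at it.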
Bound operands and symbols are accessed within the block by enclosing their names in `?()`. This syntax was chosen as the `?` character is far less commonly used in assembly languages than `%`, and pairs well with the theme of an unknown resolution -- additionally, parentheses are less likely to have semantic significance than square brackets, so the code is easier to scan. Accessing an unbound name in this manner is a compile error. As with wildcards, names may be modified with `:options`, for instance `?(r:hi)` to access the high byte of register `r`, or `?(i:x)` to print integer `i` in hexadecimal. A literal `?` is escaped with another one, as regular escaping is not possible in multiline strings.
Post Expression
An expression evaluated after the body, using the final values of all bindings. Becomes the value of the whole block. Preceded by a colon. May be omitted without ambiguity, in which case the return type is `void`. This permits us to return as many values as we like, in whatever format and location we choose. Moreover, we don't have to specify the exact lifetimes of all of our inputs and outputs to appease the optimiser -- we can decide for ourselves how our values are allocated and consumed.
Examples
Simple, bindless assembly is simple:
More involved assembly is logical:
A simple bare-metal OS entry point on RISC-V:
POSIX startcode (adapted from `lib/std/start.zig`):