Split out from #2081 (comment). See also #5241 for inline assembly improvements.
Zig Native Assembler
For system interfacing without libc in LLVM-less builds, we will need our own inline assembler. From there, it's not much more work to have a standalone assembler as well. This presents us with an opportunity to make some improvements. It won't be wise to stray too far, though, as people will need to port their existing code.
Proposed Changes
- Intel syntax for x86/64, except directives start with `.` and symbols end with `:`
- scas-style local and relative labels instead of numerics: `.label:`/`b .label`, `_:`/`b _(+|-)+` (no `b _` -- ambiguous)
- `.pub sym:`, `.use sym` for sharing symbols within a compilation unit
- `.export sym:`, `.extern ("mod")? sym` for sharing symbols between objects
- No `.comm` or `.global`
- `.end` takes a symbol(s): `.end sym1, sym2` == `.size sym1, (. - sym1) [\n] .size sym2, (. - sym2)`; all non-local symbols must be `.end`ed; replaces `.size`
- Relax all guarantees of relative symbol layout beyond `.end` boundaries
- Prepend or replace a symbol with its loader address: `0x8000 pin:`, `0xff00:` -- only possible at the start of a coherent region (all previous symbols have been `.end`ed); linker will detect clashes/range errors
- On the Zig side, `use` keyword, to access `.pub` symbols within the compilation unit: `use const func: fn callconv(.c) (u64, bool) u64;`
- Cull some redundant/historical symbols (flexible)
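The `.end` equivalence listed above can be illustrated with a small Python sketch. It is only a model of the stated rewrite (`.end sym1, sym2` becoming one `.size` line per symbol), not the assembler's implementation; the function name `expand_end` and the line-based string handling are assumptions made for illustration.

```python
def expand_end(line):
    """Rewrite one `.end a, b` directive as the equivalent `.size` lines.

    Hypothetical helper: any line that is not an `.end` directive is
    returned unchanged, wrapped in a single-element list.
    """
    stripped = line.strip()
    if not stripped.startswith(".end"):
        return [line]
    rest = stripped[len(".end"):]
    if rest and not rest[0].isspace():
        # A longer directive name that merely begins with `.end`.
        return [line]
    symbols = [s.strip() for s in rest.split(",") if s.strip()]
    if not symbols:
        raise ValueError("`.end` needs at least one symbol")
    # One `.size` per symbol, each measured from the symbol to the current location.
    return [".size {0}, (. - {0})".format(sym) for sym in symbols]


if __name__ == "__main__":
    for out in expand_end(".end sym1, sym2"):
        print(out)
```

Because each symbol's size is measured independently from its own start, two symbols closed by the same `.end` may overlap, which is the property the notes below rely on.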
Notes
- Directives are carefully chosen so as not to clash with existing GNU/LLVM definitions. This way, an LLVM build of Zig can compile both Zig-flavoured and GNU-flavoured asm with no ambiguity.
- `.pub`/`.use` take advantage of Zig's compilation units: `.pub` symbols are not necessarily exported, and `.export` symbols are not necessarily public. Symbols from pre-compiled object files cannot be `.use`d; they must be `.extern`al (see below).
- `.pub` symbols populate a single global namespace; the amorphous organisation permitted by assembly means a strictly hierarchical symbol-sharing model would be untenable. Explicit `.use` at least makes this much more manageable.
- `.extern "mod"` provides some primitive namespacing for libraries, as with Zig `extern` -- without this, making the use of multiple libraries tenable requires a single global namespace for every symbol in every library on the system. The interpretation of `mod` is left to the linker, to facilitate versioning of libraries or different library paths; lack of `"mod"` is always taken to mean another explicit input object file (i.e. an argument to `zig build-(exe|lib)`). An unresolved symbol is a compile error.
- `.pub`lic and `.export`ed symbols must be declared as such at the symbol definition site, i.e. `.pub sym:` both declares the symbol `sym` and marks it public, and there is no way to separate these actions. This facilitates locating such a symbol by a simple global text search.
- `.global` makes symbols impossible to track down, `.common` glosses over potential naming errors; their functionality is subsumed by `.pub`/`.use`.
- `.end` was chosen rather than some kind of hierarchical structure or dividing by non-local symbols to allow overlapping of symbols, as well as sequencing:
    one:
    ; Some code
    two:
    .end one
    ; This code comes right after `one`, if both are included, but both need not be if optimised
This presents an interesting edge case: a local label may be dropped by a non-local symbol while its use would still be valid. I'm not aware of a clean solution to this.
- Prepending a section with a loader address may clash with section declarations in other files, and hence is best left to the build system; also, a specific loader address typically implies specific symbol addresses as well, and we make no guarantees of symbol layout within sections, so it would just be more complication for no reason. This at least gets us a bit closer to #3206 (Zig-based alternative to linker scripts).
- I considered a hypothetical `@sImport()`, but the need for strong typing would have made it untenable. Collecting all public symbols into a namespace wouldn't have helped, for the same reason: since builtins will be required anyway, and the lack of explicit source file dependencies means assembly building will have to be coordinated by build.zig anyway, there is no harm in accessing symbols individually. (Note: this still only applies to symbols within the same compilation unit; `.export`ed symbols from prebuilt objects still come through Zig `extern`, as usual.)
- Under this system, there is a way for Zig code to directly use symbols from asm, but not the other way around. Unfortunately, there is no clean way around this: asm's flat symbol model can be expressed inside Zig's hierarchical model without recreating it from scratch, but not vice versa. This means that using Zig symbols from asm requires two compilation steps, and use of `export`/`.extern`. This is annoying; however, as Zig is typically the driver of asm and not the other way around, and inline asm with #5241 (Enhancement: New Inline Assembly) makes any structure possible if need be, it is considered acceptable.
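As a rough illustration of the symbol-sharing rules in the notes above, the following Python sketch models the two lookup paths: `.use` sees only `.pub` symbols of the same compilation unit, while `.extern` looks either in a named module or, when no `"mod"` is given, in the explicit input object files. All names here (`resolve_use`, `resolve_extern`, `UnresolvedSymbol`) are hypothetical, and representing the linker's interpretation of `mod` as a plain dictionary is an assumption; the proposal leaves that interpretation to the linker.

```python
class UnresolvedSymbol(Exception):
    """Stands in for the compile error raised on an unresolved symbol."""


def resolve_use(sym, unit_pub):
    """`.use sym`: only `.pub` symbols of the same compilation unit qualify."""
    if sym not in unit_pub:
        raise UnresolvedSymbol(
            ".use {0}: no `.pub {0}:` in this compilation unit".format(sym))
    return ("unit", sym)


def resolve_extern(sym, input_objects, libraries, mod=None):
    """`.extern ("mod")? sym`: named module if given, else explicit input objects.

    input_objects: object file name -> set of `.export`ed symbols (assumed shape)
    libraries:     module name      -> set of exported symbols (assumed shape)
    """
    if mod is not None:
        if sym in libraries.get(mod, set()):
            return (mod, sym)
        raise UnresolvedSymbol('.extern "{0}" {1}: not found'.format(mod, sym))
    for name, exported in input_objects.items():
        if sym in exported:
            return (name, sym)
    raise UnresolvedSymbol(".extern {0}: not found in any input object".format(sym))


if __name__ == "__main__":
    unit_pub = {"memcpy_fast"}                 # defined via `.pub memcpy_fast:`
    input_objects = {"crt0.o": {"_start"}}     # pre-compiled object on the command line
    libraries = {"c": {"write"}}               # what the linker maps "c" to (assumed)
    print(resolve_use("memcpy_fast", unit_pub))
    print(resolve_extern("_start", input_objects, libraries))
    print(resolve_extern("write", input_objects, libraries, mod="c"))
```

Keeping the two paths as separate functions mirrors the rule that a symbol from a pre-compiled object can never be reached through `.use`, even when it has been `.export`ed.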