Native Assembler: Improvements, Tweaks, Enhancements #7561

Open
@ghost

Description

Split out from #2081 (comment). See also #5241 for inline assembly improvements.

Zig Native Assembler

For system interfacing without libc in LLVM-less builds, we will need our own inline assembler. From there, it's not much more work to have a standalone assembler as well. This gives us an opportunity to make some improvements. It would be unwise to stray too far from existing syntax, though, as people will need to port their existing code.

Proposed Changes

  • Intel syntax for x86/64, except directives start with . and symbols end with :
  • scas-style local and relative labels instead of numeric ones: .label:/b .label, _:/b _(+|-)+ (a bare b _ is not allowed, as it would be ambiguous)
  • .pub sym:, .use sym for sharing symbols within a compilation unit
  • .export sym:, .extern ("mod")? sym for sharing symbols between objects
  • No .comm or .global
  • .end takes one or more symbols: .end sym1, sym2 is equivalent to .size sym1, (. - sym1) followed on the next line by .size sym2, (. - sym2); every non-local symbol must be .ended; .end replaces .size
  • Relax all guarantees of relative symbol layout beyond .end boundaries
  • Prepend or replace a symbol with its loader address: 0x8000 pin:, 0xff00: -- only possible at the start of a coherent region (all previous symbols have been .ended); linker will detect clashes/range errors
  • On the Zig side, a use keyword to access .pub symbols within the compilation unit: use const func: fn callconv(.c) (u64, bool) u64;
  • Cull some redundant/historical symbols (flexible)
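The .end to .size equivalence in the list above can be sketched as a simple text rewrite. This is only an illustration in Python, not part of the proposal: the function name desugar_end is hypothetical, and a real assembler would do this on parsed directives rather than on raw lines.

```python
def desugar_end(line):
    """Expand a proposed `.end a, b` line into the equivalent `.size` lines.

    Hypothetical helper: any line that is not an `.end` directive is
    returned unchanged, wrapped in a one-element list.
    """
    parts = line.split(None, 1)  # directive, then the rest of the line
    if not parts or parts[0] != ".end":
        return [line]
    operands = parts[1] if len(parts) > 1 else ""
    symbols = [s.strip() for s in operands.split(",") if s.strip()]
    if not symbols:
        raise ValueError(".end requires at least one symbol")
    # `.` is the current location, so each size is "here minus symbol start".
    return [f".size {sym}, (. - {sym})" for sym in symbols]


if __name__ == "__main__":
    for out in desugar_end(".end sym1, sym2"):
        print(out)
```

Because every expanded .size is measured from the same location, ending several symbols in one directive means they all finish at the same address, whatever their starting points.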

Notes

  • Directives are carefully chosen so as not to clash with existing GNU/LLVM definitions. This way, an LLVM build of Zig can compile both Zig-flavoured and GNU-flavoured asm with no ambiguity.
  • .pub/.use take advantage of Zig's compilation units: .pub symbols are not necessarily exported, and .export symbols are not necessarily public. Symbols from pre-compiled object files cannot be brought in with .use; they must be declared with .extern (see below).
  • .pub symbols populate a single global namespace; the amorphous organisation permitted by assembly means a strictly hierarchical symbol-sharing model would be untenable. Explicit .use at least makes this much more manageable.
  • .extern "mod" provides some primitive namespacing for libraries, as with Zig extern. Without it, using multiple libraries together would require a single global namespace for every symbol in every library on the system. The interpretation of mod is left to the linker, to facilitate versioning of libraries or different library paths; the absence of "mod" is always taken to mean another explicit input object file (i.e. an argument to zig build-(exe|lib)). An unresolved symbol is a compile error.
  • .pub and .export symbols must be declared as such at the symbol definition site, i.e. .pub sym: both declares the symbol sym and marks it public, and there is no way to separate these actions. This makes it possible to locate such a symbol with a simple global text search.
  • .global makes symbols impossible to track down, and .comm glosses over potential naming errors; their functionality is subsumed by .pub/.use.
  • .end was chosen rather than some kind of hierarchical structure or dividing by non-local symbols to allow overlapping of symbols, as well as sequencing:
one:
  ; Some code
two:
.end one
  ; This code comes right after `one` when both are included, though an optimised build need not include both

This presents an interesting edge case: a local label may go out of scope at a non-local symbol while a use of it would still be valid. I'm not aware of a clean solution to this.

  • Prepending a section with a loader address may clash with section declarations in other files, and hence is best left to the build system. Also, a specific loader address typically implies specific symbol addresses as well, and we make no guarantees of symbol layout within sections, so it would only add complication for no benefit. This at least gets us a bit closer to #3206 (Zig-based alternative to linker scripts).
  • I considered a hypothetical @sImport(), but the need for strong typing would have made it untenable. Collecting all public symbols into a namespace wouldn't have helped, for the same reason: since builtins will be required anyway, and the lack of explicit source file dependencies means assembly building will have to be coordinated by build.zig anyway, there is no harm in accessing symbols individually. (Note: this still only applies to symbols within the same compilation unit; .exported symbols from prebuilt objects still come through Zig extern, as usual.)
  • Under this system, there is a way for Zig code to directly use symbols from asm, but not the other way around. Unfortunately, there is no clean way around this: asm's flat symbol model can be expressed inside Zig's hierarchical model without recreating it from scratch, but not vice versa. This means that using Zig symbols from asm requires two compilation steps and the use of export/.extern. This is annoying; however, as Zig is typically the driver of asm and not the other way around, and inline asm with #5241 (Enhancement: New Inline Assembly) makes any structure possible if need be, it is considered acceptable.
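To make the note on .end concrete (overlap, sequencing, and the rule that every non-local symbol must be .ended), here is a small Python model. It is a sketch under assumptions rather than anything the proposal specifies: the event tuples, the byte offsets, and the names symbol_extents and overlaps are all hypothetical.

```python
def symbol_extents(events):
    """Return {name: (start, end)} from ("def" | "end", name, offset) events.

    Hypothetical model: "def" stands for `name:` and "end" for `.end name`.
    Raises ValueError if a symbol is redefined, ended without being open,
    or never ended at all.
    """
    open_symbols = {}
    extents = {}
    for kind, name, offset in events:
        if kind == "def":
            if name in open_symbols or name in extents:
                raise ValueError(f"symbol defined twice: {name}")
            open_symbols[name] = offset
        elif kind == "end":
            if name not in open_symbols:
                raise ValueError(f".end of unknown or already ended symbol: {name}")
            extents[name] = (open_symbols.pop(name), offset)
        else:
            raise ValueError(f"unknown event kind: {kind}")
    if open_symbols:
        raise ValueError(f"symbols never .ended: {sorted(open_symbols)}")
    return extents


def overlaps(a, b):
    """True if half-open extents a and b share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    # Sequencing, as in the one/two example: `.end one` directly after `two:`.
    seq = symbol_extents([("def", "one", 0), ("def", "two", 8),
                          ("end", "one", 8), ("end", "two", 16)])
    print(seq, overlaps(seq["one"], seq["two"]))
    # Overlap: `one` is ended only after some of `two` has been emitted.
    ovl = symbol_extents([("def", "one", 0), ("def", "two", 8),
                          ("end", "one", 12), ("end", "two", 20)])
    print(ovl, overlaps(ovl["one"], ovl["two"]))
```

A strictly nested structure could not express the second case, which is one reason an explicit .end per symbol is more flexible than dividing code at each non-local symbol.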

Labels: proposal (This issue suggests modifications. If it also has the "accepted" label then it is planned.)
