Description
This is a meta issue for tracking progress on code signing of binaries generated with the self-hosted MachO linker targeting arm64 Macs. I also hope to disseminate knowledge about the signing process used and enforced by Apple on the latest arm64 platform.
Background
With the release of Apple Silicon and macOS 11 Big Sur, Apple now enforces code signing of binaries even at the debugging stage. In practice, if users want to build an app or binary for local use or testing, the MachO binary is expected to pass strict validation and carry an ad hoc code signature. There is a pretty good Stack Exchange answer on the topic as well.
My understanding is that an ad hoc signed binary can only be used on the machine it was originally built on; however, this requires further investigation.
What does all of this mean for Zig?
Cross-compilation to arm64 Macs will become tricky due to the additional code signing requirement. So will distribution of Zig binaries; for the latter, however, I assume the standard code signing process required by Apple on other platforms (iOS, watchOS, tvOS) will be used, that is, obtaining a developer identity and certificate, and signing the binary with it. @jedisct1 pointed me to some nice existing OSS solutions that do this and were not developed by Apple (see for instance gon), which hints at the possibility of having a similar solution written in Zig but hosted as a separate, preferably community-driven project. (Any takers? 😁)
What about local debug builds? @andrewrk suggested we investigate whether there is an exception in the kernel that at least permits unsigned binaries to run under Apple's debugger; however, this is not the case. No code signature means an immediate SIGKILL (signal 9), even for binaries built from source by the user on that very Mac.
With this in mind, what I've been trying hard to understand and explore is how we should tweak the self-hosted MachO linker so that it generates a valid ad hoc code signature. My plan is for this research effort to proceed in the following 3 steps:
1. Get the generated MachO binary code signed with codesign -s - binary, which generates an ad hoc signature and allows the binary to run locally.
2. Figure out if we want the necessary changes to the linker to be included in the codebase by default, or
3. Be lucky enough and figure out how to replicate the output of codesign -s - in the self-hosted linker.
Depending on progress, we'd either go immediately from 1 -> 3, or 1 -> 2 and optionally -> 3.
The story so far...
Easy stuff first. The code signature is stored inside the __LINKEDIT segment at an offset pointed to by the LC_CODE_SIGNATURE load command. The load command itself is a linkedit_data_command.
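For reference, here is a minimal sketch of that load command in Zig. The field names and the 0x1d constant follow Apple's mach-o/loader.h as I understand it; the helper isCodeSignatureCmd is a hypothetical name used only for illustration and is not part of the linker.

```zig
const std = @import("std");

/// Load command type for the code signature (value from mach-o/loader.h).
const LC_CODE_SIGNATURE: u32 = 0x1d;

/// Mirrors `struct linkedit_data_command`: a generic pointer to a blob
/// of data living inside the __LINKEDIT segment.
const linkedit_data_command = extern struct {
    cmd: u32, // LC_CODE_SIGNATURE for the signature blob
    cmdsize: u32, // size of this load command in bytes
    dataoff: u32, // file offset of the data in __LINKEDIT
    datasize: u32, // size of the data in bytes
};

comptime {
    // Four u32 fields, no padding.
    std.debug.assert(@sizeOf(linkedit_data_command) == 16);
}

/// Hypothetical helper: true if `lc` describes the code signature.
fn isCodeSignatureCmd(lc: linkedit_data_command) bool {
    return lc.cmd == LC_CODE_SIGNATURE and
        lc.cmdsize == @sizeOf(linkedit_data_command);
}
```

The same struct shape is shared by other __LINKEDIT load commands, so only the cmd field distinguishes the signature entry.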
The structure that's embedded in the binary is a bit more cryptic and complicated. I couldn't find much information about it; however, from browsing Apple's sources, I managed to come up with a partial parser and included it in the draft helper project Zacho#64f0a0d. This is based largely on SecurityTool/codesign.c. In fact, this tool pointed out that the LC_VERSION_MIN_MACOSX load command might not be optional after all, since its presence or absence has a direct effect on the code signature version generated and embedded in the binary by the codesign utility.
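To give a flavour of what such a parser starts with, here is a rough sketch that reads only the outermost header of the embedded signature. It assumes the layout I recall from Apple's open-source cs_blobs.h: a "SuperBlob" with a big-endian magic (0xfade0cc0), total length, and count, followed by count pairs of (type, offset). The names SuperBlobHeader, readBig32 and parseSuperBlobHeader are hypothetical, and this is not the code in Zacho.

```zig
const std = @import("std");

/// Magic of the embedded signature SuperBlob (assumed from cs_blobs.h).
const CSMAGIC_EMBEDDED_SIGNATURE: u32 = 0xfade0cc0;

const SuperBlobHeader = struct {
    magic: u32,
    length: u32, // total length of the SuperBlob in bytes
    count: u32, // number of (type, offset) index entries that follow
};

/// Code signature fields are stored big-endian regardless of the target,
/// so the bytes are assembled by hand rather than read natively.
fn readBig32(bytes: []const u8, offset: usize) u32 {
    return (@as(u32, bytes[offset]) << 24) |
        (@as(u32, bytes[offset + 1]) << 16) |
        (@as(u32, bytes[offset + 2]) << 8) |
        @as(u32, bytes[offset + 3]);
}

/// Hypothetical helper: parses the 12-byte header at the start of the data
/// pointed to by LC_CODE_SIGNATURE. Returns null if it does not look valid.
fn parseSuperBlobHeader(blob: []const u8) ?SuperBlobHeader {
    if (blob.len < 12) return null;
    const header = SuperBlobHeader{
        .magic = readBig32(blob, 0),
        .length = readBig32(blob, 4),
        .count = readBig32(blob, 8),
    };
    if (header.magic != CSMAGIC_EMBEDDED_SIGNATURE) return null;
    // The blob may be padded, but it can never be shorter than `length`.
    if (header.length > blob.len) return null;
    // The index entries (8 bytes each) must fit inside the SuperBlob.
    if (12 + @as(u64, header.count) * 8 > header.length) return null;
    return header;
}
```

Each index entry then points at a nested blob (the code directory, requirements, and so on), which is where the real complexity lives.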
Now onto the trickier stuff. It turns out that codesign performs strict validation of the MachO binary before signing it, and if the binary doesn't conform, it will either refuse to sign or generate garbage. I have yet to work out the full set of requirements; however, I already know that the sections within the __LINKEDIT segment cannot have "holes" between them, i.e., the offset of each should equal the offset + size of the preceding section. This complicates things for us since we specifically want to leave some gaps for easier management of the incremental linking process in the self-hosted linker.
I also think arm64 macOS requires binaries to be position-independent executables (PIEs), which is a big setback for the self-hosted linker since we rely on absolute addresses stored in the __DATA,__got section. We can work this out; however, since it is somewhat tangential to the code signature issue, I believe it is worthwhile to get a simple "exit syscall" binary working first. Such a binary will not contain any function calls (or debugging info), and thus will not require any GOT indirection:
export fn _start() noreturn {
    // exit(0): the syscall number (1) goes in x16, the exit status in x0.
    asm volatile ("svc #0x80"
        :
        : [number] "{x16}" (1),
          [arg1] "{x0}" (0)
        : "memory"
    );
    unreachable;
}
And this is what I'm currently working on. You can track the progress in kubkon/zig.git#stage2-arm-macos.
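Coming back to the "no holes" requirement on __LINKEDIT mentioned earlier, the check I have in mind can be sketched as follows. This is an illustration under my current understanding of what codesign validates, not code from the linker; LinkeditBlob and findFirstHole are hypothetical names.

```zig
const std = @import("std");

/// One blob of data inside __LINKEDIT (symbol table, string table, ...).
/// Hypothetical type for illustration; not taken from the linker's source.
const LinkeditBlob = struct {
    fileoff: u64,
    size: u64,
};

/// Returns the index of the first blob that does not start exactly where
/// the previous one ends (a gap or an overlap), or null if the blobs are
/// tightly packed. Assumes `blobs` is already sorted by file offset.
fn findFirstHole(blobs: []const LinkeditBlob) ?usize {
    var i: usize = 1;
    while (i < blobs.len) : (i += 1) {
        const prev = blobs[i - 1];
        if (blobs[i].fileoff != prev.fileoff + prev.size) return i;
    }
    return null;
}
```

A linker that pads blobs for incremental updates could run a check like this right before emitting the signature, and compact the segment whenever it reports a hole.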