Skip to content

stage2: Hello, Silicon!\n #7231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 37 commits into from
Nov 27, 2020
Merged

stage2: Hello, Silicon!\n #7231

merged 37 commits into from
Nov 27, 2020

Conversation

kubkon
Copy link
Member

@kubkon kubkon commented Nov 26, 2020

Closes #7103

First of, apologies for the size of the PR, but this is the minimal set of changes to stage2 which ensure that we can generate a valid MachO binary on Apple Silicon. To summarise, here are the main changes this PR brings to the table:

  • It adds adhoc code signing machinery which is now part of the self-hosted toolchain. The machinery resides in std/link/MachO/CodeSignature.zig as it seemed natural to make it part of the MachO linker only.
  • It ditches the use of absolute addressing in favour of RIP-relative on x86_64 and PC-relative on aarch64. This in turn made it possible to rework the global offset table so that it's PIE compatible. I should probably add here that now, the global offset table is now more like a jump table, than a table of addresses. It stores small snippets of code which make it possible to compute the relative address to the symbol we'd like to call/load into a register.
  • When targeting aarch64 from aarch64 macOS, the toolchain now performs a full artefact copy into a temporary filename, and then a rename back into the emitted binary. This is required to make incremental linking work with the latest constraints of the XNU kernel on Apple Silicon. For those lacking context, the latest kernel is actively caching run binaries and any change to the binary/inode that was already cached by the kernel ends up in an immediate SIGKILL. So copy-rename dance is the way to circumvent this.
  • Bonus: stage2 tests targeting aarch64-macos-gnu are now enabled and are passing!

Of course, this PR is just the first small step towards the self-hosted on Intel and Apple Silicon Macs, so naturally many bits in the linker and codegen are in dire need to be rewritten and cleaned up. For example, handling segments and their sizes in MachO. One thing I've made sure here was that the Macs native code signing tools can be used to sign the binaries generated by the self-hosted. For this reason, currently the __LINKEDIT segment has to be rewritten (and compressed) at every incremental linker run. I think this is a small price to pay that should repay us handsomely in the future for ensuring a full compatibility with codesign tool, etc. Another thing that needs some rethinking is the assembly generated to facilitate PIE relative jumps and fixups in the codegen. But as I said, one thing a time. For the moment, let us enjoy the fact that we can now play with stage2 to generate valid MachO binaries on Apple Silicon with full incremental linking support.

@kubkon kubkon added os-macos frontend Tokenization, parsing, AstGen, Sema, and Liveness. arch-x86_64 64-bit x86 arch-aarch64 64-bit ARM labels Nov 26, 2020
@kubkon kubkon requested a review from andrewrk November 26, 2020 10:45
@kubkon kubkon changed the title Stage2 arm macos stage2: Hello, Silicon!\n Nov 26, 2020
@kubkon kubkon added this to the 0.8.0 milestone Nov 26, 2020
src/link.zig Outdated
var tmp_file = try emit.directory.handle.createFile(tmp_file_name, .{ .mode = determineMode(base.options) });
defer tmp_file.close();
const stat = try f.stat();
_ = try f.copyRangeAll(0, tmp_file, 0, stat.size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can clean a lot of code by using Dir.copyFile here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent suggestion, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a8f8d82.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if copyFile with the same source and destination name would do the trick. Under the hood it creates a new file with a random name and then calls `renameat, it could possibly work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're a genius @LemonBoy! Works like a charm, thanks so much for this absolutely brilliant suggestion!

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow! what an enormous effort.

Comment on lines 2586 to -2589
.memory => |addr| {
// The value is in memory at a hard-coded address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comments for the memory tag of the MCValue enum says:

            /// The value is in memory at a hard-coded address.
            /// If the type is a pointer, it means the pointer address is at this memory location.

I don't think it makes sense to put the logic here - instead of making it work differently for PIE targets and violating the doc comments for this enum tag, the MCValue should be populated with a different tag other than memory when doing PIE, at the location that sets the value. You can introduce a new MCValue tag with a special meaning if you need to.

I'd be ok with making this review comment a follow-up issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree with you that this should really be a different tag, I just didn't wanna change too many things at once before we get something working on the Silicon. I'll be more than happy to work on this in a subsequent PR(s), as I'd love to generalise and clean it up!

while (self.pie_fixups.popOrNull()) |fixup| {
const target_addr = fixup.address;
const this_addr = symbol.n_value + fixup.start;
if (self.base.options.target.cpu.arch == .x86_64) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if-else would be better as a switch that had else => unreachable to catch bugs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-aarch64 64-bit ARM arch-x86_64 64-bit x86 frontend Tokenization, parsing, AstGen, Sema, and Liveness. os-macos
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stage2: code signing in self-hosted MachO linker (arm64)
3 participants