compiler: Fix "power alignment" problems on AIX #142310

Open: wants to merge 8 commits into base: master


workingjubilee
Member

Hello, this is a localized, rustc-focused fix for the AIX "power alignment" issue. It does not fix upstream because I expect that to be a more annoying experience, and such a fix would take some time to propagate into a release. I mostly wish to remove the "power alignment" lint so we do not have to work it into updates to the "improper-ctypes" lint, but it feels wrong to do so without actually fixing the codegen issue, especially since it's such a small change.

cc @daltenty @gilamn5tr @mustartt @amy-kwan Can you confirm whether this change allows rustc to do FFI correctly with C code compiled using the default AIX ABI?

@rustbot
Collaborator

rustbot commented Jun 10, 2025

r? @wesleywiser

rustbot has assigned @wesleywiser.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 10, 2025
@rustbot
Collaborator

rustbot commented Jun 10, 2025

These commits modify compiler targets.
(See the Target Tier Policy.)

This lint was based on a false premise: LLVM lacks a correct datalayout,
but rustc assumed that the AIX datalayout was correct.
@RalfJung
Member

It does not fix upstream

I assume you are referring to llvm/llvm-project#133599 here? Always good to leave some cross-references. :)

Please also add comments in the code referencing that (both in the aix target triple, and the special exception for the data layout consistency check). Right now, just looking at the code after applying this diff, one would have a very hard time figuring out what happens.

@workingjubilee
Member Author

aye aye capitan

@RalfJung
Member

As discussed on IRLO, as-is this patch would make some types more correct and others less correct. Hard to say whether that's overall a net positive...

@workingjubilee
Member Author

It does fix every type that doesn't have f64 as its recursive-first-element, so I'm feeling pretty good about it alone.

@RalfJung
Member

RalfJung commented Jun 12, 2025 via email

@beetrees
Contributor

Not all types which start with f64 would be affected, just those that also need at least 4 bytes of padding at the end of the struct, all of which could easily be fixed by adding _padding: MaybeUninit<u32> at the end of the relevant struct (a lint could be added for this case with a suggestion if desired). Compared to the #112480-like issues that giving f64 an alignment of 8 causes, I think this PR is a definite improvement.
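
The padding workaround described above can be sketched on any host. This is only an illustration under stated assumptions: `#[repr(C, packed(4))]` is used here as a stand-in for a target where `f64` has an ABI alignment of 4, and the struct names `Unpadded` and `Padded` are hypothetical. It is not how rustc models the AIX target.

```rust
use std::mem::{size_of, MaybeUninit};

// `packed(4)` caps every field's alignment at 4. This merely imitates a
// target where f64 is 4-byte aligned; it is an assumption for illustration.
#[repr(C, packed(4))]
#[allow(dead_code)]
struct Unpadded {
    a: f64,
    b: u32,
}

// Same fields, plus the explicit trailing padding suggested in the thread.
#[repr(C, packed(4))]
#[allow(dead_code)]
struct Padded {
    a: f64,
    b: u32,
    _padding: MaybeUninit<u32>,
}

// Returns (size without padding, size with padding).
fn sizes() -> (usize, usize) {
    (size_of::<Unpadded>(), size_of::<Padded>())
}

fn main() {
    let (unpadded, padded) = sizes();
    println!("unpadded = {}, padded = {}", unpadded, padded);
}
```

With 4-byte-aligned `f64`, the first struct has no tail padding and is 12 bytes; the explicit `_padding` field brings it to 16 bytes, which is the size a layout with 8-byte struct alignment would produce.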

@workingjubilee
Member Author

Are there really more types with f64 in some later field vs the first field?

I think so, when we're considering that it includes all nested aggregates?

Yes, the problem after this patch is now "possible-to-fix-in-bindgen"-tier.

@RalfJung
Member

all of which could easily be fixed by adding _padding: MaybeUninit<u32> at the end of the relevant struct

Ah, that's a good point.

@amy-kwan
Contributor

Hi! Thanks for this patch. I wanted to test this on an AIX machine, but if I try to build it, I get an assertion on the LLVM side:

. . .

Assertion failed: Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n", file  rust/src/llvm-project/llvm/lib/CodeGen/MachineFunction.cpp, line 248, void llvm::MachineFunction::init()()
rustc exited with signal: 6 (SIGABRT)

Did not run successfully: signal: 6 (SIGABRT)
error: could not compile `compiler_builtins` (lib)
. . .

This happens because the datalayout string in LLVM does not match the datalayout string in rustc. If I make the strings match to unblock the build, I see the internal compiler bug message that was added in the patch:

error: internal compiler error: compiler/rustc_codegen_llvm/src/context.rs:223:17: LLVM got fixed, please remove this exception in cg_llvm!

In any case, I tried to work around these issues to test the patch a bit. There are a few concerns from our end:

  • This change is not buildable on AIX as-is without the backend change but also not buildable due to the rustc exception.
  • As mentioned in the previous comments, some structs that start with f64 would require the extra 4 bytes of padding at the end of the struct so that the size is correct.
  • It seems the struct sizing is also affected when we're not using repr(C). For something like the following, this should have a size of 24, but now we would report 20 for this struct:
pub struct Floats {
    a: f64,
    b: u32,
    c: f64,
}

Ideally, we would only want these changes for repr(C) structs and do not want to affect normal Rust structs at all.
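
The 24 versus 20 figures can be reproduced with a small declaration-order layout calculation. This is a sketch, not rustc's layout code: the helper names `c_style_size`, `round_up`, and `floats_size` are hypothetical, and fields are placed in declaration order, which is an assumption that repr(Rust) does not guarantee.

```rust
/// Size of a struct laid out in declaration order (C-style rules):
/// each field goes at the next offset that is a multiple of its alignment,
/// and the total is rounded up to the largest field alignment.
/// `fields` is a list of (size, align) pairs; alignments must be nonzero.
fn c_style_size(fields: &[(usize, usize)]) -> usize {
    let mut offset = 0;
    let mut struct_align = 1;
    for &(size, align) in fields {
        offset = round_up(offset, align);
        offset += size;
        struct_align = struct_align.max(align);
    }
    round_up(offset, struct_align)
}

/// Rounds `n` up to the next multiple of `align`.
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Size of `Floats { a: f64, b: u32, c: f64 }` for a given f64 alignment.
fn floats_size(f64_align: usize) -> usize {
    c_style_size(&[(8, f64_align), (4, 4), (8, f64_align)])
}

fn main() {
    println!("f64 align 8: {}", floats_size(8));
    println!("f64 align 4: {}", floats_size(4));
}
```

With an 8-byte `f64` alignment, `c` is pushed from offset 12 to 16, giving 24 bytes. With a 4-byte alignment, `c` sits at offset 12 and the struct ends at 20.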

FYI @daltenty

@RalfJung
Member

For something like the following, this should have a size of 24, but now we would report 20 for this struct:

That's a Rust layout struct. Why should it have AIX-specific layout? We use the same (undocumented, can-change-any-day) algorithm for repr(Rust) on all targets and we really want to keep it that way for the sake of everyone's sanity.

@workingjubilee
Member Author

This change is not buildable on AIX as-is without the backend change but also not buildable due to the rustc exception.

Ah, sorry. I'll just remove the bug! then.

@workingjubilee
Member Author

@amy-kwan New patch to try out, should be less comical to test.

@workingjubilee
Member Author

Ideally, we would only want these changes for repr(C) structs and do not want to affect normal Rust structs at all.

While I may make further changes to the layout algorithm that may overalign in some cases for the performance reasons you note, the fundamental detail is that it doesn't matter: once any f64 may be underaligned (read: 4, the ABI alignment on AIX), our codegen will notice they all are at that alignment and our reads and writes of that type will be generated with such an alignment annotation for LLVM. LLVM will only be able to upgrade the alignment as an optimization.

That is not an optimization I believe we should ourselves perform on our LLVM IR. This is because it would be extremely fragile, as repr(Rust) structs and repr(C) structs can be nested within each other and everything must still make sense when we do that. Because of this, we should not use the knowledge that we have entered a struct with a certain repr to modify the way we do accesses to types.

This is also partly because any reasoning that we overaligned things depends on specifics of the layout algorithm that we are allowed to undermine. It would be globally embedding a local assumption from another part of the code. So for this:

pub struct Floats {
    a: f64,
    b: u32,
    c: f64,
}

If the true alignment of f64 is 4, then you can neither rely on us choosing this layout:

#[repr(linear)]
pub struct Floats {
    a: f64,
    c: f64,
    b: u32,
}

nor can you rely on us choosing this layout:

#[repr(linear)]
pub struct Floats {
    b: u32,
    a: f64,
    c: f64,
}

We are allowed to pick either actual effective layout. You do not have a "should" you can rely on. And in general neither should we: if we make our code break with our own rules, then we make it harder to update.
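
To make the two candidate orderings concrete, here is a minimal sketch that can run on any host. The assumptions are: `#[repr(C, packed(4))]` stands in for both the `repr(linear)` notation used above and a 4-byte-aligned `f64`, and the type names `FloatsFirst` and `IntFirst` and the function `layouts` are hypothetical.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

// Ordering 1: both f64 fields first, then the u32.
#[repr(C, packed(4))]
#[allow(dead_code)]
struct FloatsFirst {
    a: f64,
    c: f64,
    b: u32,
}

// Ordering 2: the u32 first, then both f64 fields.
#[repr(C, packed(4))]
#[allow(dead_code)]
struct IntFirst {
    b: u32,
    a: f64,
    c: f64,
}

// Returns (size, offset of `b`) for each ordering.
fn layouts() -> ((usize, usize), (usize, usize)) {
    let x = FloatsFirst { a: 0.0, c: 0.0, b: 0 };
    let y = IntFirst { b: 0, a: 0.0, c: 0.0 };
    // `addr_of!` takes a raw field address without creating a reference,
    // which is the permitted way to address fields of a packed struct.
    let x_b = addr_of!(x.b) as usize - addr_of!(x) as usize;
    let y_b = addr_of!(y.b) as usize - addr_of!(y) as usize;
    ((size_of::<FloatsFirst>(), x_b), (size_of::<IntFirst>(), y_b))
}

fn main() {
    println!("{:?}", layouts());
}
```

Both orderings occupy 20 bytes when `f64` only needs 4-byte alignment, but `b` lands at offset 16 in one and offset 0 in the other, so code cannot assume a particular field placement.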

@workingjubilee
Member Author

workingjubilee commented Jun 16, 2025

Also you could probably have built the previous commit by disabling assertions for LLVM but I can understand not wanting to, for, you know, testing-the-patch purposes. :^)
