-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[BOLT] Optimizing immediate relocation with addends produces incorrect assembly #97937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
After porting BOLT to RISCV some of the relocations were broken on both AArch64 and X86. On AArch64 the example of broken relocations would be GOT, during handling them, we should replace the symbol to __BOLT_got_zero in order to address GOT entry, not the symbol that addresses this entry. This is done further in code, so it is too early to add rel here. On X86 it is a mistake to add relocations without addend. This is the exact problem that is raised on llvm#97937. Due to different code generation I had to use gcc-generated yaml test, since with clang I wasn't able to reproduce problem. Added tests for both architectures and made the problematic condition riscV-specific.
Hello! Thanks for raise an issue, should be fixed by #100890 |
Hi @yota9, thanks for taking a look. We have found that at least on AAarch64 our linker's behavior was not correct and it should not have generated a relocation for an absolute symbol. When tested on a toolchain with the fix here, the relocation was not generated so BOLT and pre-BOLT assembly were both correct. We still think this should be fixed in BOLT as well so it works correctly with older toolchains and had come up with a similar patch to yours. This may explain why you were not able to reproduce it locally if you were using a newer toolchain. |
@s-dag Hi! I agree that in this case using GOT table is excess on the first place. With different options clang/gcc would output different assembly that would have different impact on BOLT. But BOLT patch fixes other "legal" problems too, we still need to apply it, so I think I would be able to push it ASAP when someone from meta team would review it :) |
Hello @yota9,
|
@johndoe2012 Hello! |
Hello @yota9, regarding my previous comment, i was talking about two approaches on the fix:
considering the right behavior is not to generate relocs for abs symbol on aarch64, wouldn't ignoring it be the correct action here? Thanks, |
Hello @johndoe2012! Sorry, at this point I could not tell much without looking at your binary and the problem you observe :( Currently I don't see any problems with my patch but observe some fixed test cases for both aarch64 and x86. So please provide the binary and exact place in it where you observe incorrect behaviour. |
After porting BOLT to RISCV some of the relocations were broken on both AArch64 and X86. On AArch64 the example of broken relocations would be GOT, during handling them, we should replace the symbol to __BOLT_got_zero in order to address GOT entry, not the symbol that addresses this entry. This is done further in code, so it is too early to add rel here. On X86 it is a mistake to add relocations without addend. This is the exact problem that is raised on #97937. Due to different code generation I had to use gcc-generated yaml test, since with clang I wasn't able to reproduce problem. Added tests for both architectures and made the problematic condition riscV-specific.
Hi @yota9, we had to prioritize the dynamic linking issue first but we will try to come up with a simple enough reproducer here again for the relocation of absolute symbols. We're seeing with #100890 (and yes it does fix build id reproducer above), relocation of absolute symbols are still handled incorrectly on Aarch64. |
@s-dag Hi. I need more information of what kind of problems do you see. Better with test, but at least diff of old and new binaries in a place that has a problem |
hello @yota9, terribly sorry we haven't been able to produce a minimal reproducer, but here are some descriptions of the problem we see:
|
Thanks @johndoe2012 ! Looks strange, I think we've got plenty of tests with adr/adrp , not sure why it is happening. If you willing to share the binary I would try to check what is going on there. |
hello @yota9, thanks for the quick response. but sharing the full binary is unlikely. we are attempting a simple reproducer but currently have some difficulty achieving the intended |
@johndoe2012 Please check BOLT with --print-all --print-only= foo_function, and write here e.g. after disassembly, cfg and somewhere in the last pass what would it print out for this ADRP instruction |
hi @yota9, please find following the bolt logs. there are several more passes but the
|
It seems that the adrp relocation was skipped for some strange reason, it should be adrp x0, __metadata_start not 13492224 imm value. It would be interested to know the reason behind it. I suggest you to run with --debug-only=bolt and check the messages after the |
Thanks @yota9, this is really helpful, enabled debug log give us this:
so we know relocs is skipped at https://github.com/llvm/llvm-project/blob/main/bolt/lib/Rewrite/RewriteInstance.cpp#L2670 also
are you suggesting it is expected for bolt to change its value after bolt parsed the original elf, then fixes it by the relocation (suppose it is not skipped in handleRelocation)? |
@johndoe2012 I mean the section where this symbol __metadata_start located, is there a section for this 0x5000000 or it is in the middle of nowhere? Or it might be something really strange like non-allocatable section, basically show me the output of readelf for section and segments :) |
hello @yota9, yes 0x5000000 is indeed out of section. we have the line in linker file, and have linker file line
|
It is segments, you forgot about sections (readelf -S). Anyway it seems to me that it is the binary fault, not BOLTs. Previously indeed the relocation was used before this skip but it was done by accident. The binary expected to be well-formed so all sections, segments must be in-place in order BOLT would be able to work properly. |
it is not in any part of the sections, so we cannot see it from we see your point. we are not aware of such restrictions on out of section use case for absolute symbols. in man page for elf here it mentioned specifically "...An object file does not have sections for these special indices: ...
if bolt is confused by such legitimate use case, we think bolt should be fixed to handle it.
|
Well, AFAICT probably it should be fixed, yes :) Your contribution would be appreciated. |
I ran into an issue running BOLT on a program that accesses a global struct field provided by the linker. When accessing a field of that struct, the optimized binary wipes the addend and instead replaces reads of that field with reads of the struct+0.
Versions of things
I tested with both x86 and aarch64 and managed to reproduce.
Repro
This is a minimal repro on x86.
The following is the C,
print_global.c
I used the default linker script outputted by the
-Wl,-verbose
gcc flag, then inserted thebuild_id_note
at an absolute location right under that firstPROVIDE
inbuild_id.ld
, like so:This is compiled with
Then bolting:
$ llvm-bolt ./print_global -o print_global.bolted --funcs print_build_id BOLT-INFO: Target architecture: x86_64 BOLT-INFO: BOLT version: bb6a4850553dd4140a5bd63187ec1b14d0b731f9 BOLT-INFO: first alloc address is 0x200000 BOLT-INFO: creating new program header table at address 0x800000, offset 0x600000 BOLT-INFO: enabling relocation mode BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified BOLT-INFO: enabling lite mode BOLT-INFO: 0 out of 12 functions in the binary (0.0%) have non-empty execution profile BOLT-INFO: setting _end to 0xa001cc BOLT-INFO: patched build-id (flipped last bit)
The output of the non-bolted binary is:
Yet, the bolted binary produces the wrong result
Of course, the build id is still as expected for the bolted binary:
So this is a case where BOLT leads to incorrect runtime behavior. Inspecting the assembly, we see the root of the problem is that the immediate address for
build_id_note.hash
is missing the addend and thus we print starting from thepad
field:Non-bolted asm:
Bolted asm:
The text was updated successfully, but these errors were encountered: