llvm-strip --strip-debug on riscv64 produces unusually large binaries #89524

Open
q66 opened this issue Apr 21, 2024 · 9 comments

Comments

@q66
Contributor

q66 commented Apr 21, 2024

I started noticing that binaries in my distribution come out roughly 3.5 times larger than they should be on riscv64. We compile with `-g2` by default and then process everything with `llvm-strip --strip-debug`. On all other architectures (x86_64, aarch64, ppc64le, ppc64) things come out more or less the same as before.

The set of flags used does not matter, other than the debug level. Dropping the debug level to `-g0` produces small binaries. Using `strip` without arguments likewise produces small binaries; `--strip-debug` does not, however.

For instance, a build of the Lua 5.4 package now has an installed size of 3.5MB instead of 1MB on riscv64. It seems to apply to all packages in general.

github-actions bot added the `clang` label (Clang issues not falling into any other category) Apr 21, 2024
EugeneZelenko added the `tools:llvm-objcopy/strip` label and removed the `clang` label Apr 21, 2024
@llvmbot
Member

llvmbot commented Apr 21, 2024

@llvm/issue-subscribers-tools-llvm-objcopy-strip


@q66
Contributor Author

q66 commented Apr 21, 2024

More information: it does not seem to be related to the tools. Using binutils `strip` with `--strip-debug` exhibits the same behavior and produces the same bloated binaries.

@q66
Contributor Author

q66 commented Apr 21, 2024

here is readelf -a for an unstripped binary: https://0x0.st/Xo3r.txt

after processing with --strip-debug: https://0x0.st/Xo3s.txt

after processing with strip with no arguments: https://0x0.st/Xo3z.txt

q66 added a commit to chimera-linux/cports that referenced this issue Apr 21, 2024
Since clang 18 we get unstrippable junk in binaries when building
with debuginfo, inflating stripped binaries roughly 3.5x on avg,
so drop debug until this is solved.

Ref llvm/llvm-project#89524
q66 changed the title from "llvm-strip --strip-debug on riscv64 produces unusually large binaries since llvm/clang 18" to "llvm-strip --strip-debug on riscv64 produces unusually large binaries" Apr 21, 2024
@q66
Contributor Author

q66 commented Apr 21, 2024

it seems the issue has been present for much longer, actually; this is not an 18 regression

@jh7370
Collaborator

jh7370 commented Apr 22, 2024

I doubt that this is an llvm-objcopy/strip issue, given that using GNU strip produces the same output. I think, if there is an issue, it's much more likely to come from earlier in the pipeline, e.g. the assembler or linker. I'd need to compare the readelf output you've provided with that of a "normal" case, e.g. x86, to see, but my suspicion is that the cause is the many, many unnamed STT_NOTYPE local symbols in the output: llvm-strip --strip-debug would do nothing with those. However, when it is run without arguments, it removes the symbol table, so those symbols will disappear completely, removing any impact they have on the final binary size.
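One way to check this suspicion is to count the unnamed `NOTYPE` local symbols in the symbol table and estimate what they cost. The sketch below is not from the thread: it assumes GNU-style `readelf -s` output (the column layout used by both `readelf` and `llvm-readelf`), and the sample table is hand-written for illustration rather than taken from the linked dumps. The function name `count_unnamed_locals` is hypothetical. Each `Elf64_Sym` entry occupies 24 bytes in `.symtab`, so the count gives a rough size estimate for the symbol table overhead alone.

```python
import re

ELF64_SYM_SIZE = 24  # sizeof(Elf64_Sym) in bytes on a 64-bit ELF


def count_unnamed_locals(readelf_symbols: str) -> int:
    """Count defined NOTYPE LOCAL symbols with an empty name.

    Expects the text printed by `readelf -s` in GNU style, where a row is:
    Num: Value Size Type Bind Vis Ndx [Name]
    """
    count = 0
    for line in readelf_symbols.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a numbered symbol row.
        if len(fields) < 7 or not re.fullmatch(r"\d+:", fields[0]):
            continue
        _, _, _, sym_type, bind, _, ndx, *name = fields
        # The mandatory null symbol (index 0) is also unnamed; it has Ndx UND.
        if sym_type == "NOTYPE" and bind == "LOCAL" and ndx != "UND" and not name:
            count += 1
    return count


# Hypothetical, hand-written sample (assumption: not real data from the issue).
SAMPLE = "\n".join([
    "Symbol table '.symtab' contains 5 entries:",
    "   Num:    Value          Size Type    Bind   Vis      Ndx Name",
    "     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND",
    "     1: 0000000000011a2c     0 NOTYPE  LOCAL  DEFAULT   12",
    "     2: 0000000000011a40     0 NOTYPE  LOCAL  DEFAULT   12",
    "     3: 0000000000011a2c    52 FUNC    GLOBAL DEFAULT   12 main",
    "     4: 0000000000011a60     0 NOTYPE  LOCAL  DEFAULT   12 helper_label",
])

unnamed = count_unnamed_locals(SAMPLE)
print(unnamed, "unnamed locals,", unnamed * ELF64_SYM_SIZE, "bytes of .symtab")
```

To try it on a real binary, the output of `llvm-readelf -s <file>` could be passed to `count_unnamed_locals` in place of `SAMPLE`.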

@llvmbot
Copy link
Member

llvmbot commented Apr 22, 2024

@llvm/issue-subscribers-backend-risc-v


@jh7370
Collaborator

jh7370 commented Apr 22, 2024

I've added the RISC-V label, since I reckon that it's in this area that any issue will be present.

@q66
Contributor Author

q66 commented Apr 22, 2024

yes, i also suspect all these NOTYPE local symbols are the issue

@MaskRay
Member

MaskRay commented Apr 23, 2024

I agree that this is a RISC-V issue rather than an llvm-objcopy issue. These empty-name symbols are generated for assembler directives related to `.eh_frame`/`.debug_line`. gas uses a fake label name `.L0`, which will be discarded by ld/objcopy `--discard-locals`.

I created #89693 to match gas. I was aware of the behavior difference but did not think hard about the size impact where ld/objcopy `-X` is concerned.

For distributions, `--strip-unneeded` might be handy if you don't need `.symtab`, and `--strip-all` might be useful if you also don't need other non-SHF_ALLOC sections like `.comment`.

As a workaround, you can apply `llvm-objcopy --strip-symbol=''` to executables/DSOs if they are not linked with `-Wl,--emit-relocs`. If `--emit-relocs` is used, the option would likely lead to errors: "not stripping symbol '' because it is named in a relocation".
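For a packaging script, the workaround above could be wrapped roughly as follows. This is a minimal sketch under stated assumptions, not something posted in the thread: the helper name `strip_empty_name_symbols` and its `run` parameter are hypothetical, and `llvm-objcopy` is assumed to be on `PATH`. The only tool option used is the `--strip-symbol=''` form quoted above; with a single file argument, `llvm-objcopy` modifies that file in place.

```python
import subprocess


def strip_empty_name_symbols(path, objcopy="llvm-objcopy", run=subprocess.run):
    """Remove empty-name symbols from one executable or DSO, in place.

    `run` is injectable (hypothetical parameter) so the command line can be
    inspected without invoking the real tool.
    """
    # The shell spelling --strip-symbol='' reaches the tool as "--strip-symbol="
    # followed by an empty value, so no quoting is needed in an argv list.
    argv = [objcopy, "--strip-symbol=", path]
    # check=True raises CalledProcessError if the tool exits non-zero, which is
    # the likely outcome for files linked with -Wl,--emit-relocs.
    return run(argv, check=True)


# Example use (commented out so importing this file has no side effects):
# strip_empty_name_symbols("usr/bin/lua5.4")
```

Letting the error propagate is a deliberate choice here: a file that still carries relocations against the empty-name symbols is better reported than silently left unprocessed.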

5 participants