Skip to content

Prevent bindgen from generating #[derive(Debug)] #316


Merged 1 commit into Rust-for-Linux:rust on May 29, 2021

Conversation

@nbdd0121 (Member) commented May 29, 2021

Bindgen generates `#[derive(Debug)]` for all bindings by default. We never use these, and it probably doesn't make sense to use them at all. But as we currently do not have LTO enabled, and all the bindings are exported via `pub mod bindings`, these `Debug` implementations take up a lot of space.

On my config this saves 400 KB of space:

```
   text	   data	    bss	    dec	    hex	filename
6388276	2284324	 292872	8965472	 88cd60	vmlinux (with `Debug`)
6079475	2200836	 292872	8573183	 82d0ff	vmlinux (without `Debug`)
```

Discovered by a tiny static analysis tool that I am writing :)
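For reference, bindgen exposes this as a command-line switch; a minimal sketch of disabling the derive (the header and output file names here are placeholders, and the kernel's actual bindgen invocation may differ):

```shell
# Ask bindgen not to emit #[derive(Debug)] on any generated type.
# wrapper.h and bindings_generated.rs are hypothetical names.
bindgen wrapper.h --no-derive-debug -o bindings_generated.rs
```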


Signed-off-by: Gary Guo <[email protected]>
@ksquirrel (Member)

Review of 211d8ae02986:

  • ✔️ Commit 211d8ae: Looks fine!

@wedsonaf left a comment

Nice!

@alex merged commit 51a2b6c into Rust-for-Linux:rust on May 29, 2021
@TheSven73 (Collaborator)

What's the reason we don't have LTO enabled? I'm not saying it should be, just curious about the tradeoffs or difficulties regarding LTO.

@bjorn3 (Member) commented May 29, 2021

Regular LTO won't work because `--emit obj` is used and rustc doesn't do the linking. Linker-plugin LTO would work, but requires an LTO linker plugin built against the exact same LLVM version as rustc.
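As a rough sketch of what linker-plugin LTO looks like outside the kernel (file names hypothetical; with `-Clinker-plugin-lto`, the object files rustc emits contain LLVM bitcode, so the final link has to go through an LLVM-aware linker such as LLD built from the same LLVM as rustc):

```shell
# rustc emits LLVM bitcode instead of machine code into the objects...
rustc --crate-type=staticlib -O -Clinker-plugin-lto lib.rs -o liblib.a
# ...and clang + LLD then perform the actual cross-language LTO at link time.
clang -flto -fuse-ld=lld -O2 main.c -L. -llib -o prog
```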

@TheSven73 (Collaborator)

Interesting. And the linker (which I'm guessing is GNU ld?) doesn't have a way to tell which .o objects are unused?

@nbdd0121 (Member, Author)

Only one .o file is created per crate.

@TheSven73 (Collaborator)

Sorry, I meant the objects inside the .o, such as functions, global variables, etc.

@nbdd0121 (Member, Author)

Well, it's the same as C: without LTO or `-ffunction-sections` + `--gc-sections`, all functions with external linkage will be kept.

@TheSven73 (Collaborator)

Ah, I see. And there's no simple way to have rustc generate function sections, which are then gc'ed by ld?

@bjorn3 (Member) commented May 29, 2021

That is the default for rustc, but not for Linux.

@TheSven73 (Collaborator)

Thanks! So you're saying that rustc generates function sections by default, but ld as used by Linux doesn't gc them?

@bjorn3 (Member) commented May 29, 2021

The target specs used by rust-for-linux disable function sections completely AFAIK.

@nbdd0121 (Member, Author) commented May 29, 2021

The kernel's `--gc-sections` support is behind `CONFIG_LD_DEAD_CODE_DATA_ELIMINATION`, which is marked experimental and hidden behind `CONFIG_EXPERT`.
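Concretely, turning it on amounts to a .config fragment like the following (assuming the architecture selects `HAVE_LD_DEAD_CODE_DATA_ELIMINATION`):

```shell
# .config fragment; CONFIG_EXPERT is needed to make the option visible
CONFIG_EXPERT=y
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
```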

@TheSven73 (Collaborator) commented May 30, 2021

Very interesting info, thanks! I noticed that the next kernel version (v5.13-rc3) contains optional LLVM LTO support. The defconfig allows us to select LTO or ThinLTO. Only works for arm64 and x86 though.

Is the plan to simply piggy-back onto this, when we move to the next kernel version? Perhaps @ojeda has some insights too?

Edit: ThinLTO worked, although slowly. Full LTO simply exhausted all the memory on my 16 GB RAM laptop before it could finish. Holy smoke.

Edit 2: ThinLTO kernel size is actually larger than no LTO kernel size on my builds. Seems like there are a few wrinkles to be ironed out upstream still.

@nbdd0121 (Member, Author)

> Edit 2: ThinLTO kernel size is actually larger than no LTO kernel size on my builds. Seems like there are a few wrinkles to be ironed out upstream still.

LTO is usually meant for performance improvement rather than size reduction. It could reduce the size if a lot of functions are unused, but obviously that's not the case for the kernel's C code.

To be able to LTO Rust, you need to add `-Clinker-plugin-lto` to the flags. This still doesn't allow any functions to be removed though, because they are all exported, so you additionally need `CONFIG_TRIM_UNUSED_KSYMS`.
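A sketch of that combination (not a tested configuration; the exact mechanism for passing extra flags to rustc in the kernel build is not shown here, and the variable name is hypothetical):

```shell
# Hypothetical sketch: rustc emits bitcode for the LTO link...
extra_rustflags='-Clinker-plugin-lto'
# ...and the kernel config enables LTO plus trimming of unused exports:
# CONFIG_LTO_CLANG_THIN=y
# CONFIG_TRIM_UNUSED_KSYMS=y
```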

@wedsonaf commented Jun 1, 2021

> Is the plan to simply piggy-back onto this, when we move to the next kernel version? Perhaps @ojeda has some insights too?

FWIW, at Google we enable LTO when building Android kernels.

@TheSven73 (Collaborator)

> FWIW, at Google we enable LTO when building Android kernels.

Thanks! Why is this done?

@wedsonaf commented Jun 1, 2021

> > FWIW, at Google we enable LTO when building Android kernels.
>
> Thanks! Why is this done?

I wasn't directly involved in the work, but AFAIU, LTO improved performance and binary size. It was also required for control-flow integrity (CFI), which itself degrades performance but this degradation is offset by the improvements brought by LTO.

I can confirm that building the kernel takes longer but we're willing to take the compilation time hit if the resulting binary is faster and/or harder to exploit.

@TheSven73 (Collaborator)

I see, that sounds like a good tradeoff.

I'm assuming not all of this stuff made it upstream? Because I noticed that ThinLTO has a bigger binary size compared to no LTO. And full LTO ate all my laptop's memory :) For v5.13-rc4 targeting arm64, anyway.

@wedsonaf commented Jun 1, 2021

> I'm assuming not all of this stuff made it upstream? Because I noticed that ThinLTO has a bigger binary size compared to no LTO. And full LTO ate all my laptop's memory :) For v5.13-rc4 targeting arm64, anyway.

I'm not sure; it could be differences in other configurations too. I know the goal is to have everything upstream, but I'm not sure if we're there yet. The kernels are all publicly available though; some details (including a link to the git repo) are available here:
https://source.android.com/devices/architecture/kernel/generic-kernel-image

@ojeda (Member) commented Jun 1, 2021

> I wasn't directly involved in the work, but AFAIU, LTO improved performance and binary size. It was also required for control-flow integrity (CFI), which itself degrades performance but this degradation is offset by the improvements brought by LTO.

Related: LWN had an article on CFI a few days ago: https://lwn.net/Articles/856514/

metaspace pushed a commit to metaspace/linux that referenced this pull request Apr 16, 2024
In case when is64 == 1 in emit(A64_REV32(is64, dst, dst), ctx), the generated insn reverses the byte order of both the high and low 32-bit words, resulting in an incorrect swap as indicated by the jit test:

[ 9757.262607] test_bpf: #312 BSWAP 16: 0x0123456789abcdef -> 0xefcd jited:1 8 PASS
[ 9757.264435] test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
[ 9757.266260] test_bpf: #314 BSWAP 64: 0x0123456789abcdef -> 0x67452301 jited:1 8 PASS
[ 9757.268000] test_bpf: #315 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 jited:1 8 PASS
[ 9757.269686] test_bpf: #316 BSWAP 16: 0xfedcba9876543210 -> 0x1032 jited:1 8 PASS
[ 9757.271380] test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
[ 9757.273022] test_bpf: #318 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe jited:1 7 PASS
[ 9757.274721] test_bpf: #319 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 jited:1 9 PASS

Fix this by forcing the 32-bit variant of rev32.

Fixes: 1104247 ("bpf, arm64: Support unconditional bswap")
Signed-off-by: Artem Savkov <[email protected]>
Tested-by: Puranjay Mohan <[email protected]>
Acked-by: Puranjay Mohan <[email protected]>
Acked-by: Xu Kuohai <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>