Hack together inline-always-overrides #141055
@bors try |
@cbiffle pointed out to me that sometimes when size-optimizing, you really want a `#[inline(always)]` on some function in a dependency or the standard library, and it's a chore to add that attribute. It would be neat to give the compiler an override. I don't think this is too hard to hack together, so here's a hack. I don't think the implementation can be more effective than this, so this draft will be useful for collecting information on whether this can even do the thing.
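For context, the manual approach that an override would replace looks roughly like the sketch below. The function `checked_step` is a hypothetical stand-in for a small helper living in a dependency; it is not taken from the standard library or from this PR.

```rust
/// Hypothetical helper standing in for a small function in a dependency.
/// `#[inline(always)]` asks the compiler to inline it at every call site;
/// it is a strong hint rather than a hard guarantee.
#[inline(always)]
fn checked_step(current: u32, end: u32) -> Option<u32> {
    if current < end {
        Some(current + 1)
    } else {
        None
    }
}

/// Caller in the downstream crate. Without the attribute above, a
/// size-optimized build may leave `checked_step` as a separate call.
fn count_steps(start: u32, end: u32) -> u32 {
    let mut steps = 0;
    let mut cur = start;
    while let Some(next) = checked_step(cur, end) {
        cur = next;
        steps += 1;
    }
    steps
}

fn main() {
    println!("{}", count_steps(0, 4));
}
```

Adding the attribute this way requires editing or patching the crate that defines the function, which is the chore the override is meant to avoid.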
☀️ Try build successful - checks-actions |
Alright, I built this locally from the commit bors used above (though I have also now tested it on your original commit). I'm testing using the stage1 build, in case that's significant. Without flags, this reproduces the poor inlining I noted on the keypad:GO firmware (good!) Building with (I then switched to flags in Adding a comma-separated list of strings causes panics building most crates, though it's of course possible I'm holding it wrong. Things that cause panics include
It seems the call to rustc-ice-2025-05-18T11_20_04-56902.txt (Side note: since rustc doesn't fill in the generic parameters in the debug symbols (something I would love to fix!) I'm not sure that the name I get from |
Oh, incidentally, if you'd like to test the repro case I'm using yourself, the firmware is https://github.com/cbiffle/keypad-go-firmware Built at current tip commit (5442d69) using 1.87 or current nightly, the Range iter impl I mentioned above doesn't inline, so I'm trying to fix that. You will need the Steps to reproduce: edit the
|
Yeah, I meant to try this out then realized after you posted your scenario that figuring out the def_path_str for the iterator method is going to be gnarly. So I think I need to also add a way to dump def_path_str values or base this on mangled symbols. |
Btw you can use https://crates.io/crates/rustup-toolchain-install-master to install a try build, just be aware you can't add components after the fact so you may need to use some |
Thanks! Now that I'm set up to build the toolchain again, I'm happy just doing that -- my current laptop takes about 3-4 minutes. It's faster than downloading the trybuild archive on my current internet connection. |
With the latest commit, I can do this:
And the symbols mentioning Range are gone. Actually wait something is afoot... |
I'm not sure why, but setting the flag seems to make every symbol in the keypad-fw executable disappear. But between my debug printing and some local testing on a program built for my host arch, things seem to work. The flag is now a prefix; we can't use the literal entire symbol name because the suffix on each symbol is a hash that incorporates all the flags that were used for the crate. |
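The prefix idea can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `matches_override` is hypothetical, and the symbol strings are made-up examples in the style of legacy Rust mangling, where a trailing `17h<hash>E` segment varies between builds.

```rust
/// Hypothetical helper: returns true if `symbol` starts with any of the
/// configured prefixes. Empty prefixes are ignored so that a stray empty
/// entry does not match every symbol.
fn matches_override(symbol: &str, prefixes: &[&str]) -> bool {
    prefixes
        .iter()
        .any(|prefix| !prefix.is_empty() && symbol.starts_with(prefix))
}

fn main() {
    // Illustrative prefix and symbols; the hash suffixes are invented.
    let prefixes = ["_ZN4core4iter5range"];
    let wanted = "_ZN4core4iter5range4next17h0123456789abcdefE";
    let other = "_ZN4core3fmt5write17h0123456789abcdefE";

    println!("{}", matches_override(wanted, &prefixes));
    println!("{}", matches_override(other, &prefixes));
}
```

Matching on a prefix sidesteps the unstable hash suffix, at the cost of possibly matching more symbols than intended.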
As a symmetric feature, can we ask the compiler to never inline? This is useful for debugging. Maybe it looks like this:
|
Maybe. If you want something to not be inlined for debugging, why doesn't lowering the optimization level suffice? |
I want to keep performance as high as possible. Using |
So do you want to turn off inlining or all optimizations for a function? We have The specific problem Cliff pointed out to me is that a generic function defined in In addition, this PR is not a feature proposal. That would be done by a MCP on rust-lang/compiler-team. I'm doing a single experiment here to establish a use case. I don't want to have runaway discussion here about why various things are or aren't possible in the current compiler architecture or aren't reliable because of the precompiled sysroot. I can try to provide such an explanation in a place that's suited for discussion. PRs are not. |
I just want inlining turned off so it can be tracked. It's probably std or third-party library code and mark I just happened to see this. Sorry for inserting myself into this thread. |
@quininer I actually agree that this would be useful, strategic un-inlining of a routine is something I need much less often, but I have wished I had it once or twice. Personally, applying |
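Without a compiler flag, one existing way to get a distinct, trackable symbol is a local wrapper marked `#[inline(never)]`. The sketch below assumes a hypothetical dependency function `library_checksum`; the wrapper name is also invented for illustration. Note that this keeps the wrapper out of line but does not stop the callee from being inlined into the wrapper.

```rust
/// Hypothetical stand-in for a function defined in a dependency.
fn library_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &byte| acc.wrapping_mul(31).wrapping_add(byte as u32))
}

/// Local wrapper. `#[inline(never)]` asks the compiler to keep this
/// function as a separate symbol, so it shows up in profiles and
/// backtraces. Like other inline attributes, it is a hint.
#[inline(never)]
fn checksum_outlined(data: &[u8]) -> u32 {
    library_checksum(data)
}

fn main() {
    println!("{}", checksum_outlined(&[1, 2, 3]));
}
```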
Building that way also makes the PHDRs disappear, resulting in an ELF file that only contains the debug symbols. This is because overriding Adding the overrides to
...owing to how the range iterator impls are actually written. With these overrides, it does appear to work! As an example of a case where the approach doesn't appear to work, the resulting binary contains some outlined routines for |
This is an unavoidable limitation of the precompiled sysroot. The override can only change inlining of instantiations that are done with the override flag provided. I'm saying instantiations not compilations, because you could get a generic or compiler-builtins is a slightly special case. All of its public functions are That being said, all the symbols that compiler-builtins exports are magic intrinsics known to LLVM. So in theory some optimization can be done without looking at the implementation of the intrinsics. You'd just have to be really sure the optimization will be profitable. |
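The distinction between instantiations and compilations can be illustrated with a small sketch. The module `library` below is a hypothetical stand-in for a precompiled crate; in a real build it would be a separate crate, which a single file cannot reproduce, so the comments describe the general behaviour rather than something this program demonstrates on its own.

```rust
mod library {
    /// Generic: code for `largest::<u16>` is generated where the function
    /// is instantiated with `T = u16`, typically in the downstream crate.
    /// An override passed to that compilation could therefore affect it.
    pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
        let mut iter = items.iter().copied();
        let first = iter.next()?;
        Some(iter.fold(first, |best, x| if x > best { x } else { best }))
    }

    /// Non-generic and without `#[inline]`: generally compiled once, when
    /// the defining crate is built (modulo the compiler's own cross-crate
    /// inlining heuristics). Flags passed later would not change it.
    pub fn largest_u8(items: &[u8]) -> Option<u8> {
        items.iter().copied().max()
    }
}

fn main() {
    // This call instantiates `largest` with `T = u16` in the current crate.
    println!("{:?}", library::largest(&[3u16, 9, 4]));
    // This call uses code that, across crates, would already exist.
    println!("{:?}", library::largest_u8(&[3, 9, 4]));
}
```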
The action on instantiations makes sense to me, thanks for explaining that. From my perspective, the next step is to put some miles on this with different programs and learn why it's wrong / how it fails / etc. I'll poke around my projects directory and see if I've got some suitable tests. (I'm basically ignoring UI issues right now, assuming they could be fixed if we decide the feature is good. e.g. prefix match on mangled symbol names is probably not quite right, but that's also just one line of the patch, and can probably be redone when/if required.) I'm concerned that a lot of cases I'm interested in might run into the "precompiled sysroot" limitation, particularly around builtins. But I am hoping to prove myself wrong there! Thanks for prototyping this! |