Closed
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-codegen: Area: Code generation
A-const-eval: Area: Constant evaluation, covers all const contexts (static, const fn, ...)
C-enhancement: Category: An issue proposing an enhancement or a PR with one.
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
E-needs-test: Call for participation: An issue has been fixed and does not reproduce, but no test has been added.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
The godbolt link: https://godbolt.org/z/dc9o3x
I think this snippet should just return true:
    pub fn compare() -> bool {
        let bytes = 12.5f32.to_ne_bytes();
        bytes == if cfg!(target_endian = "big") {
            [0x41, 0x48, 0x00, 0x00]
        } else {
            [0x00, 0x00, 0x48, 0x41]
        }
    }
The generated asm with opt-level=3:
    example::compare:
        push rax
        mov al, 1
        pop rcx
        ret
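The two byte arrays in the snippet can be checked by hand: 12.5 in IEEE 754 single precision is 0x41480000, so the big-endian layout is 41 48 00 00 and the little-endian layout is the reverse. A minimal sketch of that check follows; the helper name expected_bytes is hypothetical and is not part of the original report.

```rust
/// Hypothetical helper (not from the issue): the byte pattern the snippet
/// expects for 12.5f32 in the target's native byte order.
fn expected_bytes() -> [u8; 4] {
    if cfg!(target_endian = "big") {
        [0x41, 0x48, 0x00, 0x00]
    } else {
        [0x00, 0x00, 0x48, 0x41]
    }
}

/// Same comparison as in the report, split into two steps for readability.
pub fn compare() -> bool {
    12.5f32.to_ne_bytes() == expected_bytes()
}
```

Calling compare() should evaluate to true on both little-endian and big-endian targets, which is why the whole function is a candidate for folding to a constant.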
hellow554
Activity
mati865 commented on Jul 9, 2019
I think there are 2 issues:
- to_ne_bytes is not const fn
- if cfg! is not const fn
See this modified example: https://godbolt.org/z/86WRHF
tesuji commented on Jul 9, 2019
I may be misunderstanding, but in the C version, does clang require similar conditions to optimize the code?
@rustbot modify labels: A-codegen A-const-eval
mati865 commented on Jul 9, 2019
Those are preprocessor directives and are handled at compile time, so their Rust equivalent would be:
https://godbolt.org/z/Ou1bTH
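The linked example is not reproduced here. As a rough sketch of what compile-time selection in the spirit of a C preprocessor directive might look like in Rust, the #[cfg] attribute can remove one of two definitions before type checking. The names EXPECTED and compare_with_cfg_attr are assumptions made for illustration and do not come from the thread.

```rust
// Only one of these two items exists after cfg-stripping, much like an
// #if / #else pair in C. EXPECTED is a hypothetical name.
#[cfg(target_endian = "big")]
const EXPECTED: [u8; 4] = [0x41, 0x48, 0x00, 0x00];
#[cfg(target_endian = "little")]
const EXPECTED: [u8; 4] = [0x00, 0x00, 0x48, 0x41];

/// Compares the native-endian bytes of 12.5f32 against the constant
/// selected at compile time.
pub fn compare_with_cfg_attr() -> bool {
    12.5f32.to_ne_bytes() == EXPECTED
}
```

With this form the right-hand side is a plain constant, so any remaining work for the optimizer is limited to folding to_ne_bytes and the array comparison.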
tesuji commentedon Jul 9, 2019
https://doc.rust-lang.org/nightly/std/macro.cfg.html states that
cfg!
macro:mati865 commentedon Jul 9, 2019
Good point, those two are equal:
https://godbolt.org/z/V7Iuoo
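To illustrate why the two forms come out equal: cfg! expands to a plain true or false literal, so it can be used wherever a boolean constant is allowed. The sketch below assumes a compiler recent enough to accept if inside a const initializer; the names IS_BIG_ENDIAN and EXPECTED_BYTES are hypothetical.

```rust
// cfg! evaluates to a boolean literal at compile time, so it can
// initialize a const directly. Both names here are hypothetical.
const IS_BIG_ENDIAN: bool = cfg!(target_endian = "big");

// `if` in a const initializer needs a reasonably recent compiler.
const EXPECTED_BYTES: [u8; 4] = if IS_BIG_ENDIAN {
    [0x41, 0x48, 0x00, 0x00]
} else {
    [0x00, 0x00, 0x48, 0x41]
};

/// Same check as the original snippet, with the branch moved into a const.
pub fn compare_with_const_if() -> bool {
    12.5f32.to_ne_bytes() == EXPECTED_BYTES
}
```

Unlike #[cfg], both branches are still type-checked here; only the value of the condition is fixed at compile time.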
I don't know enough about const and I could be totally wrong, but I think in the first case if cfg! expands to something like:
It cannot make arr constant because #49146 is not yet complete.
tesuji commented on Jul 10, 2019
Interesting. Looking at the MIR output, I think rustc already has enough information:
MIR
ecstatic-morse commented on Jul 10, 2019
@mati865 Whether a function is marked const at the source level has no effect on whether or not it can be optimized by LLVM. Even if to_ne_bytes was not const, inlining and const propagation could conceivably reduce this to a constant.
ecstatic-morse commented on Oct 23, 2019
Rust now generates the following (almost optimal) assembly for the OP's code:
I would guess that recent improvements to const propagation in the MIR are responsible for this (thanks @wesleywiser!), but it might also have been a change in LLVM. This seems like a decent candidate for a codegen regression test, although an equivalent one might already exist.
tesuji commented on Oct 23, 2019
It is better now. But are the push rax and pop rcx instructions necessary?
ecstatic-morse commented on Oct 23, 2019
I assume it is due to a difference in calling convention, but it does seem kind of silly. You'll need to ask someone more knowledgeable unfortunately.
ecstatic-morse commented on Oct 23, 2019
@lzutao A quick google leads me to believe that the push is used to align the stack to a 16-byte boundary.
tesuji commented on Feb 24, 2021
This issue appears to be fixed on beta. Maybe we need a codegen test for it.
Edit: It seems that the LLVM IR generated by rustc is the same on beta and stable.
Only the optimized asm for x86 is different. Should I write an assembly codegen test or just close this issue?
nagisa commented on May 9, 2021
The issue seems to have re-appeared in master :(
JOE1994 commented on Feb 19, 2024
Both rustc 1.76.0 and rustc nightly (as of Feb 19 2024) generate the following asm (with -C opt-level=3)
Rollup merge of rust-lang#125298 - tesuji:arrcmp, r=nikic
Auto merge of rust-lang#125298 - tesuji:arrcmp, r=nikic