Optimization for powi(x, y) * x
#69862
Comments
@llvm/issue-subscribers-backend-x86 Author: None (k-arrows)
Consider the following:
```cpp
#include <cmath>
double f1(double x)
double f2(double x)
```
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp, lines 708 to 717 in a63dc79
powi(X, Y) * X --> powi(X, Y + 1) also works.
@vfdff I guess you are interested in this optimization :)
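To make the proposed rewrite concrete, here is a minimal sketch of the two forms side by side. The helper `powi_sketch` and the names `before_fold` and `after_fold` are hypothetical and are not taken from LLVM; the helper only stands in for the `powi` intrinsic using repeated squaring. The sketch assumes that `Y + 1` does not overflow, and it says nothing about how InstCombine actually implements the fold.

```cpp
#include <cstdint>

// Hypothetical stand-in for the powi intrinsic (not LLVM's implementation):
// computes x^n for an integer exponent by repeated squaring.
double powi_sketch(double x, int n) {
    // Widen before negating so that the most negative int does not overflow.
    std::int64_t e = n;
    const bool negative = e < 0;
    if (negative) e = -e;

    double result = 1.0;
    double base = x;
    while (e > 0) {
        if (e & 1) result *= base;  // multiply in the current bit's power
        base *= base;               // square for the next bit
        e >>= 1;
    }
    return negative ? 1.0 / result : result;
}

// Form before the rewrite: powi(X, Y) * X.
double before_fold(double x, int y) { return powi_sketch(x, y) * x; }

// Form after the rewrite: powi(X, Y + 1).
// Assumption: y + 1 does not overflow.
double after_fold(double x, int y) { return powi_sketch(x, y + 1); }
```

For inputs whose powers are exactly representable, such as `x = 2.0` and `y = 3`, both forms give the same value. In general the two forms can round differently, which is why a rewrite like this is tied to fast-math style flags.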
pow(x, i1) * pow(x, i2)
(X86 vs ARM)powi(x, y) * x
vfdff added a commit to vfdff/llvm-project that referenced this issue on Oct 24, 2023:
Try to transform the powi(X, Y) * X into powi(X, Y+1) with Ofast
For this case, when Y is 3, powi(X, 4) is replaced by X2 = X * X; X2 * X2 in a further step. Similar to D109954, which requires reassoc. Fixes llvm#69862.
vfdff added two more commits to vfdff/llvm-project that referenced this issue, on Mar 2, 2024 and Mar 7, 2024, each with the same commit message.
Consider the following:
(X86) https://godbolt.org/z/nd3ME13Po
(ARM) https://godbolt.org/z/z5vjnPhrK
On X86, if -Ofast is specified, GCC just calculates x * x * x * x for f1 with two mulsd, but Clang performs mulsd three times. There is no such difference on ARM.
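The difference in multiply counts can be sketched as follows. The function names `pow4_three_muls` and `pow4_two_muls` are hypothetical and serve only as illustration; they are not the `f1` and `f2` from the report. The first evaluates x * x * x * x left to right, the second reuses the square. This is a sketch of the arithmetic under those assumptions, not of what either compiler emits.

```cpp
// Left-to-right evaluation of x * x * x * x: three multiplications.
double pow4_three_muls(double x) {
    return ((x * x) * x) * x;
}

// Reassociated form: compute the square once and reuse it,
// which needs only two multiplications.
double pow4_two_muls(double x) {
    const double x2 = x * x;
    return x2 * x2;
}
```

Both functions agree whenever every intermediate product is exactly representable, for example at `x = 3.0`. For other inputs the results can differ in the last bits, so a compiler may only switch from the first form to the second when reassociation of floating-point operations is permitted, as it is under -Ofast.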