Consider the following:
#include <cmath>

double f1(double x)
{
    return pow(x, 3) * pow(x, 1);
}

double f2(double x)
{
    return pow(x, 4);
}
(X86) https://godbolt.org/z/nd3ME13Po
(ARM) https://godbolt.org/z/z5vjnPhrK
On X86, if `-Ofast` is specified, GCC computes `x * x * x * x` for `f1` with only two `mulsd` instructions, but Clang emits `mulsd` three times. There is no such difference on ARM.
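To make the difference concrete, here is a minimal sketch of the two multiplication schedules written out by hand. The function names `pow4_three_muls` and `pow4_two_muls` are hypothetical and not part of the test case above, and the three-multiply version is only one plausible shape of the longer sequence, not a transcription of Clang's actual output.

```cpp
// Hypothetical helpers illustrating the two schedules for x^4.
// Neither name appears in the original test case.

// Sequential chain: ((x * x) * x) * x, three dependent multiplies.
// This is roughly what pow(x, 3) * pow(x, 1) looks like if each pow
// is expanded on its own and the results are then multiplied.
double pow4_three_muls(double x)
{
    double x2 = x * x;   // multiply 1
    double x3 = x2 * x;  // multiply 2
    return x3 * x;       // multiply 3
}

// Squaring twice: (x * x) * (x * x), two multiplies.
// Getting here from the chain above requires reassociating
// floating-point multiplication, which is only permitted under
// fast-math style flags such as those implied by -Ofast.
double pow4_two_muls(double x)
{
    double x2 = x * x;   // multiply 1
    return x2 * x2;      // multiply 2
}
```

Both functions agree exactly for inputs such as `2.0` or `3.0`, but in general they may round differently in the last bit, which is why the shorter form is only available when reassociation is allowed.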