[ESIMD] Make esimd implementation of fmod compatible with std::fmod #6242

fineg74 · 2022-06-03T16:25:12Z

This fix brings the sycl::ext::intel::experimental::esimd::fmod implementation in line with the std::fmod implementation.
The std::fmod implementation details are taken from https://en.cppreference.com/w/cpp/numeric/math/fmod
Complementary test PR: intel/llvm-test-suite#1045

v-klochkov

The implementation needs some correctness fixes.
Currently, for fmod(-5.1, 3.0) it returns -5.1, whereas std::fmod returns approximately -2.1.
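
For reference, the behavior expected in the comment above can be sketched in plain C++. This is only an illustration of the std::fmod definition from cppreference (the result has the sign of the first argument and a magnitude smaller than that of the second). The helper names `naive_fmod` and `check_review_example` are hypothetical, and the naive formula ignores overflow, infinities, and NaN.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: textbook definition x - trunc(x / y) * y.
// Only meant for small, finite, well-scaled inputs.
inline double naive_fmod(double x, double y) {
  return x - std::trunc(x / y) * y;
}

// Checks the example from the review comment: fmod(-5.1, 3.0).
inline void check_review_example() {
  const double expected = std::fmod(-5.1, 3.0);   // about -2.1, not -5.1
  const double got = naive_fmod(-5.1, 3.0);
  assert(std::signbit(expected));                 // sign follows the first argument
  assert(std::fabs(expected + 2.1) < 1e-12);      // magnitude is about 2.1
  assert(std::fabs(got - expected) < 1e-12);      // naive formula agrees here
}
```

Calling `check_review_example()` from a test driver exercises the case quoted in the review; it says nothing about special or very large inputs.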

fineg74 · 2022-06-14T18:06:01Z

The implementation needs some correctness fixes. Currently, for fmod(-5.1, 3.0) it returns -5.1

Fixed

v-klochkov · 2022-06-15T13:28:00Z

sycl/include/sycl/ext/intel/experimental/esimd/math.hpp

+  reminder_sign_mask.merge(1.0f, 0.0f, reminder < 0);
+
+  fmod = reminder + abs_x * reminder_sign_mask;
+  return __ESIMD_NS::abs(fmod) * y_sign_mask;


This implementation is better and more correct, but still not fully compatible with std::fmod.

std::fmod(-0.0f, 1.0f) would return -0.0f, while the implementation here returns "+0.0f".
The LIT test (intel/llvm-test-suite#1045) passes because it compares the expected and computed results using '==', returning true even though they are not bitwise equal.
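
The comparison issue described above can be reproduced in a few lines of standard C++. The sketch below shows that `==` treats -0.0f and +0.0f as equal, while `std::signbit` or a bitwise comparison tells them apart. The helper names are hypothetical and are not taken from the LIT test.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a float as its 32-bit pattern.
inline std::uint32_t float_bits(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}

// Hypothetical helper: stricter than '==' because it also compares the sign bit.
// NaN never compares equal here, so NaN results need separate handling.
inline bool same_value_and_sign(float a, float b) {
  return a == b && std::signbit(a) == std::signbit(b);
}

inline void check_negative_zero() {
  const float expected = std::fmod(-0.0f, 1.0f); // -0.0f per the std::fmod rules
  const float wrong = 0.0f;                      // result of a sign-losing version
  assert(expected == wrong);                     // '==' hides the difference
  assert(std::signbit(expected) && !std::signbit(wrong));
  assert(float_bits(expected) != float_bits(wrong));
  assert(!same_value_and_sign(expected, wrong));
}
```

A test that wants bitwise agreement with std::fmod could use a check along the lines of `same_value_and_sign` instead of plain `==`.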

@akolesov-intel can you please give some advice on how to implement it in the most efficient way that also follows the std::fmod rules?

Hi Slava, you are correct. A fully compliant fmod implementation is not that simple. See the attached code for the single- and double-precision cases. As usual, there are fast paths (__imf_fmodf, __imf_fmod) and slow ones for large and special arguments (__imf_internal_sfmod, __imf_internal_dfmod). These implementations are fully compliant with the standard and return exact results.

fmod.zip

Added more logic to propagate the sign correctly. Also updated the test to check the sign bit.

v-klochkov · 2022-06-17T18:09:12Z

sycl/include/sycl/ext/intel/experimental/esimd/math.hpp

                                             __ESIMD_NS::simd<float, N> x) {
-  __ESIMD_NS::simd<int, N> v_quot;
  __ESIMD_NS::simd<float, N> fmod;
+  __ESIMD_NS::simd<float, N> abs_x;


This code is definitely more correct than what it was before this PR.
On the other hand, it is not close to the complex code attached by @akolesov-intel.
We can probably keep this code for now. @akolesov-intel do you see one or more obvious cases where the proposed code would give a wrong result compared to std::fmod?

Also, if we keep the current variant, some minor inefficiencies can be optimized away (fewer compares, MULs replaced with OR/AND, fewer vector constants like 1.0 and -1.0):

__ESIMD_NS::simd<float, N> abs_x = __ESIMD_NS::abs(x);
__ESIMD_NS::simd<float, N> abs_y = __ESIMD_NS::abs(y);
__ESIMD_NS::simd<float, N> reminder =
    abs_y - abs_x * __ESIMD_NS::trunc<float>(abs_y / abs_x);
// After this line 'abs_x' means (reminder < 0 ? abs(x) : 0);
abs_x.merge(0.0, reminder >= 0);
__ESIMD_NS::simd<float, N> fmod = reminder + abs_x;
__ESIMD_NS::simd<float, N> fmod_abs = __ESIMD_NS::abs(fmod);
auto fmod_sign_mask = (y.template bit_cast_view<int32_t>()) & 0x80000000;
auto fmod_bits = (fmod_abs.template bit_cast_view<int32_t>()) | fmod_sign_mask;
return fmod_bits.template bit_cast_view<float>();
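
The "replace MULs with OR/AND" idea in the snippet above can be illustrated on a single scalar. This is a minimal sketch, assuming IEEE-754 binary32 floats; `copy_sign_bits` is a hypothetical name, and `std::memcpy` stands in for the ESIMD `bit_cast_view` used in the suggestion.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: give 'magnitude' the sign of 'sign_source'
// using only AND/OR on the bit patterns (no multiply, no compare).
inline float copy_sign_bits(float magnitude, float sign_source) {
  std::uint32_t m, s;
  std::memcpy(&m, &magnitude, sizeof(m));
  std::memcpy(&s, &sign_source, sizeof(s));
  // Keep exponent and mantissa of 'magnitude', take bit 31 from 'sign_source'.
  const std::uint32_t result = (m & 0x7FFFFFFFu) | (s & 0x80000000u);
  float out;
  std::memcpy(&out, &result, sizeof(out));
  return out;
}

inline void check_copy_sign_bits() {
  assert(copy_sign_bits(2.5f, -1.0f) == std::copysign(2.5f, -1.0f));
  // The sign bit of -0.0f is picked up even though (-0.0f < 0) is false.
  assert(std::signbit(copy_sign_bits(0.0f, -0.0f)));
  assert(!std::signbit(copy_sign_bits(-4.0f, 7.0f)));
}
```

Because the sign is read directly from bit 31, a negative-zero input keeps its sign without any extra comparison.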

This code is definitely more correct than what it was before this PR. On the other hand, it is not close to the complex code attached by @akolesov-intel. We can probably keep this code for now. @akolesov-intel do you see one or more obvious cases where the proposed code would give a wrong result compared to std::fmod?

It's quite hard to tell without testing; many combinations of the two arguments are possible. I would expect possible problems with denormals and IEEE special inputs (inf, NaN). If I get some spare time, I will try to replace our version with this one and run our internal tests.

I have run our tests for this implementation (did I convert it to plain C++ properly?)

float __imf_fmodf (float y, float x)
{
    float abs_x = std::abs(x);
    float abs_y = std::abs(y);
    float reminder = abs_y - abs_x * std::trunc(abs_y / abs_x);
    abs_x = (reminder >= 0) ? 0.0 : abs_x;
    float fmod = reminder + abs_x;
    float fmod_abs = std::abs(fmod);
    auto fmod_sign_mask = as_int(y) & 0x80000000;
    auto fmod_bits = as_int(fmod_abs) | fmod_sign_mask;
    return as_float(fmod_bits);
}

The test reported incorrect results for 12% of the whole dataset. Mostly they are related to special, denormal, or very small/large arguments. See the attached file; REF1 is the reference (correct) result there.

ts_fmod_s_xa.zip
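
To make one of the failing categories concrete, here is a self-contained variant of the scalar port above. It is a sketch under assumptions: `as_int` and `as_float` are not defined in the comment, so they are implemented here as `std::memcpy` bit casts, and the function is renamed to the hypothetical `ported_fmodf`. The inputs below are hand-picked examples and are not taken from the attached dataset.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumed bit-cast helpers; the original comment does not show their definitions.
inline std::uint32_t as_int(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}
inline float as_float(std::uint32_t bits) {
  float v;
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}

// Scalar port from the comment above; computes fmod(y, x).
inline float ported_fmodf(float y, float x) {
  float abs_x = std::fabs(x);
  float abs_y = std::fabs(y);
  float reminder = abs_y - abs_x * std::trunc(abs_y / abs_x);
  abs_x = (reminder >= 0) ? 0.0f : abs_x;  // add |x| back only if reminder < 0
  float fmod_abs = std::fabs(reminder + abs_x);
  return as_float(as_int(fmod_abs) | (as_int(y) & 0x80000000u));
}

inline void check_ported_fmodf() {
  const float inf = std::numeric_limits<float>::infinity();
  // Ordinary input: agrees with std::fmod.
  assert(ported_fmodf(-5.1f, 3.0f) == std::fmod(-5.1f, 3.0f));
  // Negative zero keeps its sign.
  assert(std::signbit(ported_fmodf(-0.0f, 1.0f)));
  // Infinite divisor: std::fmod returns y unchanged, while the port
  // evaluates inf * trunc(0) = inf * 0 = NaN.
  assert(std::fmod(1.0f, inf) == 1.0f);
  assert(std::isnan(ported_fmodf(1.0f, inf)));
}
```

The infinite-divisor case is consistent with the expectation stated earlier in the thread that IEEE special inputs would be problematic; a vectorized fix would need an extra select for such lanes.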

The main problem with the code provided by @akolesov-intel is that it is hardly vectorizable, if at all.

OK, let's have this new and more correct version for now.
Perhaps we can add an even more correct implementation a bit later. I'll create an internal tracker for this and other math functions.

v-klochkov · 2022-06-22T01:35:56Z

/verify with intel/llvm-test-suite#1045

Make esimd implementation of fmod compatible with std::fmod

88df143

fineg74 requested a review from a team as a code owner June 3, 2022 16:25

fineg74 mentioned this pull request Jun 3, 2022

[ESIMD] Add a test to validate new implementation of esimd::fmod intel/llvm-test-suite#1045

Merged

v-klochkov requested changes Jun 14, 2022

Fix an issue that results in incorrect results in some cases

49c709e

fineg74 requested a review from v-klochkov June 14, 2022 23:15

v-klochkov reviewed Jun 15, 2022

fineg74 added 2 commits June 15, 2022 20:33

Better handle sign propagation

d485589

fix clang-format issue

77528eb

fineg74 requested a review from v-klochkov June 16, 2022 03:52

v-klochkov reviewed Jun 17, 2022

fineg74 added 2 commits June 18, 2022 10:37

Optimize implementation per PR comments

0f75bd3

fix clang-format issues

38dfb10

v-klochkov approved these changes Jun 21, 2022

v-klochkov merged commit 7a076bd into intel:sycl Jun 22, 2022

fineg74 deleted the fmod branch July 14, 2022 17:56

fineg74 commented Jun 3, 2022

v-klochkov left a comment

fineg74 commented Jun 14, 2022

v-klochkov Jun 15, 2022 (edited)

akolesov-nv Jun 15, 2022

fineg74 Jun 16, 2022

v-klochkov Jun 17, 2022

akolesov-nv Jun 17, 2022

akolesov-nv Jun 17, 2022

fineg74 Jun 18, 2022

v-klochkov Jun 21, 2022

v-klochkov commented Jun 22, 2022
