[ESIMD] Make esimd implementation of fmod compatible with std::fmod #6242
Merged
Commits:
88df143 Make esimd implementation of fmod compatible with std::fmod (fineg74)
49c709e Fix an issue that results in incorrect results in some cases (fineg74)
d485589 Better handle sign propagation (fineg74)
77528eb fix clang-format issue (fineg74)
0f75bd3 Optimize implementation per PR comments (fineg74)
38dfb10 fix clang-format issues (fineg74)
This code is definitely more correct than what was there before this PR.
On the other hand, it is not close to the complex code attached by @akolesov-intel.
We can probably keep this code for now. @akolesov-intel, do you see any obvious cases where the proposed code would give a wrong result compared to std::fmod?
Also, if we keep the current variant, some minor inefficiencies can be optimized (fewer compares, replacing MULs with OR/AND, fewer vector constants such as 1.0 and -1.0):
It's quite hard to tell without testing, since many combinations of the two arguments are possible. I would expect possible problems with denormals and IEEE special inputs (inf, NaN). If I get some spare time, I will try substituting this version for ours and running our internal tests.
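To make the special-input concern concrete, here is a small sketch of a checker for the results that the C and C++ standards specify for `fmod` on a few IEEE special inputs. The function name `handles_special_inputs` is hypothetical, and the list of cases is illustrative rather than exhaustive; it is not the internal test suite mentioned above.

```cpp
#include <cmath>
#include <limits>

// Returns true when fmod_impl agrees with std::fmod semantics on a handful
// of IEEE special inputs. fmod_impl is any callable taking (float, float).
template <typename F>
bool handles_special_inputs(F fmod_impl) {
  const float inf = std::numeric_limits<float>::infinity();
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float denorm = std::numeric_limits<float>::denorm_min();

  bool ok = true;
  ok = ok && std::isnan(fmod_impl(inf, 2.0f));   // infinite dividend -> NaN
  ok = ok && std::isnan(fmod_impl(1.0f, 0.0f));  // zero divisor -> NaN
  ok = ok && std::isnan(fmod_impl(nan, 1.0f));   // NaN propagates
  ok = ok && std::isnan(fmod_impl(1.0f, nan));
  ok = ok && fmod_impl(5.0f, inf) == 5.0f;       // infinite divisor -> dividend

  const float neg_zero = fmod_impl(-0.0f, 1.0f); // signed zero is preserved
  ok = ok && neg_zero == 0.0f && std::signbit(neg_zero);

  ok = ok && fmod_impl(denorm, 1.0f) == denorm;  // |dividend| < |divisor|
  return ok;
}
```

Passing a lambda that wraps `std::fmod` should return `true`, while the plain formula `a - b * std::trunc(a / b)` fails the infinite-divisor case because `inf * 0` is NaN.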
I have run our tests on this implementation (did I convert it properly to plain C++?):
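float __imf_fmodf (float y, float x)
{
    // as_int / as_float are assumed to reinterpret the bits of a float.
    float abs_x = std::abs(x);
    float abs_y = std::abs(y);
    // Remainder of |y| / |x| using truncated division.
    float remainder = abs_y - abs_x * std::trunc(abs_y / abs_x);
    // If rounding made the remainder negative, add |x| back.
    abs_x = (remainder >= 0) ? 0.0f : abs_x;
    float fmod = remainder + abs_x;
    float fmod_abs = std::abs(fmod);
    // Copy the sign bit of the dividend y onto the result.
    auto fmod_sign_mask = as_int(y) & 0x80000000;
    auto fmod_bits = as_int(fmod_abs) | fmod_sign_mask;
    return as_float(fmod_bits);
}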
The test reported incorrect results for 12% of the whole dataset. Most of them involve special, denormal, or very small/large arguments. See the attached file, ts_fmod_s_xa.zip; REF1 there is the reference (correct) result.
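The kind of mismatch described here can be reproduced with a small self-contained sketch. `naive_fmod` below is a transcription of the snippet above, with `std::copysign` swapped in for the bit manipulation, and `matches_std_fmod` is a hypothetical helper; neither is the internal test harness. With a large dividend such as `1e20f` and divisor `3.0f`, `std::fmod` returns exactly `2.0f`, while the truncated-quotient formula loses the low bits of the quotient to rounding.

```cpp
#include <cmath>

// Scalar transcription of the formula under discussion:
// dividend y, divisor x, sign of the result taken from y.
float naive_fmod(float y, float x) {
  const float abs_x = std::fabs(x);
  const float abs_y = std::fabs(y);
  const float quotient = std::trunc(abs_y / abs_x);
  float rem = abs_y - abs_x * quotient;
  // Add |x| back when rounding pushed the remainder below zero.
  rem = rem + ((rem >= 0.0f) ? 0.0f : abs_x);
  return std::copysign(std::fabs(rem), y);
}

// Hypothetical helper: true when naive_fmod agrees with std::fmod,
// treating any two NaNs as equal and distinguishing +0 from -0.
bool matches_std_fmod(float y, float x) {
  const float expected = std::fmod(y, x);
  const float actual = naive_fmod(y, x);
  if (std::isnan(expected)) return std::isnan(actual);
  return expected == actual &&
         std::signbit(expected) == std::signbit(actual);
}
```

Moderate inputs such as `(5.5f, 2.0f)` or `(-5.5f, 2.0f)` agree, whereas `(1e20f, 3.0f)` and `(5.0f, INFINITY)` do not, which is in line with the report that most failures involve special or very large arguments.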
The main problem with the code provided by @akolesov-intel is that it is hardly vectorizable, if at all.
OK, let's go with this new, more correct version for now.
Perhaps we can add an even more correct implementation a bit later. I'll create an internal tracker for this and other math functions.