Less accuracy due to FMAs #1031
This is typically seen with the testsuite (#732, #744, and other issues referenced therein), and it only gets worse when one uses something other than the Reference BLAS, since optimized BLAS libraries leverage FMA for performance. But I am not sure it is still possible to "give a build option that disables FMA" across all platforms and compilers (unless one targets an archived copy of gcc 3 or similar).
In many cases FMA improves accuracy because two operations are executed with only a single rounding. However, there are examples where FMA harms accuracy. The example that you give actually coincides with one that Kahan used to illustrate that FMA can lead to a loss of accuracy if not used with care. Since there are plenty of examples where FMA is beneficial (notably dot product/GEMM), forbidding FMA does not sound like a viable choice. If there are occurrences that are known to be critical, maybe the best approach is to add parentheses and prevent the compiler from using FMA in this particular expression?
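For readers unfamiliar with the pitfall Kahan described, here is a minimal sketch (with hypothetical values, not taken from this issue) of how contraction can turn an exact cancellation into a tiny residual; whether the compiler actually contracts depends on flags such as `-ffp-contract`:

```fortran
! Sketch: b = 1 + 2^-27 is exact in double precision, but the product
! b*b = 1 + 2^-26 + 2^-54 is not, so it gets rounded. Evaluating p*b - q
! as fma(p, b, -q) subtracts the ROUNDED product q from the EXACT product
! p*b, yielding 2^-54 instead of the 0 obtained without contraction.
program fma_pitfall
  implicit none
  ! VOLATILE discourages the compiler from folding the expression to zero
  double precision, volatile :: b, p, q
  double precision :: r
  b = 1.0d0 + 2.0d0**(-27)
  p = b
  q = b*b       ! rounded product
  r = p*b - q   ! a contracting compiler may emit fma(p, b, -q)
  print *, r    ! 0.0 without FMA; about 5.55E-17 (= 2^-54) with FMA
end program fma_pitfall
```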
I second this.
I like that approach even more. I tested it for my example, and it works. @angsch Should I create an MR with the changed parentheses, or do you think the comparison against machine epsilon is better?
I support the fix with parentheses. As far as I know, compilers respect parentheses for all optimization levels except when non-IEEE-compliant/unsafe math operations are enabled (e.g., `-ffast-math`).
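For illustration, a sketch of what such a parenthesization could look like, using the `BB*CS + DD*SN` expression discussed later in this thread (the actual MR may differ):

```fortran
! Hypothetical before/after (not the actual patch):
!   before:  B = BB*CS + DD*SN           ! sum may be contracted into an FMA
!   after:   B = ( BB*CS ) + ( DD*SN )   ! each product is rounded separately
! gfortran honors the parentheses unless -fno-protect-parens (implied by
! -Ofast) is in effect.
```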
Thanks a lot for all the details about this issue! Very interesting! Short version of my opinion:

TL;DR: I am not sure we should consider it a problem if the computed eigenvalues in single precision are `[1.999999988777289 2.000000011222711]` (instead of 2 and 2). The output is not "wrong" in a single-precision sense. In general, using FMA tends to give better accuracy than not using it. I understand that, in this example, FMA deteriorates the accuracy, but I am not convinced that FMA always deteriorates the accuracy in this situation: sometimes it might deteriorate the accuracy (for example, in @ACSimon33's example), and sometimes (maybe) it might improve it. For this example, with these numerical values, it seems that FMA deteriorates the accuracy because it destroys the "symmetry" that we have in the expression `BB*CS + DD*SN` and in the values of our matrix. But if the values in our matrix are not that symmetric, I would expect FMA to help. (I do not know; this is an assumption.) So, all in all:

(*) I do not think we should disable FMA with a compiler flag, certainly not for the whole library, and even for a specific routine I am leery about it. (Compilation flags change over time, vary across compilers, etc.)

(*) I have mixed feelings about using parentheses. I agree when @angsch writes: "If there are occurrences that are known to be critical, maybe the best approach is to add parentheses and prevent the compiler from using FMA in this particular expression?" It seems that there are interesting specific cases where a symmetry naturally occurs in the input data, so not using FMA would help these instances. (Which in practice do not seem that farfetched.) The potential loss of accuracy from not using FMA in other cases seems acceptable. So yes, I am in favor of using parentheses to (attempt to) prevent FMA.

(*) As for using a numerical test with machine precision: I read the code a few times and, as far as I can see, I am not in favor of it. This does not sound like a good idea to me.
To clarify: [...]

More insights: when using diagonal scaling ([...]), [...]

I'll go ahead and create a draft for the MR.
In light of this, may I ask about the current opinion on the similar patches for ?LAHQR proposed in #732 two years ago?
I tried to reproduce #732 on the current master, and it seems that the double-precision complex errors are already resolved; only some of the single-precision errors remain.

Unfortunately, applying the proposed patches reduces the number of complex errors to only [...]
Hi,
I think the topic of FMA instructions came up a couple of times before, but my question is what kind of accuracy loss we consider acceptable, given that this is a reference implementation. The point where I stumbled across it was an eigenvalue decomposition of a 2x2 matrix. The eigenvalues should be `[2.0, 2.0]`. When the compiler uses FMAs, we get `[1.999999988777289 2.000000011222711]` instead. Using gfortran on Linux, one has to enable the native instruction set (`-march=native`) to see this, but on arm64 we get the slightly wrong results even with default compiler flags. To get the correct values, one has to use `-ffp-contract=off`, which disables FMAs.

Should we maybe give users a build option which disables FMAs?
For everyone interested, here is a quick reproducer of the example above:
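A minimal sketch of such a reproducer, calling LAPACK's DLANV2 directly; the matrix below is hypothetical (chosen to have the double eigenvalue 2.0, since the original snippet is not reproduced here), so it may not trigger the FMA sensitivity on every platform:

```fortran
! Hypothetical input: the 2x2 matrix
!   [ 2+e    e  ]
!   [  -e   2-e ]
! has trace 4 and determinant 4, hence the double eigenvalue 2.0.
program dlanv2_repro
  implicit none
  double precision, parameter :: e = 1.0d-7
  double precision :: a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn
  external dlanv2
  a = 2.0d0 + e
  b = e
  c = -e
  d = 2.0d0 - e
  call dlanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
  print *, 'rt1 = ', rt1r, rt1i
  print *, 'rt2 = ', rt2r, rt2i
end program dlanv2_repro
```

Built with, e.g., `gfortran -O2 -march=native dlanv2_repro.f90 -llapack`, the printed eigenvalues may split away from 2.0 when the compiler emits FMAs; adding `-ffp-contract=off` should restore the exact values.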
The problematic line that causes the accuracy loss is line 253 in dlanv2.f:
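Presumably (matching the `BB*CS + DD*SN` expression quoted earlier in this thread), the line sits in DLANV2's back-transformation of the 2x2 block:

```fortran
*        Compute [ A  B ] = [ CS  SN ] [ AA  BB ]
*                [ C  D ]   [-SN  CS ] [ CC  DD ]
*
         A = AA*CS + CC*SN
         B = BB*CS + DD*SN
         C = -AA*SN + CC*CS
         D = -BB*SN + DD*CS
```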
This expression should evaluate to zero but instead evaluates to `-2.5189846806723163E-017` due to the different rounding of the FMA instructions. This value is later compared to zero and used inside a `SQRT`, which increases the inaccuracy to ~10^-9.

Maybe an alternative approach would be to compare against machine accuracy instead of zero, but that's not my field of expertise.
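As a sketch of that idea (a hypothetical threshold, not a concrete patch; `DLAMCH( 'P' )` is LAPACK's machine-precision query):

```fortran
! Hypothetical: instead of the exact test
!   IF( C.NE.ZERO ) THEN
! treat C as zero when it is below a scaled machine epsilon, e.g.
!   EPS = DLAMCH( 'P' )
!   IF( ABS( C ).GT.EPS*( ABS( A )+ABS( D ) ) ) THEN
! so that FMA-induced residuals of order EPS no longer reach the SQRT.
```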