Test failure on PPC: Nonsymmetric-Generalized-Eigenvalue-Problem-driver-EIG/xeigtsts #4415
Comments
INFO=9 from SGGES appears to mean "QZ algorithm failed to converge". Unfortunately, some of the tests in the LAPACK test suite are quite fragile against small numerical differences compared to the unoptimized Reference BLAS. |
This is a Power9 CPU (PowerNV 8335-GTX), cpuinfo shows "POWER9, altivec supported" |
Ouch, this is tricky. We have the following patch:
With this patch the mentioned issue happens on PPC. Without it, it happens on AArch. |
I remember playing with varying sequences in some test inputs before, but it does not really make sense if these are subsequent runs with varied matrix dimensions - unless there are uninitialized variables involved (never zeroed, or not zeroed between runs)... |
Turns out the actual AArch issue was for the double-precision variant and was fixed by removing the "6", as you suggested in #4032 (comment). I tried the same for sgd and the test runs successfully on PPC. I.e., I now have this patch, which I'll test on a larger number of CPU architectures:
|
OK, that makes it a little better. One could claim that the matrix of size 6 is somehow ill-conditioned (ISTR there actually were a few issues in Reference-LAPACK with "broken" pencils for GGEV/GGES) |
Yes that matches the observation of @bartoldeman who did an earlier patch with a comment
How can one understand that change/line exactly? You wrote
then there is "Number of matrix dimensions" = 5, but there are 6 values for "Matrix dimensions" (with the new patch above, only 5). |
The "5" could be seen as an error in the original test file; what happens if you use:
is that it'll use 2 6 10 12 20 but ignore 30. It's just Fortran code that reads 5 numbers, then discards the rest of the line. |
Yes, six values, but only the first five are actually used. Maybe matrix size 30 was simply too big for routine testing on whatever ancient hardware was in use at the time that test was conceived. (Oops, I simply took longer to type my response than bartoldeman :) ) |
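For anyone who wants to see the "reads 5 numbers, then discards the rest of the line" behaviour in isolation, here is a minimal Python sketch that emulates it. It is not the LAPACK test driver's Fortran code; the function name `read_dimensions` and the two sample lines are hypothetical, built only from the values quoted above (count 5, dimensions 2 6 10 12 20 30).

```python
def read_dimensions(count_line: str, values_line: str) -> list[int]:
    """Emulate reading `count` integers from one input record.

    Hypothetical helper for illustration only: it mimics how a Fortran
    read of N integers consumes the first N values on a line and ignores
    whatever follows (extra numbers or a trailing description).
    """
    # The first token on the count line is the number of dimensions;
    # anything after it is treated as a free-text description.
    count = int(count_line.split()[0])

    tokens = values_line.split()
    if len(tokens) < count:
        # Simplification: real Fortran input would continue on the next
        # record; this sketch just reports the shortfall.
        raise ValueError(f"expected {count} values, found {len(tokens)}")

    # Only the first `count` tokens are converted; the rest of the line
    # is never looked at, so a sixth value is silently dropped.
    return [int(tok) for tok in tokens[:count]]


if __name__ == "__main__":
    dims = read_dimensions(
        "5                 Number of matrix dimensions",
        "2 6 10 12 20 30   Matrix dimensions",
    )
    print(dims)  # [2, 6, 10, 12, 20]
```

With a count of 5 the value 30 is never used, so dropping one value from the list changes which five sizes are tested rather than how many.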
Since 0.3.23 I have (besides ~41 numerical errors) also 1 "other error".
The summary (in this case using 0.3.26) looks like this:
And searching for that "other error" I found
I was using
make lapack-test BINARY='64' CC='gcc' FC='gfortran' MAKE_NB_JOBS='-1' USE_OPENMP='1' USE_THREAD='1'
for this. Any idea what this could be and how to fix it?