Cross compilation for Skylake X #2986
Comments
Thanks for the very comprehensive report - there was a long-standing bug in the CPU parameter assignment during DYNAMIC_ARCH builds, fixed post 0.3.12, but I would not expect it to have caused this particular mess. Will try to reproduce this.
Please make sure to patch BIOS/microcode past this bug. Your code runs fine: at least on Haswell (and when forced back to Sandybridge) it does not throw the error.
@brada4 that theory does not seem to fit the report that it works when compiled on the SKX (and if I read you correctly, your test was on Haswell only?)
The only failures occur when SkylakeX code is invoked on a compatible CPU. I just checked the rest as well as possible, plus the usual cold shower regarding microcode.
Thank you for this suggestion. I am not root, but the machine has rebooted after this package was installed, and dmesg shows that it runs microcode from June 2020 (search for 0x2006906).
Thank you for confirming; your microcode is way past that accuracy problem from 3 years ago.
Reproduced with a build created on Sandybridge, not reproduced with a build made on Haswell even when setting the build TARGET for all common code to SANDYBRIDGE there. (Had to update binutils on the Sandybridge to get AVX512 code to compile, so it cannot be caused by an outdated assembler on the older machine). Quite disconcerting.
The problem appears to be unrelated to multithreading (though the falsely flagged matrix element does vary with thread count), and …
Closing as fixed (or at least worked around for now) by #3469.
…achine

Origin: upstream, OpenMathLib/OpenBLAS#3579
Bug: OpenMathLib/OpenBLAS#2986 OpenMathLib/OpenBLAS#3454 OpenMathLib/OpenBLAS#3557
Bug-Debian: https://bugs.debian.org/1025480
Applied-Upstream: 0.3.21
Reviewed-by: Sébastien Villemot <[email protected]>
Last-Update: 2023-06-26

When building OpenBLAS with dynamic arch selection on x86-64 hardware that does not support AVX2 (e.g. Intel Ivybridge or earlier), then the AVX512 (SkylakeX) kernel for DGEMM would produce incorrect results (of course when run on AVX512-capable hardware).

The problem was that the check for determining whether the compiler is able to understand AVX512 assembly/intrinsics was doubly incorrect: it would test the build machine capabilities (instead of the compiler capabilities); and it would check for AVX2 instead of AVX512.

As a consequence, on pre-AVX2 hardware, the build system would conclude that the compiler is not able to understand AVX512 primitives, and would create a broken AVX512 (SkylakeX) DGEMM kernel (essentially a Haswell kernel, but with some wrong assumptions, hence leading to incorrect numerical results).

Gbp-Pq: Name avx512-dgemm.patch
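The distinction drawn in the patch description (compiler capabilities versus build machine capabilities) can be illustrated with a small probe. This is only a sketch, not the check that OpenBLAS actually uses: the function name `compiler_supports_avx512` and the probe source are made up for illustration, and it assumes a GCC- or Clang-compatible compiler that accepts `-mavx512f`. The probe is compiled but never executed, so the result does not depend on the CPU of the build host.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OpenBLAS build system): report whether
# the C compiler can build AVX512 intrinsics, regardless of the CPU it runs on.
compiler_supports_avx512() {
    cc_bin="${CC:-cc}"
    tmpdir=$(mktemp -d) || return 2
    cat > "$tmpdir/probe.c" <<'EOF'
#include <immintrin.h>
int probe(double *out) {
    __m512d v = _mm512_set1_pd(1.0);            /* AVX512F intrinsic */
    _mm512_storeu_pd(out, _mm512_add_pd(v, v));
    return 0;
}
EOF
    # Compile only (-c): nothing is run, so the build host's own
    # instruction set plays no role in the outcome.
    if "$cc_bin" -mavx512f -c "$tmpdir/probe.c" -o "$tmpdir/probe.o" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}

if compiler_supports_avx512; then
    echo "compiler can build AVX512 code"
else
    echo "compiler cannot build AVX512 code"
fi
```

On a Sandybridge build host with a recent enough compiler and binutils, a probe of this kind would be expected to succeed, whereas a check of the host CPU flags would not.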
I have run into an issue where OpenBLAS (both v0.3.12 and v0.3.6) gives wrong results on Skylake X when compiled with DYNAMIC_ARCH=1 on a machine with an older CPU. When I add NO_AVX512=1 or compile on the Skylake X machine, everything runs fine.
Compiler: gcc/gfortran 7.5.0 and 9.3.0 (the Skylake X machine has 7.5.0, I have tested both 7.5.0 and 9.3.0 on the older machine).
I have attached the compilation logs, code and data that triggers the error. The code computes the Cholesky factorization of a 1107x1107 matrix with eigenvalues between 1 and 3.6 x 10^4. On Skylake X, the value of info from dpotrf_ is 65, wrongly indicating that the matrix is not positive definite.
The compilation logs are from three different builds:
on the older machine:
make DYNAMIC_ARCH=1 NUM_THREADS=64 >log_old_machine.txt 2>&1
on the Skylake X machine:
make DYNAMIC_ARCH=1 NUM_THREADS=64 >log_skylake.txt 2>&1
on the Skylake X machine:
make DYNAMIC_ARCH=1 NUM_THREADS=64 TARGET=SANDYBRIDGE >log_skylake_target_sandy.txt 2>&1
Statically linking the test code to OpenBLAS on the older machine and running the binary on the Skylake X machine triggers the error. There is no error when I use OPENBLAS_CORETYPE=Haswell, or use the library from log_skylake.txt or log_skylake_target_sandy.txt.
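For anyone trying to reproduce this, the OPENBLAS_CORETYPE override mentioned above can be wrapped in a small helper. This is a sketch under assumptions: `run_with_core` is a made-up name, `./cholesky_test` stands in for the test binary built from code_and_data.zip, and setting OPENBLAS_VERBOSE=2 is expected (on DYNAMIC_ARCH builds) to make the library print the selected core to stderr.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with the OpenBLAS dynamic-arch kernel
# forced to a given core type, and ask the library to report the chosen core.
run_with_core() {
    core="$1"
    shift
    OPENBLAS_CORETYPE="$core" OPENBLAS_VERBOSE=2 "$@"
}

# Example (./cholesky_test is an assumed name for the test binary):
#   run_with_core Haswell ./cholesky_test
```

Running the same binary with and without the override on the Skylake X machine separates a problem in the SkylakeX kernels from a problem in the common code.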
log_old_machine.txt
log_skylake.txt
log_skylake_target_sandy.txt
code_and_data.zip