Can't use AVX-512 on X86-64 with openblas 0.3.6 version? #2182


Closed
gxkevin opened this issue Jul 9, 2019 · 5 comments

@gxkevin

gxkevin commented Jul 9, 2019

I found that in the latest version 0.3.6, the AVX512 DGEMM kernel has been disabled again due to unsolved problems.

Does that mean we can't use AVX512?

What about 0.3.7-dev?

@gxkevin
Author

gxkevin commented Jul 9, 2019

DGEMM was disabled; what about SGEMM?

@martin-frbg
Collaborator

Indeed, most of the AVX512 DGEMM optimizations had to be disabled in 0.3.6, and 0.3.7.dev disables the remaining ones that were originally thought to be safe. Unfortunately, the person who contributed this code has not been active here since March, and I have been unable to track down the problem in his code. There are no known problems with the AVX512 SGEMM kernel.

@gxkevin
Author

gxkevin commented Jul 9, 2019

What about 0.3.5 or earlier releases? Do they enable AVX512?

@martin-frbg
Collaborator

AVX512 DGEMM was added in 0.3.4, but it returns wrong results for some inputs (#1955, #2029, #2168).
Disabling or enabling the AVX512 kernels is just a matter of (un)commenting a few lines in kernel/x86_64/KERNEL.SKYLAKEX, if you want to check whether it affects your typical use cases.
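
One rough way to check whether a given build affects your use cases is to compare DGEMM output against a reference that does not go through BLAS at all. The sketch below is only an illustration, not part of OpenBLAS: it assumes NumPy is linked against the OpenBLAS build under test, so that `a @ b` on float64 matrices ends up in DGEMM. The helper names (`naive_dgemm`, `check_dgemm`), the shapes, and the tolerance are hypothetical choices; substitute the sizes your application actually uses.

```python
import math

import numpy as np


def naive_dgemm(a, b):
    """Reference product computed with plain Python loops (no BLAS call)."""
    m, k = a.shape
    k2, n = b.shape
    if k != k2:
        raise ValueError("inner dimensions do not match")
    c = np.zeros((m, n), dtype=np.float64)
    for i in range(m):
        for j in range(n):
            # math.fsum keeps the reference sum accurately rounded.
            c[i, j] = math.fsum(float(a[i, p]) * float(b[p, j]) for p in range(k))
    return c


def check_dgemm(shapes, seed=0, tol=1e-10):
    """Return the (m, n, k) shapes where a @ b disagrees with the reference."""
    rng = np.random.default_rng(seed)
    failures = []
    for m, n, k in shapes:
        a = rng.standard_normal((m, k))
        b = rng.standard_normal((k, n))
        got = a @ b  # assumed to dispatch to the BLAS DGEMM under test
        want = naive_dgemm(a, b)
        if not np.allclose(got, want, rtol=tol, atol=tol):
            failures.append((m, n, k))
    return failures


if __name__ == "__main__":
    # Hypothetical sizes; an empty list means no mismatch was detected.
    print(check_dgemm([(1, 1, 1), (7, 5, 3), (24, 8, 16), (33, 17, 65)]))
```

An empty result only means these particular inputs agreed. Since the wrong results were reported for some inputs only, a clean run does not prove the kernel is correct.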

@martin-frbg
Collaborator

Fixed now by wjc404's new AVX512 DGEMM kernel from #2286
