cblas_dgemm crashed! #2203

Closed
xinsuinizhuan opened this issue Aug 5, 2019 · 29 comments

@xinsuinizhuan

I built OpenBLAS with CMake, then added "lapack-netlib\CBLAS\include" to my project's include path and linked "lib\RELEASE\openblas.lib" and "lib\DEBUG\openblas.lib" into my project. When my project runs and reaches "cblas_dgemm", it crashes! What should I do? OpenMP is left at its default setting.

@TiborGY
Contributor

TiborGY commented Aug 5, 2019

Then what should i do ?

Learn how to write a bug report.

In all seriousness, please include information such as:

  • What is your build environment? (OS, OS version, 32 bit or 64 bit, compiler, compiler version, type of CPU in the computer used for compiling)
  • What is your build target? (By default it is the same as the machine you are using for compiling; if you have changed that or are cross-compiling, you should say so.)
  • What version of OpenBLAS are you trying to build? (current git, 0.3.6 or something ancient like 0.2.2?)
  • How did it crash? What messages were given when it crashed?

@brada4
Contributor

brada4 commented Aug 5, 2019

Can you get the exception code from the event log or, better, a backtrace with x64dbg set up as the JIT debugger?
Also, the call you make to the BLAS library (which shows up in detail in the backtrace) will help in reproducing the issue.

@brada4
Contributor

brada4 commented Aug 5, 2019

Rewording to your format, what you should do next: install x64dbg, configure it as the JIT debugger, reproduce the crash, and debug it.

@martin-frbg
Collaborator

martin-frbg commented Aug 5, 2019

It is usually better to include the *.h files that a make install produces in the final location, rather than the header files from the source directories directly. At the very least, you would need to copy "cblas_mangling_with_flags.h.in" to its final name "cblas_mangling.h" so that cblas.h can include it, but the "final" cblas.h from OpenBLAS actually has many added lines (it includes openblas_config.h as well). And as the others wrote: which version of OpenBLAS, which CPU?

@martin-frbg
Collaborator

If you are on 64bit Windows, you could also try the zip archive I put in #805

@xinsuinizhuan
Author

Thank you! I am using VS2019 on Windows 10 x64, with the latest OpenBLAS code. Let me have a try!

@brada4
Contributor

brada4 commented Aug 6, 2019

Please provide some picture of what happened.
What you have said so far does not go beyond the initial statement of 'it does not work'.
Namely:
Exception type
Any backtrace
Does the same code work with MKL or the reference BLAS?
Linux in a VM?

@xinsuinizhuan
Author

I used openblas036-win64.zip from #805, and my project works normally. I do not know why. I also do not know how to use MKL: I installed it, but I don't know which files and *.lib I should add to my project.

@brada4
Contributor

brada4 commented Aug 7, 2019

Regarding MKL: search for "mkl link line advisor".

By "crashing", do you mean that the Windows debugger/crash reporter pops up, or that some other error is displayed and everything stops?

@martin-frbg
Collaborator

You could also just check if it still works when you use the lib that you built yourself, but take the includes from the openblas036-win64.zip. Most likely the problem was just that you took the unmodified cblas.h template from lapack-netlib\CBLAS\include. (It is probably not necessary to try your code with mkl or the reference BLAS from netlib now that it works, unless you want to learn about them)

@xinsuinizhuan
Author

You could also just check if it still works when you use the lib that you built yourself, but take the includes from the openblas036-win64.zip. Most likely the problem was just that you took the unmodified cblas.h template from lapack-netlib\CBLAS\include. (It is probably not necessary to try your code with mkl or the reference BLAS from netlib now that it works, unless you want to learn about them)

Thank you! I tested it: with my own CMake-built lib and the includes from openblas036-win64.zip, it still crashes!

By "crashing", do you mean that the Windows debugger/crash reporter pops up, or that some other error is displayed and everything stops?
This is a screenshot of my crash:
[screenshot]

@xinsuinizhuan
Author

Please provide some picture of what happened.
What you have said so far does not go beyond the initial statement of 'it does not work'.
Namely:
Exception type
Any backtrace
Does the same code work with MKL or the reference BLAS?
Linux in a VM?
It crashed, as shown in the screenshots below:
[screenshot 1]
[screenshot 2]

@brada4
Contributor

brada4 commented Aug 7, 2019

Can you set a breakpoint before cblas_dgemm to distinguish whether the enqueue fails or the BLAS call itself?
(c0...05 means a memory protection violation, such as an access outside allocated memory.)

@xinsuinizhuan
Author

I can set a breakpoint before cblas_dgemm, and I looked at every variable's value; they are all normal. So how should I distinguish whether the enqueue fails or the BLAS call itself? Besides, the same code works with openblas036-win64.zip from #805, so I guess it is the BLAS call itself.

@martin-frbg
Collaborator

I do not think debugging the crash will produce any useful information - it looks like something went wrong with the compilation of OpenBLAS by Visual Studio. Did you follow the suggestions in the wiki,
https://github.com/xianyi/OpenBLAS/wiki/How-to-use-OpenBLAS-in-Microsoft-Visual-Studio ? (These were written from user reports, so maybe they are not quite up to date - the library in the zip was cross-compiled on Linux)

@brada4
Contributor

brada4 commented Aug 7, 2019

These were written from user reports,

I added one paragraph recently that gets asked often...

@brada4
Contributor

brada4 commented Aug 7, 2019

can set breakpoint before cblas_dgemm

You can dump the call parameters there, then look back to check whether the arrays were of the right size(s).
You should also check the memory map to see exactly which memory cell generated the exception. Usually it is something like hitting a guard page after an allocation, but it can be something random; usually the area allocated (and maybe freed) right before is the one that was mis-referenced.

Is it any better with MKL? If so, OpenBLAS has a bug.
(But if you can make a minimal reproducer, such as arbitrarily sized matrices of zeroes crashing, it is much easier for us to try in other places.)
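
The size check described above can be sketched in C. This is only a minimal sketch, assuming a row-major, no-transpose call (C = alpha*A*B + beta*C, with A being m x k, B being k x n and C being m x n). The names min_len, check_dgemm_args_rowmajor_nn and REQUIRE_DGEMM_ARGS are hypothetical and not part of OpenBLAS or CBLAS; a_len, b_len and c_len are assumed to be the number of doubles the caller actually allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest number of doubles that a row-major
 * rows x cols matrix with leading dimension ld occupies. */
static size_t min_len(int rows, int cols, int ld)
{
    if (rows == 0 || cols == 0) return 0;
    return (size_t)(rows - 1) * (size_t)ld + (size_t)cols;
}

/* Hypothetical helper, not an OpenBLAS API.
 * Returns 0 if the arguments are consistent, otherwise the number of
 * the first failing check (1..7). */
static int check_dgemm_args_rowmajor_nn(int m, int n, int k,
                                        int lda, int ldb, int ldc,
                                        size_t a_len, size_t b_len,
                                        size_t c_len)
{
    if (m < 0 || n < 0 || k < 0) return 1;   /* negative dimension  */
    if (lda < (k > 1 ? k : 1)) return 2;     /* A is m x k          */
    if (ldb < (n > 1 ? n : 1)) return 3;     /* B is k x n          */
    if (ldc < (n > 1 ? n : 1)) return 4;     /* C is m x n          */
    if (a_len < min_len(m, k, lda)) return 5; /* A buffer too small */
    if (b_len < min_len(k, n, ldb)) return 6; /* B buffer too small */
    if (c_len < min_len(m, n, ldc)) return 7; /* C buffer too small */
    return 0;
}

/* Debug-build convenience: abort at the caller before the BLAS call. */
#define REQUIRE_DGEMM_ARGS(...) \
    assert(check_dgemm_args_rowmajor_nn(__VA_ARGS__) == 0)
```

Placing REQUIRE_DGEMM_ARGS(m, n, k, lda, ldb, ldc, a_len, b_len, c_len) right before the cblas_dgemm call makes a debug build stop in the caller when the sizes are inconsistent, which helps separate a caller-side overrun from a fault inside the library.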

@xinsuinizhuan
Author

I can only think that something went wrong with the compilation of OpenBLAS by Visual Studio, because openblas036-win64.zip from #805 works, and it was cross-compiled under Linux. You can look in this direction to find the problem!

@brada4
Contributor

brada4 commented Aug 8, 2019

OK, that's in part good news.
We don't have much access to Windows. Could you try recording logs from the way you built OpenBLAS?

Other ways to build under Windows (tell us if any works without too much customisation):

  • Clang-cl
  • MinGW, setting HOSTCC to gcc and CC/FC to the full cross-compiler name, followed by cv2pdb
  • A Linux VM or Ubuntu subsystem v2

@xinsuinizhuan
Author

I can only think that something went wrong with the compilation of OpenBLAS by Visual Studio, because openblas036-win64.zip from #805 works, and it was cross-compiled under Linux. You can look in this direction to find the problem!

OK, that's in part good news.
We don't have much access to Windows. Could you try recording logs from the way you built OpenBLAS?

Other ways to build under Windows (tell us if any works without too much customisation):

  • Clang-cl
  • MinGW, setting HOSTCC to gcc and CC/FC to the full cross-compiler name, followed by cv2pdb
  • A Linux VM or Ubuntu subsystem v2

My CMake version is 3.15.0, with VS2019 Preview (x64) on Windows 10 x64.
[screenshot: cmake]
[screenshot: cmake1]
[screenshot]
Then I configured it. It printed many messages; the text is below:

CMake Warning at CMakeLists.txt:64 (message):
CMake support is experimental. It does not yet support all build options
and may not produce the same Makefiles that OpenBLAS ships with.

GEMM multithread threshold set to 4.
Multi-threading enabled with 4 threads.
CMake Deprecation Warning at C:/Program Files/CMake/share/cmake-3.15/Modules/CMakeForceCompiler.cmake:103 (message):
The CMAKE_FORCE_Fortran_COMPILER macro is deprecated. Instead just set
CMAKE_Fortran_COMPILER and allow CMake to identify the compiler.
Call Stack (most recent call first):
cmake/f_check.cmake:27 (CMAKE_FORCE_Fortran_COMPILER)
cmake/prebuild.cmake:87 (include)
cmake/system.cmake:157 (include)
CMakeLists.txt:67 (include)

CMake Warning (dev) at cmake/prebuild.cmake:306 (if):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.

Quoted variables like "MSVC" will no longer be dereferenced when the policy
is set to NEW. Since the policy is not set the OLD behavior will be used.
Call Stack (most recent call first):
cmake/system.cmake:157 (include)
CMakeLists.txt:67 (include)
This warning is for project developers. Use -Wno-dev to suppress it.

MSVC
Running getarch
GETARCH results:
CORE=generic
LIBCORE=generic
NUM_CORES=4
GENERIC=1
L1_DATA_SIZE=32768
L1_DATA_LINESIZE=128
L2_SIZE=512488
L2_LINESIZE=128
DTB_DEFAULT_ENTRIES=128
DTB_SIZE=4096
L2_ASSOCIATIVE=8
MAKE += -j 4

Compiling a 64-bit binary.
Building Single Precision
Building Double Precision
Building Complex Precision
Building Double Complex Precision
Reading vars from I:/OpenBLAS/OpenBLAS-develop/kernel/x86_64/KERNEL...
Reading vars from I:/OpenBLAS/OpenBLAS-develop/kernel/x86_64/KERNEL.generic...
CMake Warning (dev) at kernel/CMakeLists.txt:125 (if):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.

Quoted variables like "GENERIC" will no longer be dereferenced when the
policy is set to NEW. Since the policy is not set the OLD behavior will be
used.
Call Stack (most recent call first):
kernel/CMakeLists.txt:547 (build_core)
This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) at utest/CMakeLists.txt:4 (if):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.

Quoted variables like "MSVC" will no longer be dereferenced when the policy
is set to NEW. Since the policy is not set the OLD behavior will be used.
This warning is for project developers. Use -Wno-dev to suppress it.

Generating openblas_config.h in include/openblas
Generating f77blas.h in include/openblas
Generating cblas.h in include/openblas
Configuring done

@brada4
Contributor

brada4 commented Aug 8, 2019

Can you post a screenshot of CPU-Z?
It looks like your CPU is not (yet, as of v0.3.6) detected by OpenBLAS. It should not be broken, but it is better to have it properly added to OpenBLAS.

Policy warnings are OK: we support old versions (think of what is available on a 10-year-old commercially supported enterprise Linux). There are no defects in the whole log shown.

@martin-frbg
Collaborator

The problem here is simply that Visual Studio does not understand the dialect of assembler instructions we use in the optimized BLAS kernels. (OpenBLAS uses the old "AT&T" syntax, common on Unix and used by the GNU tools; Microsoft expects "Intel" syntax, which has a different order of operands and some different instructions.) So with VS alone, you will only get the "generic" version of the OpenBLAS functions written in C: a bit faster than the old reference BLAS, but much slower than the CPU-specific assembler versions.
(This is why the build instructions in the wiki suggest installing either the clang or the MinGW compiler to build OpenBLAS.)
Now either there is a bug in the "generic" dgemm kernel (which has not been touched in years, I think) or VS2019 miscompiles it. This certainly needs looking into, but you are much better off using the library from the zip file anyway.
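
One way to narrow down whether a dgemm kernel produces wrong results (as opposed to the caller passing bad arguments) is to compare its output on a small case against a naive triple loop. The following is only a minimal sketch, assuming row-major storage without transposition; ref_dgemm_rowmajor_nn and max_abs_diff are hypothetical names, and this is not the actual "generic" kernel from OpenBLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference: C = alpha * A * B + beta * C, row-major,
 * no transpose. A is m x k, B is k x n, C is m x n. */
static void ref_dgemm_rowmajor_nn(int m, int n, int k, double alpha,
                                  const double *a, int lda,
                                  const double *b, int ldb,
                                  double beta, double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[(size_t)i * lda + p] * b[(size_t)p * ldb + j];
            double *cij = &c[(size_t)i * ldc + j];
            /* As in BLAS, C is not read when beta is zero. */
            *cij = (beta == 0.0) ? alpha * sum
                                 : alpha * sum + beta * *cij;
        }
    }
}

/* Largest absolute element-wise difference between two buffers. */
static double max_abs_diff(const double *x, const double *y, size_t len)
{
    double worst = 0.0;
    for (size_t i = 0; i < len; ++i) {
        double d = x[i] - y[i];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

Running the same inputs through cblas_dgemm and through ref_dgemm_rowmajor_nn and then checking max_abs_diff against a small tolerance would show whether the library result deviates. A crash inside the library call with inputs that the reference loop handles fine points at the build rather than at the caller.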

@xinsuinizhuan
Author

Can you post a screenshot of CPU-Z?
It looks like your CPU is not (yet, as of v0.3.6) detected by OpenBLAS. It should not be broken, but it is better to have it properly added to OpenBLAS.

Policy warnings are OK: we support old versions (think of what is available on a 10-year-old commercially supported enterprise Linux). There are no defects in the whole log shown.

[CPU-Z screenshot]

@brada4
Contributor

brada4 commented Aug 8, 2019

I wanted stepping numbers from CPU-Z to drill through cpuid_x86.c ;-)
Can you confirm this one https://valid.x86.fr/cache/screenshot/s0hgmn.png ?

@martin-frbg
Collaborator

@brada4 the cpuid really plays no role here (and i5-7500 should be detected alright) - on Windows with just the VS compiler, getarch is called (by prebuild.cmake) with -DFORCE_GENERIC to prevent the compiler from stumbling into assembly code that it cannot handle anyway.

@xinsuinizhuan
Author

I wanted stepping numbers from CPU-Z to drill through cpuid_x86.c ;-)
Can you confirm this one https://valid.x86.fr/cache/screenshot/s0hgmn.png ?
This is my CPU-Z screenshot:
[CPU-Z screenshot]

@brada4
Contributor

brada4 commented Aug 9, 2019

Martin already explained: your compiler is limited; the CPU is not at fault.

@xinsuinizhuan
Author

Martin already explained: your compiler is limited; the CPU is not at fault.

So the OpenBLAS that I built with CMake is held back by VS. To use OpenBLAS on Windows, must I compile it with clang or cross-compile it with MinGW?

@TiborGY
Contributor

TiborGY commented Aug 9, 2019 via email
