Issue with multi-threaded sgemm kernel (reading past end of block -> segmentation faults) #535
Comments
Sorry, more info: Compiled with this command: I couldn't get a stack trace, although it was compiled with debug. Sorry.
@danpovey, thank you for the feedback.
It sometimes causes a segfault, yes.
... it's in a larger application. But with other math libraries it's OK, and it's also OK without threaded OpenBLAS.
Is sgemm called immediately after set_num_threads? Is this the same as #447?
No, I don't think we are calling set_num_threads at all.
Dan
Also, this is what we're linking with, which may tell you something. Amit, I'd like to know how you were compiling OpenBLAS.
Oh, I see this is how we were compiling OpenBLAS:
Dan
Above is valgrind output, right? If so, perhaps you can get more information by running valgrind with the
I tried that, but unfortunately could not get a stack trace. I'm trying to
After recompiling with debug I was still not able to get a stack trace, for
0x0000000009e1a141 in sgemm_kernel () at
2267 movq K, %rax
==16059== Thread 2:
In the OpenBLAS gemm kernel, we store the frame register (rbp) on the stack and then use rbp for matrix B. Thus, a debugger may not be able to get a stack trace inside the kernel.
Let me know if there is a workaround that I can do in gdb.
@danpovey, please try the latest develop branch. I think I fixed this bug.
Thanks a lot for fixing it so fast! Yes, the issue seems to have gone away.
Thanks for solving this so quickly - you guys are awesome!
@amitbeka, thank you for choosing OpenBLAS.
==4126== Invalid read of size 16
==4126== at 0x9D12F47: sgemm_kernel (in /home/ubuntu/workspace/kaldi/tools/OpenBLAS/install/lib/libopenblas_sandybridgep-r0.2.14.so)
==4126== Address 0x29fe95cc is 16,556 bytes inside a block of size 16,560 alloc'd
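For readers checking the numbers in the valgrind report above, here is a minimal sketch of the arithmetic. The helper name `bytes_past_end` is hypothetical and only illustrates the calculation; it is not part of OpenBLAS or valgrind.

```python
def bytes_past_end(offset_in_block: int, read_size: int, block_size: int) -> int:
    """Return how many bytes of a read fall beyond the end of the block (0 if none)."""
    read_end = offset_in_block + read_size
    return max(0, read_end - block_size)


# Values taken from the valgrind report: a 16-byte read starting
# 16,556 bytes into a block of 16,560 bytes.
overrun = bytes_past_end(offset_in_block=16_556, read_size=16, block_size=16_560)
print(overrun)  # 12
```

So only the first 4 bytes of the 16-byte read lie inside the allocation. Assuming 4-byte single-precision floats, as sgemm uses, that is one valid element followed by three elements' worth of bytes past the end of the block.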