segmentation fault in dgemv_n() when using more than a certain number of threads #373

Closed
vahuja4 opened this issue May 22, 2014 · 6 comments

Comments

@vahuja4

vahuja4 commented May 22, 2014

Hi,

I compiled OpenBLAS using the flag USE_OPENMP=1. Each OpenMP thread performs matrix multiplication on a 50x50 matrix of doubles, and each thread allocates its matrix on the heap. The code runs fine with up to 7 threads, but when I try with 8, I get the following:

(gdb) run 8
Starting program: /home/vishal/Desktop/filters/CKF_Armadillo/src/CKF 8
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
[New Thread 0xb5062b40 (LWP 2924)]
[New Thread 0xb4eb9b40 (LWP 2925)]
[New Thread 0xb4d10b40 (LWP 2926)]
[New Thread 0xb4b67b40 (LWP 2927)]
[New Thread 0xb49beb40 (LWP 2928)]
[New Thread 0xb36a7b40 (LWP 2929)]
[New Thread 0xb34feb40 (LWP 2930)]
BLAS : Program is Terminated. Because you tried to allocate too many memory regions.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb34feb40 (LWP 2930)]
0xb7044898 in dgemv_n () from /usr/lib/libopenblas.so.0
(gdb) bt
#0 0xb7044898 in dgemv_n () from /usr/lib/libopenblas.so.0
#1 0xb788aff4 in ?? () from /usr/lib/libopenblas.so.0

Backtrace stopped: previous frame inner to this frame (corrupt stack?)

If I set the number of OpenMP threads to 1, I can run as many as 512 such programs simultaneously. Can someone please help?

@martin-frbg
Collaborator

Google found me this topic on the old (gmane-archived) OpenBLAS discussion list. The solution there was to recompile OpenBLAS with NUM_THREADS=8 (or however many threads you need). You could check whether this still works with the current version.
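
A rebuild along those lines might look like the sketch below. It assumes an OpenBLAS source checkout in ./OpenBLAS (an illustrative path, overridable through the hypothetical OPENBLAS_SRC variable) and reuses the USE_OPENMP=1 flag from the original report; the value 8 simply matches the failing run.

```shell
# Sketch: rebuild OpenBLAS with a compile-time thread limit of 8.
# Assumption: the source tree lives in ./OpenBLAS; adjust OPENBLAS_SRC as needed.
OPENBLAS_SRC="${OPENBLAS_SRC:-./OpenBLAS}"
BUILD_FLAGS="USE_OPENMP=1 NUM_THREADS=8"

if [ -d "$OPENBLAS_SRC" ]; then
    make -C "$OPENBLAS_SRC" clean
    # BUILD_FLAGS is left unquoted on purpose so each flag becomes its own argument.
    make -C "$OPENBLAS_SRC" $BUILD_FLAGS
else
    echo "no OpenBLAS checkout at $OPENBLAS_SRC; would run: make $BUILD_FLAGS"
fi
```

After rebuilding, the program has to be relinked against (or pointed at) the new library, otherwise the gdb session will still load /usr/lib/libopenblas.so.0.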

@vahuja4
Author

vahuja4 commented May 23, 2014

My program is multithreaded, so I don't want OpenBLAS to spawn any more threads. I also used the flag USE_THREADS=0 when compiling OpenBLAS. Please let me know if this does not sound correct.

@vahuja4
Author

vahuja4 commented May 23, 2014

I thought I would try the develop branch with my code. Please tell me if this looks okay:

vishal@vishal-Think:/OpenBlas_dev$ git clone https://github.com/xianyi/OpenBLAS.git
Cloning into 'OpenBLAS'...
remote: Reusing existing pack: 11849, done.
remote: Total 11849 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (11849/11849), 12.33 MiB | 4.79 MiB/s, done.
Resolving deltas: 100% (7491/7491), done.
vishal@vishal-Think:/OpenBlas_dev$ git checkout develop
fatal: Not a git repository (or any of the parent directories): .git
vishal@vishal-Think:/OpenBlas_dev$ ls
OpenBLAS
vishal@vishal-Think:/OpenBlas_dev$ cd OpenBLAS/
vishal@vishal-Think:/OpenBlas_dev/OpenBLAS$ git checkout develop
Already on 'develop'
vishal@vishal-Think:/OpenBlas_dev/OpenBLAS$

@martin-frbg
Collaborator

Check whether "git log" in your local repository shows the same latest changes as those displayed on GitHub.

@xianyi
Collaborator

xianyi commented May 26, 2014

Sorry for the delay. I was preparing for my Ph.D. defense this week.

It looks like a bug in OpenBLAS. @vahuja4, could you provide the test code?

@martin-frbg
Collaborator

Have you tried setting the OMP_STACKSIZE variable, as was suggested to you on Stack Overflow?
(If you cannot provide a small self-contained test code here, running your code under valgrind in addition to gdb might help track down the problem.)
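
Both suggestions could be tried roughly as follows. This is only a sketch: the 16M stack size is an arbitrary illustrative value rather than a known fix, and ./CKF with the argument 8 is the binary and thread count from the gdb session above.

```shell
# Sketch: raise the per-thread OpenMP stack size, then rerun the failing case.
# Assumption: 16M is an arbitrary example value, not a recommended setting.
export OMP_STACKSIZE=16M

# ./CKF is the reporter's program; the block is skipped if it is not present.
if [ -x ./CKF ]; then
    ./CKF 8
    # Optionally repeat under valgrind to look for invalid reads or writes.
    if command -v valgrind >/dev/null 2>&1; then
        valgrind ./CKF 8
    fi
fi
```

If the crash persists with a larger stack, the stack size is probably not the cause and the valgrind output becomes the more useful lead.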

@wernsaar wernsaar closed this as completed Jun 4, 2014