Multithreaded version of OpenBLAS causes intermittent segfaults when starting Julia #229
Comments
Thank you for the report. I will test it. Could you try to build OpenBLAS with NO_AFFINITY=1? Xianyi
OK, I built it with NO_AFFINITY=1. I still get intermittent segfaults when starting Julia -- see below. Is there anything else I should try?
Cheers, Stuart
[sdb@localhost julia]$ ./julia
julia>
julia>
Hi @brorson, which AMD processor do you have? Is it Bulldozer? Could you try to build OpenBLAS with USE_OPENMP=1? Xianyi
OK, I just did the rebuild with USE_OPENMP=1. Julia starts reliably now -- no segfaults. See below.
Oh, and I was wrong about the AMD processor. My laptop is an 8-way Intel processor. Here's the cpuinfo:
[sdb@localhost AHS]$ cat /proc/cpuinfo
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
Finally, here's my current Makefile.config_last in the openblas directory:
OSNAME=Linux
Julia starts reliably now, with USE_OPENMP=1:
[sdb@localhost julia]$ ./julia
julia>
julia>
julia>
Do you think this was due to #221?
I cloned and built the latest Julia on my FC13 machine (x64, AMD processor). When I start Julia, I get intermittent segfaults. Here's my machine info:
[sdb@localhost examples]$ uname -a
Linux localhost.localdomain 2.6.34.7-61.fc13.x86_64 #1 SMP Tue Oct 19 04:06:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
I ran valgrind --tool=drd on Julia. This tool analyzes the behavior of threads and reports problems and conflicts between the various threads. I find many conflicts involving OpenBLAS threads. See the partial log from valgrind below.
When I set OPENBLAS_NUM_THREADS=1, Julia starts up perfectly each time -- no segfaults.
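Because the crash is intermittent, it can help to put a number on it. Here is a rough sketch of two shell helpers for that: one runs a command repeatedly and counts the failed runs, the other runs a command with OpenBLAS limited to one thread. The function names `count_failures` and `single_threaded` are made up for this illustration; only the `OPENBLAS_NUM_THREADS` variable comes from the report above.

```shell
# count_failures N CMD...: run CMD N times and print how many runs
# exited with a non-zero status (a segfault counts as a failure).
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output and detach stdin so the command cannot wait for input.
        "$@" >/dev/null 2>&1 </dev/null || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# single_threaded CMD...: run CMD with OpenBLAS restricted to one thread
# (hypothetical helper name; the variable is the one used in this report).
single_threaded() {
    OPENBLAS_NUM_THREADS=1 "$@"
}
```

Something like `count_failures 20 ./julia -e '1'` versus `count_failures 20 single_threaded ./julia -e '1'` would then compare the two configurations. The `-e '1'` argument is only an assumption about how to make Julia start and exit without entering the prompt; substitute whatever non-interactive invocation works for your build.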
There appear to be some multithreading issues in OpenBLAS that cause the intermittent Julia start-up failures on my machine. Please take a look at the valgrind log below and see if any problems can be identified from it.
[sdb@localhost examples]$ valgrind --tool=drd ../julia
==6424== drd, a thread error detector
==6424== Copyright (C) 2006-2009, and GNU GPL'd, by Bart Van Assche.
==6424== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==6424== Command: ../julia
==6424==
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
==6424== Thread 3:
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D45D: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D4FF: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting store by thread 3 at 0x0e4ecb54 size 4
==6424== at 0xDA2D5C8: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
... etc ....
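Since the full log is long and repetitive, a summary may be easier to read than the raw output. The following is a minimal sketch, assuming the DRD report layout shown above (a "Conflicting load/store" line followed by an "at 0x...: function" line). The helper name `summarize_drd` is hypothetical, and other Valgrind versions may format reports differently.

```shell
# summarize_drd: read a DRD log on stdin and print one line per
# (access kind, function) pair: "<count> <load|store> <function>".
summarize_drd() {
    awk '
        # Start of a conflict report, e.g.
        # "==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4"
        /Conflicting (load|store) by thread/ {
            kind = $3
            want = 1
            next
        }
        # First stack frame of that report, e.g.
        # "==6424==    at 0xDA2D45D: blas_memory_alloc (in ...)"
        want && $2 == "at" {
            count[kind " " $4]++
            want = 0
        }
        END {
            for (key in count) print count[key], key
        }
    ' | sort -rn
}
```

It could be fed directly from the tool, for example `valgrind --tool=drd ../julia 2>&1 | summarize_drd`, since Valgrind writes its report to stderr by default.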