Description
I cloned and built the latest Julia on my FC13 machine (x64, AMD processor). When I start Julia, I get intermittent segfaults. Here's my machine info:
[sdb@localhost examples]$ uname -a
Linux localhost.localdomain 2.6.34.7-61.fc13.x86_64 #1 SMP Tue Oct 19 04:06:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
I ran valgrind --tool=drd on Julia. This tool analyzes thread behavior and reports conflicting memory accesses between threads. It reports many conflicts involving OpenBLAS threads. See the partial valgrind log below.
When I set OPENBLAS_NUM_THREADS=1, Julia starts up cleanly every time, with no segfaults.
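Because the failure is intermittent, a rough way to compare the two settings is to start Julia many times and count the failing runs. The sketch below is only an illustration: `count_failures` is a hypothetical helper name, and the `../julia -e '1'` invocation in the comments is an assumption about how to make Julia exit right after start-up.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any nonzero exit status counts as a failure; a segfault
        # typically shows up in the shell as status 139 (128 + SIGSEGV).
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example comparison (assumed invocation, adjust the path and flags as needed):
#   count_failures 20 ../julia -e '1'
#   count_failures 20 env OPENBLAS_NUM_THREADS=1 ../julia -e '1'
```

Passing the variable through `env` keeps the setting local to each Julia run instead of changing the calling shell.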
There appear to be multithreading issues in OpenBLAS that cause the intermittent Julia start-up failures on my machine. Please take a look at the valgrind log below and see if any problems can be identified from it.
[sdb@localhost examples]$ valgrind --tool=drd ../julia
==6424== drd, a thread error detector
==6424== Copyright (C) 2006-2009, and GNU GPL'd, by Bart Van Assche.
==6424== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==6424== Command: ../julia
==6424==
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
==6424== Thread 3:
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D45D: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D4FF: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting store by thread 3 at 0x0e4ecb54 size 4
==6424== at 0xDA2D5C8: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
... etc ....
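
Since the full log is long and repetitive, it may help to condense it into one line per distinct conflict. The following is a minimal sketch, assuming the drd output was saved to a file (for example with valgrind's `--log-file` option); `summarize_drd` and the file name `drd.log` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count identical "Conflicting load/store" reports
# in a saved drd log and print them, most frequent first.
# Usage: summarize_drd LOGFILE
summarize_drd() {
    awk '/Conflicting (load|store)/ {
        sub(/^==[0-9]+== /, "")   # drop the ==PID== prefix
        count[$0]++               # tally each distinct report line
    }
    END { for (k in count) print count[k], k }' "$1" | sort -rn
}

# Example (assumed file name):
#   valgrind --tool=drd --log-file=drd.log ../julia
#   summarize_drd drd.log
```

Each output line is a count followed by the conflict description, which makes it easier to see which addresses are reported most often.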