Multithreaded version of OpenBLAS causes intermittent segfaults when starting Julia #229

Closed
@brorson

I cloned and built the latest Julia on my FC13 machine (x64, AMD processor). When I start Julia, I get intermittent segfaults. Here's my machine info:

[sdb@localhost examples]$ uname -a
Linux localhost.localdomain 2.6.34.7-61.fc13.x86_64 #1 SMP Tue Oct 19 04:06:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

I ran valgrind --tool=drd on Julia. This tool analyzes thread behavior and reports data races and other conflicts between threads. It reports many conflicts involving OpenBLAS threads. See the partial valgrind log below.

When I set OPENBLAS_NUM_THREADS=1, Julia starts up cleanly every time, with no segfaults.
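
A minimal sketch of that workaround, assuming a POSIX-style shell. The `../julia` path is the one from the session in this report; adjust it to your own build tree.

```shell
# Limit OpenBLAS to a single thread for everything started from this shell.
export OPENBLAS_NUM_THREADS=1

# Confirm that child processes actually inherit the setting.
sh -c 'echo "OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS"'

# Then start Julia from the same shell as usual, for example:
#   ../julia
```

To apply the setting to a single run without exporting it, the variable can also be given on the command line, as in `OPENBLAS_NUM_THREADS=1 ../julia`.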

There appear to be multithreading issues in OpenBLAS that cause the intermittent Julia start-up failures on my machine. Please take a look at the valgrind log below and see whether any problems can be identified from it.


[sdb@localhost examples]$ valgrind --tool=drd ../julia
==6424== drd, a thread error detector
==6424== Copyright (C) 2006-2009, and GNU GPL'd, by Bart Van Assche.
==6424== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==6424== Command: ../julia
==6424==
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
--6424-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8
==6424== Thread 3:
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D45D: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting load by thread 3 at 0x0e4ecb14 size 4
==6424== at 0xDA2D4FF: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424==
==6424== Conflicting store by thread 3 at 0x0e4ecb54 size 4
==6424== at 0xDA2D5C8: blas_memory_alloc (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0xDA2E0D4: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
==6424== Allocation context: BSS section of /usr/local/src/julia/usr/lib/libopenblas.so
==6424== Other segment start (thread 2)
==6424== at 0x345DCE14C1: clone (in /lib64/libc-2.12.1.so)
==6424== Other segment end (thread 2)
==6424== at 0x345DCC8897: sched_yield (in /lib64/libc-2.12.1.so)
==6424== by 0xDA2E11E: ??? (in /usr/local/src/julia/usr/lib/libopenblas.so)
==6424== by 0x4A0CE50: vgDrd_thread_wrapper (drd_pthread_intercepts.c:272)
==6424== by 0x345E007760: start_thread (in /lib64/libpthread-2.12.1.so)
==6424== by 0x345DCE14FC: clone (in /lib64/libc-2.12.1.so)
... etc ....
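
Because the failure is intermittent, it may help to count how often it happens with and without the thread limit. The helper below is a hypothetical sketch, not something that was run for this report: `count_crashes` is a made-up name, and it assumes a POSIX shell in which a process killed by SIGSEGV is reported with exit status 139 (128 + 11). Standard input is redirected from /dev/null on the assumption that the program under test exits on end-of-file.

```shell
# count_crashes N CMD...: run CMD N times and report how many runs
# ended with exit status 139 (128 + SIGSEGV).
count_crashes() {
    runs=$1
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # Discard output and give the command an empty stdin.
        "$@" >/dev/null 2>&1 </dev/null || status=$?
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes/$runs runs segfaulted"
}

# Example usage (paths taken from this report):
#   count_crashes 20 ../julia
#   OPENBLAS_NUM_THREADS=1 count_crashes 20 ../julia
```

Comparing the two counts would show whether the single-thread setting removes the crashes or only makes them rarer.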
