SEGFAULT in test/zblat2 on IA32 with gcc 4.6 #32

Closed
JeffBezanson opened this issue May 31, 2011 · 7 comments
@JeffBezanson

I get the following:

OPENBLAS_NUM_THREADS=1 ./zblat1
Complex BLAS Test Program Results

Test of subprogram number 1 ZDOTC
make[1]: *** [level1] Segmentation fault (core dumped)
make[1]: *** Waiting for unfinished jobs....
OPENBLAS_NUM_THREADS=1 ./dblat2 < ./dblat2.dat
OPENBLAS_NUM_THREADS=1 ./cblat2 < ./cblat2.dat
OPENBLAS_NUM_THREADS=1 ./zblat2 < ./zblat2.dat
/bin/sh: line 1: 32249 Segmentation fault (core dumped) OPENBLAS_NUM_THREADS=1 ./zblat2 < ./zblat2.dat
make[1]: *** [level2] Error 139
make[1]: Leaving directory `/home/public/download/OpenBLAS/test'
make: *** [tests] Error 2

When I build with DEBUG=1, the problem goes away.

This is my CPU:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz
stepping : 2
cpu MHz : 3197.934
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-pc-linux-gnu/4.6.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /build/src/gcc-4.6-20110429/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --enable-gnu-unique-object --enable-linker-build-id --with-ppl --enable-cloog-backend=isl --enable-lto --enable-gold --enable-ld=default --enable-plugin --with-plugin-ld=ld.gold --disable-multilib --disable-libstdcxx-pch --enable-checking=release
Thread model: posix
gcc version 4.6.0 20110429 (prerelease) (GCC)

@xianyi
Collaborator

xianyi commented May 31, 2011

Hi Jeff,

I cannot reproduce this core dump.

The test box is Intel Core2 Q8400, Ubuntu 10.04 and gcc 4.4.3.

What's your OpenBLAS version? Could you try develop branch?

Thanks

Xianyi Zhang

@JeffBezanson
Author

It still happens with the develop branch. Can you try gcc 4.6.0? Maybe that is the key difference.

I'm seeing what looks like a mis-adjusted stack pointer in gbmv_kernel:

#0 0x0807fe7d in gbmv_kernel (args=0xbfffb3e4, range_m=0xbfffb204, range_n=0xbfffb1f0,
dummy1=0x0, buffer=, pos=0) at gbmv_thread.c:145
145 result = MYDOT(ll - uu, a + uu * COMPSIZE, 1, x + uu * COMPSIZE, 1);
(gdb) n
gbmv_kernel (args=0x808afc4, range_m=0xbfffb3e4, range_n=0xbfffb204, dummy1=0xbfffb1f0,
buffer=, pos=-1242758912) at gbmv_thread.c:150
150 *(y + 0) += CREAL(result);

After calling MYDOT (zdotu_k) it looks like the arguments to gbmv_kernel are shifted.
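
One way to picture the shift reported above: if the stack pointer ends up one 4-byte slot away from where the compiler expects it, every stack-relative argument read lands on its neighbour. The C sketch below models this with a plain array. The addresses are copied from the gdb output; `stack_words`, `kernel_args`, and `read_args` are hypothetical names used only for illustration, and the position of the extra word relative to the arguments is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Words as they might sit on the stack. Slot 0 is assumed to be the word
   adjacent to the first argument; the rest are the values gdb printed
   before the call to MYDOT. */
static const uint32_t stack_words[] = {
    0x0808afc4u, /* neighbouring word (assumption) */
    0xbfffb3e4u, /* args    */
    0xbfffb204u, /* range_m */
    0xbfffb1f0u, /* range_n */
    0x00000000u, /* dummy1  */
};

typedef struct {
    uint32_t args, range_m, range_n, dummy1;
} kernel_args;

/* Read four consecutive argument slots starting at index `base`.
   base == 1 models a correct stack pointer, base == 0 one that is
   off by a single 4-byte slot. */
kernel_args read_args(int base) {
    kernel_args a = {
        stack_words[base],
        stack_words[base + 1],
        stack_words[base + 2],
        stack_words[base + 3],
    };
    return a;
}
```

With `base == 0`, the old value of `args` shows up as `range_m`, the old `range_m` as `range_n`, and so on, which is the pattern in the second gdb frame.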

@ghost ghost assigned xianyi Jun 1, 2011
@xianyi
Collaborator

xianyi commented Jun 2, 2011

Hi Jeff,

When I build the library with gcc 4.6, I hit this error too :)

Thanks
Xianyi

@JeffBezanson
Author

GCC 4.6.0 seems to have some regressions, and it seems possible that this is a compiler bug. Do you think we should consider filing a bug against gcc, or just wait for 4.6.1?

@xianyi xianyi closed this as completed in 31040e4 Jun 3, 2011
@xianyi
Collaborator

xianyi commented Jun 3, 2011

Hi Jeff,
I fixed this bug on the develop branch.
Could you help me verify it?

Thanks
Xianyi Zhang

@JeffBezanson
Author

Yes, it works for me now, thank you!
Out of curiosity, why is zdot_sse2.S affected by this but not any of the other x86 zdot implementations (zdot, zdot_amd, zdot_sse)?

@xianyi
Collaborator

xianyi commented Jun 3, 2011

I think zdot_sse.S uses a different method to return its value.

The code in zdot_sse.S is:
.L999:
subl $2 * SIZE, %esp
movss %xmm0, 0 * SIZE(%esp)
movss %xmm1, 1 * SIZE(%esp)
movl 0 * SIZE(%esp), %eax
movl 1 * SIZE(%esp), %edx
addl $2 * SIZE, %esp

Xianyi
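
Read as C, the sequence above appears to spill two single-precision values to the stack and reload them as integers, so the result travels in the %eax/%edx register pair rather than through memory supplied by the caller. Below is a minimal sketch of that bit-level packing, assuming SIZE is 4 bytes here (as the movss/movl pairing suggests). `pack_complex_float` is a hypothetical helper, not an OpenBLAS function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

/* Pack the bit patterns of two floats into one 64-bit value:
   low 32 bits  = real part      (what movl would load into %eax),
   high 32 bits = imaginary part (what movl would load into %edx). */
uint64_t pack_complex_float(float re, float im) {
    uint32_t lo, hi;
    memcpy(&lo, &re, sizeof lo); /* reinterpret bits, no conversion */
    memcpy(&hi, &im, sizeof hi);
    return ((uint64_t)hi << 32) | lo;
}
```

Because no hidden pointer is involved in this scheme, the callee has nothing extra to pop, which would be consistent with zdot_sse.S being unaffected.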

xianyi pushed a commit that referenced this issue Sep 14, 2011
In the i386 calling convention, the caller passes the address for zdot's return value as a hidden first parameter.
Thus, the callee should pop this address before returning.
I already fixed the same bug in x86/zdot_sse2.S (issue #32). However, that fix was not a good implementation, since it used 3 instructions. Mr. John told me to use "ret $0x4" to skip the hidden first address (4 bytes).
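
The hidden parameter described in this commit message can be sketched in C. The first function below returns a double-precision complex value the way source code sees it; the second spells out the pointer that the i386 System V convention adds when a result is returned in memory. The type `zcomplex` and both function names are hypothetical stand-ins for illustration, not OpenBLAS symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a double-precision complex number. */
typedef struct {
    double re, im;
} zcomplex;

/* Source-level view: unconjugated dot product returned by value. */
zcomplex zdotu_by_value(size_t n, const zcomplex *x, const zcomplex *y) {
    zcomplex acc = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        acc.re += x[i].re * y[i].re - x[i].im * y[i].im;
        acc.im += x[i].re * y[i].im + x[i].im * y[i].re;
    }
    return acc;
}

/* ABI-level view: the caller supplies the result address as an extra
   leading argument. On i386 this pointer is pushed by the caller and
   popped by the callee, which is what "ret $0x4" does in assembly. */
void zdotu_hidden_ptr(zcomplex *ret, size_t n,
                      const zcomplex *x, const zcomplex *y) {
    *ret = zdotu_by_value(n, x, y);
}
```

A hand-written assembly kernel has to honour the second form itself. If it returns with a plain "ret", the 4-byte pointer stays on the stack and the caller's stack pointer is off by one slot afterwards, matching the symptom reported in this issue.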
martin-frbg added a commit that referenced this issue Feb 20, 2020