Skip to content

zblat1 test failure during build #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jpsinthemix opened this issue Sep 11, 2011 · 5 comments
Closed

zblat1 test failure during build #58

jpsinthemix opened this issue Sep 11, 2011 · 5 comments
Assignees
Labels

Comments

@jpsinthemix
Copy link

Hi,

I get the following testsuite error when building openblas-0.1a2.2git:

OPENBLAS_NUM_THREADS=1 ./zblat1
make[1]: *** [level1] Segmentation fault
make[1]: Leaving directory `/home/bld/openblas-0.1a2.2git-jps_src/openblas-0.1a2.2git/test'
make: *** [tests] Error 2

I'm am building on a (very old) 900 MHz Thinkpad T22 Pentium3 laptop:

model name: Pentium III (Coppermine)
flags: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pse36 mmx fxsr sse

under i686-pc-linux-gnu with linux-3.0.3, gcc-4.6.1, glibc-2.14, and binutils-2.21.1a, and am using gfortran.

For reference, after patching to get a successful build with all tests passing, the build summary is:

OpenBLAS build complete.

OS ... Linux
Architecture ... x86
BINARY ... 32bit
C compiler ... GCC (command line : gcc)
Fortran compiler ... GFORTRAN (command line : gfortran)
Library Name ... libopenblas_coppermine-r0.1alpha2.2.a (Single threaded)

Also for reference, the compilation environment/switches used for building these routines is:

On examining the dot.f routines in kernel/x86, there appear to be cases where 'hidden' parameters may not be handled properly on FUNCTION returns. The patch below addresses the issue in kernel/x86/zdot.S (which causes the zblat1 test failure), and makes similar changes in kernel/x86/xdot.S and kernel/x86/zdot_sse2.S. The routine kernel/x86/qdot.S may also need similar changes, but I'm not sure about this one, and the remaining dot.f look ok.

This patch is based on my (hopefully correct) understanding that the default gfortran calling conventions for DOUBLE COMPLEX (COMPLEX_16), REAL_16, and COMPLEX*32 involve the use of 'hidden' parameter addresses on the stack for the returned value which should be removed from the stack by the callee. To do this removal, I simply conditionally use 'ret $4' instead of 'ret' in the patch.

Also for reference, the compilation environment/switches used for building these routinesa follows:

common to all:
gcc -c -O2 -Wall -DMAX_CPU_NUMBER=1 -DEXPRECISION -m128bit-long-double -m32 -DF_INTERFACE_GFORT -fPIC

sdot_k_ -UDOUBLE -UCOMPLEX ../kernel/x86/dot_sse.S -o sdot_k.o
sdsdot_k -UDOUBLE -UCOMPLEX ../kernel/x86/dot_sse.S -o sdsdot_k.o
dsdot_k -UDOUBLE -UCOMPLEX -DDSDOT ../kernel/x86/dot_sse.S -o dsdot_k.o

ddot_k -DDOUBLE -UCOMPLEX ../kernel/x86/dot.S -o ddot_k.o

cdotc_k -UDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/zdot_sse.S -o cdotc_k.o
cdotu_k -UDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/zdot_sse.S -o cdotu_k.o

zdotc_k -DDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/zdot.S -o zdotc_k.o
zdotu_k -DDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/zdot.S -o zdotu_k.o

qdot_k -DXDOUBLE -UCOMPLEX ../kernel/x86/qdot.S -o qdot_k.o

xdotc_k -DXDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/xdot.S -o xdotc_k.o
xdotu_k -DXDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/xdot.S -o xdotu_k.o

---- patch:
--- openblas-0.1a2.2git.old/kernel/x86/zdot.S 2011-08-07 02:49:50.000000000 -0400
+++ openblas-0.1a2.2git.new/kernel/x86/zdot.S 2011-09-11 00:45:01.164200335 -0400
@@ -283,7 +283,12 @@
popl %ebx
popl %esi
popl %edi
+
+#if defined(DOUBLE) || defined(XDOUBLE)

  •   ret $0x4
    

    +#else
    ret
    +#endif
    ALIGN_3

    .L88:
    @@ -305,6 +310,11 @@
    popl %ebx
    popl %esi
    popl %edi
    +
    +#if defined(DOUBLE) || defined(XDOUBLE)

  •   ret $0x4
    

    +#else
    ret
    +#endif

    EPILOGUE
    

    --- openblas-0.1a2.2git.old/kernel/x86/xdot.S 2011-08-07 02:49:50.000000000 -0400
    +++ openblas-0.1a2.2git.new/kernel/x86/xdot.S 2011-09-11 01:05:54.897543384 -0400
    @@ -307,7 +307,11 @@
    popl %ebx
    popl %esi
    popl %edi
    +#if defined(F_INTERFACE) && defined(RETURN_BY_STACK)

  •   ret     $4
    

    +#else
    ret
    +#endif
    ALIGN_3

    .L88:
    @@ -326,6 +330,10 @@
    popl %ebx
    popl %esi
    popl %edi
    +#if defined(F_INTERFACE) && defined(RETURN_BY_STACK)

  •   ret     $4
    

    +#else
    ret
    +#endif

    EPILOGUE
    

    --- openblas-0.1a2.2git.old/kernel/x86/zdot_sse2.S 2011-08-07 02:49:50.000000000 -0400
    +++ openblas-0.1a2.2git.new/kernel/x86/zdot_sse2.S 2011-09-11 01:01:06.122841604 -0400
    @@ -1541,8 +1541,9 @@
    popl %ebx
    popl %esi
    popl %edi
    -/remove the hidden return value address from the stack./

  •   popl    %ecx
    
  •   xchgl   %ecx, 0(%esp)
    
  •   ret
    

    +#if defined(DOUBLE) || defined(XDOUBLE)

  •    ret $0x4
    

    +#else

  •    ret
    

    +#endif
    EPILOGUE

thanks much for your time.
John

@xianyi
Copy link
Collaborator

xianyi commented Sep 11, 2011

Thank you for the patch. I will check this issue in next week.
Xianyi
2011 9 11 14:38"jpsinthemix" <
[email protected]

д
Hi,

I get the following testsuite error when building openblas-0.1a2.2git:

OPENBLAS_NUM_THREADS=1 ./zblat1
make[1]: *** [level1] Segmentation fault
make[1]: Leaving directory
`/home/bld/openblas-0.1a2.2git-jps_src/openblas-0.1a2.2git/test'
make: *** [tests] Error 2

I'm am building on a (very old) 900 MHz Thinkpad T22 Pentium3 laptop:

model name: Pentium III (Coppermine)
flags: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pse36 mmx
fxsr sse

under i686-pc-linux-gnu with linux-3.0.3, gcc-4.6.1, glibc-2.14, and
binutils-2.21.1a, and am using gfortran.

For reference, after patching to get a successful build with all tests
passing, the build summary is:

OpenBLAS build complete.

OS ... Linux
Architecture ... x86
BINARY ... 32bit
C compiler ... GCC (command line : gcc)
Fortran compiler ... GFORTRAN (command line : gfortran)
Library Name ... libopenblas_coppermine-r0.1alpha2.2.a (Single threaded)

Also for reference, the compilation environment/switches used for building
these routines is:

On examining the dot.f routines in kernel/x86, there appear to be cases
where 'hidden' parameters may not be handled properly on FUNCTION returns.
The patch below addresses the issue in kernel/x86/zdot.S (which causes the
zblat1 test failure), and makes similar changes in kernel/x86/xdot.S and
kernel/x86/zdot_sse2.S. The routine kernel/x86/qdot.S may also need similar
changes, but I'm not sure about this one, and the remaining dot.f look ok.

This patch is based on my (hopefully correct) understanding that the
default gfortran calling conventions for DOUBLE COMPLEX (COMPLEX_16),
REAL_16, and COMPLEX*32 involve the use of 'hidden' parameter addresses on
the stack for the returned value which should be removed from the stack by
the callee. To do this removal, I simply conditionally use 'ret $4' instead
of 'ret' in the patch.

Also for reference, the compilation environment/switches used for building
these routinesa follows:

common to all:
gcc -c -O2 -Wall -DMAX_CPU_NUMBER=1 -DEXPRECISION -m128bit-long-double
-m32 -DF_INTERFACE_GFORT -fPIC

sdot_k_ -UDOUBLE -UCOMPLEX ../kernel/x86/dot_sse.S -o sdot_k.o
sdsdot_k -UDOUBLE -UCOMPLEX ../kernel/x86/dot_sse.S -o sdsdot_k.o
dsdot_k -UDOUBLE -UCOMPLEX -DDSDOT ../kernel/x86/dot_sse.S -o dsdot_k.o

ddot_k -DDOUBLE -UCOMPLEX ../kernel/x86/dot.S -o ddot_k.o

cdotc_k -UDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/zdot_sse.S -o cdotc_k.o
cdotu_k -UDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/zdot_sse.S -o cdotu_k.o

zdotc_k -DDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/zdot.S -o zdotc_k.o
zdotu_k -DDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/zdot.S -o zdotu_k.o

qdot_k -DXDOUBLE -UCOMPLEX ../kernel/x86/qdot.S -o qdot_k.o

xdotc_k -DXDOUBLE -DCOMPLEX -DCONJ ../kernel/x86/xdot.S -o xdotc_k.o
xdotu_k -DXDOUBLE -DCOMPLEX -UCONJ ../kernel/x86/xdot.S -o xdotu_k.o

---- patch:
--- openblas-0.1a2.2git.old/kernel/x86/zdot.S 2011-08-07
02:49:50.000000000 -0400
+++ openblas-0.1a2.2git.new/kernel/x86/zdot.S 2011-09-11
00:45:01.164200335 -0400
@@ -283,7 +283,12 @@
popl %ebx
popl %esi
popl %edi
+
+#if defined(DOUBLE) || defined(XDOUBLE)

  • ret $0x4
    +#else
    ret
    +#endif
    ALIGN_3

.L88:
@@ -305,6 +310,11 @@
popl %ebx
popl %esi
popl %edi
+
+#if defined(DOUBLE) || defined(XDOUBLE)

  • ret $0x4
    +#else
    ret
    +#endif

EPILOGUE
--- openblas-0.1a2.2git.old/kernel/x86/xdot.S 2011-08-07
02:49:50.000000000 -0400
+++ openblas-0.1a2.2git.new/kernel/x86/xdot.S 2011-09-11
01:05:54.897543384 -0400
@@ -307,7 +307,11 @@
popl %ebx
popl %esi
popl %edi
+#if defined(F_INTERFACE) && defined(RETURN_BY_STACK)

  • ret $4
    +#else
    ret
    +#endif
    ALIGN_3

.L88:
@@ -326,6 +330,10 @@
popl %ebx
popl %esi
popl %edi
+#if defined(F_INTERFACE) && defined(RETURN_BY_STACK)

  • ret $4
    +#else
    ret
    +#endif

EPILOGUE
--- openblas-0.1a2.2git.old/kernel/x86/zdot_sse2.S 2011-08-07
02:49:50.000000000 -0400
+++ openblas-0.1a2.2git.new/kernel/x86/zdot_sse2.S 2011-09-11
01:01:06.122841604 -0400
@@ -1541,8 +1541,9 @@
popl %ebx
popl %esi
popl %edi
-/remove the hidden return value address from the stack./

  • popl %ecx
  • xchgl %ecx, 0(%esp)
  • ret
    +#if defined(DOUBLE) || defined(XDOUBLE)
  • ret $0x4
    +#else
  • ret
    +#endif
    EPILOGUE

thanks much for your time.
John

Reply to this email directly or view it on GitHub:
#58

@xianyi
Copy link
Collaborator

xianyi commented Sep 12, 2011

Hi John,
I think this is the same bug as Issue #32. (#32)

However, I only fixed it on kernel/x86/zdot_sse2.S,

/remove the hidden return value address from the stack./
popl %ecx
xchgl %ecx, 0(%esp)
ret

Xianyi

@ghost ghost assigned xianyi Sep 12, 2011
@jpsinthemix
Copy link
Author

Yea, I agree. I'm curious (just for my own education) , for the correction in kernel/x86/zdot_sse2.S, why not use
'ret $0x4' instead of the two statements 'popl %ecx'; 'xchgl %ecx, 0(%esp)' ?

John

@jpsinthemix
Copy link
Author

Oops, clicked 'close issue' by accident...

@xianyi xianyi closed this as completed in 7b410b7 Sep 14, 2011
@xianyi
Copy link
Collaborator

xianyi commented Sep 14, 2011

Hi John,

I don't know "ret" can accept immediately number to adjust the stack. Thus, I used your patch in the library.

Thank you for this patch.

Xianyi

martin-frbg added a commit that referenced this issue May 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants