[cmake][arm64] compilation is broken #1908

Closed
ayounes-nviso opened this issue Dec 7, 2018 · 26 comments · Fixed by #1930

@ayounes-nviso

with dynamic arch

FAILED: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc  -I/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src -I. -fPIC -DDYNAMIC_ARCH -DNO_LAPACK -DNO_LAPACKE -DMAX_CPU_NUMBER=64 -DMAX_PARALLEL_NUMBER=1 -DNO_AFFINITY -DVERSION="\"0.3.5.dev\"" -O3 -DNDEBUG -fPIC -MD -MT driver/others/CMakeFiles/driver_others.dir/dynamic.c.o -MF driver/others/CMakeFiles/driver_others.dir/dynamic.c.o.d -o driver/others/CMakeFiles/driver_others.dir/dynamic.c.o   -c /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/driver/others/dynamic.c
/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/driver/others/dynamic.c: In function 'support_avx':
/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/driver/others/dynamic.c:294:3: warning: implicit declaration of function 'cpuid' [-Wimplicit-function-declaration]
   cpuid(1, &eax, &ebx, &ecx, &edx);
   ^~~~~
/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/driver/others/dynamic.c:284:3: error: impossible constraint in 'asm'
   __asm__ __volatile__
   ^~~~~~~
ninja: build stopped: subcommand failed.
ninja: build stopped: subcommand failed.

With CMAKE_SYSTEM_PROCESSOR = aarch64:

FAILED: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc  -I/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src -I. -fPIC -DNO_LAPACK -DNO_LAPACKE -DMAX_CPU_NUMBER=64 -DMAX_PARALLEL_NUMBER=1 -DNO_AFFINITY -DVERSION="\"0.3.5.dev\"" -O3 -DNDEBUG -fPIC -MD -MT kernel/CMakeFiles/kernel.dir/CMakeFiles/dgemm_kernel.S.o -MF kernel/CMakeFiles/kernel.dir/CMakeFiles/dgemm_kernel.S.o.d -o kernel/CMakeFiles/kernel.dir/CMakeFiles/dgemm_kernel.S.o -c kernel/CMakeFiles/dgemm_kernel.S
kernel/CMakeFiles/dgemm_kernel.S:8:97: fatal error: /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S: No such file or directory
 #include "/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S"
                                                                                                 ^
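
The first log shows `dynamic.c`, whose `support_avx` function calls `cpuid` through inline assembly, being compiled by an AArch64 compiler. The "impossible constraint in 'asm'" error is consistent with x86 register constraints being rejected on that target. As a minimal sketch (not the OpenBLAS code; `has_avx` is a hypothetical helper, and the check is simplified to the CPUID feature bit only), x86-only probing can be fenced off with the compiler's predefined architecture macros so that the same file still compiles elsewhere:

```c
/* Hypothetical helper, not taken from OpenBLAS: reports the AVX feature bit
   on x86 and compiles to a stub on every other architecture. */
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */

int has_avx(void) {
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    /* CPUID leaf 1, ECX bit 28 is the AVX feature flag. This sketch does
       not check whether the OS has enabled AVX state saving. */
    return (ecx & (1u << 28)) != 0;
}
#else
int has_avx(void) {
    return 0;   /* no CPUID outside x86, so never report AVX */
}
#endif
```

The fixes discussed later in this thread were made in the cmake files; the guard above only illustrates why an x86-specific source cannot be built for arm64 as-is.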

@martin-frbg
Collaborator

Is this with 0.3.4 or current develop? This looks like it could be related to what I fixed for ARMV8 in #1870 just yesterday (cmake stumbling over alternative kernels for osx, as our Makefile parser does not handle gmake conditionals; on the other hand, this may have exposed a new bug, as it is now no longer the generic C kernels that "win").

@ayounes-nviso
Author

This is with latest develop: #ff3eb1d47401f
I tried because I've noticed your ARMV8 fixes indeed.

Is there any plan to officially switch to cmake at some point? Otherwise we may switch to autoconf, even though cmake support is much easier to integrate into our cross-platform framework.

Thanks a lot!

@martin-frbg
Collaborator

martin-frbg commented Dec 7, 2018

Too bad. Somehow the DYNAMIC_ARCH code appears to assume GEMM_UNROLL_M and _N to be 2, when provisions are only made for 4x4, 4x8 or 8x4. I do not know who could proclaim any official plans at this point, but myself I am not inclined to switch to cmake as the default anytime soon.

I do wonder if it would be possible to work around this problem with a simple kludge - provide a "dgemm_kernel_2x2.S" in kernel/arm64 that has only #include "../generic/gemm_kernel_2x2.c" ?

@martin-frbg
Collaborator

BTW there appears to be nothing in the "ARM dynamic arch" PR #1829 that sets GEMM_UNROLL_M and GEMM_UNROLL_N; perhaps this is just some older fallback in the cmake files that gets triggered now.

@martin-frbg
Collaborator

martin-frbg commented Dec 7, 2018

A quick glance found nothing in the cmake code either, and the only parameter set in param.h that includes 2x2 seems to be thunderx, which has its own KERNEL file that uses the generic C kernel for dgemm.

(Unless you happen to be doing your cross-compiling on an OSX box, that is? In that case yesterday's PR would be at fault, as I actually considered breaking that unique combination the lesser evil compared to breaking cmake/ARMV8 for everybody else. And I have no idea how to make the parser in cmake/utils.cmake understand Makefile conditionals, let alone nested ones.)
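
To make the parser limitation concrete: a reader that only recognises `NAME = value` lines will see the assignments in every branch of a conditional, and whichever comes last overwrites the others. The following is a simplified sketch under that assumption, not the code in cmake/utils.cmake; `naive_lookup` and the sample file names used with it are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parser that only understands "NAME = value".
   Lines such as ifeq/else/endif contain no '=' and are skipped, so the
   assignments from all branches are read and the last one wins.
   Returns 1 if NAME was assigned at least once, 0 otherwise.
   outlen must be at least 1. */
int naive_lookup(const char *text, const char *name, char *out, size_t outlen) {
    int found = 0;
    size_t name_len = strlen(name);
    const char *line = text;

    while (*line) {
        const char *nl = strchr(line, '\n');
        size_t len = nl ? (size_t)(nl - line) : strlen(line);
        const char *line_end = line + len;
        const char *eq = memchr(line, '=', len);

        if (eq) {
            /* Trim blanks around the variable name. */
            const char *p = line;
            const char *name_end = eq;
            while (p < eq && (*p == ' ' || *p == '\t')) p++;
            while (name_end > p && (name_end[-1] == ' ' || name_end[-1] == '\t'))
                name_end--;

            if ((size_t)(name_end - p) == name_len && memcmp(p, name, name_len) == 0) {
                /* Copy the value, truncating if it does not fit. */
                const char *val = eq + 1;
                size_t vlen;
                while (val < line_end && (*val == ' ' || *val == '\t')) val++;
                vlen = (size_t)(line_end - val);
                if (vlen >= outlen) vlen = outlen - 1;
                memcpy(out, val, vlen);
                out[vlen] = '\0';
                found = 1;   /* keep scanning: a later assignment overrides */
            }
        }
        line = nl ? nl + 1 : line_end;
    }
    return found;
}
```

Given an input where `DGEMMKERNEL` is assigned once inside an `ifeq` branch and once in the `else` branch, this function returns the `else` value regardless of the condition, which is the kind of behaviour the comment above describes.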

@ayounes-nviso
Author

I am building within a Ubuntu 16.04 docker with a Linaro toolchain:
gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz

@martin-frbg
Collaborator

Hmm. That should certainly not trigger the OS_DARWIN conditional.

@martin-frbg
Collaborator

That seems to leave ThunderX as the only core for which 2x2 unrolling should be defined (but as I stated above, that one should use the definitions from its own KERNEL.THUNDERX file). Can you see from the log which kernel it is trying to build at that time?

@ayounes-nviso
Author

ayounes-nviso commented Dec 7, 2018

With dynamic arch:

-- Targeting the ARMV8 architecture.
-- GEMM multithread threshold set to 4.
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision

With CMAKE_SYSTEM_PROCESSOR = aarch64

-- Targeting the ARMV8 architecture.

-- GEMM multithread threshold set to 4.
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.ARMV8...

@ayounes-nviso
Author

ayounes-nviso commented Dec 7, 2018

With commit #2b355592e34b07f4d0c5f81c275c902c0578236d I got this:

dynamic arch now compiles (as a static library) but I have some missing symbols at link stage of my app:

ext/ai-app/proc_lpdnn/libproc_lpdnn.so: undefined reference to `gotoblas_dynamic_quit'
ext/ai-app/proc_lpdnn/libproc_lpdnn.so: undefined reference to `gotoblas_dynamic_init'

with specific arch I still have the same compilation error.

@martin-frbg
Collaborator

Looks like arch.cmake is not yet ready for arm64 dynamic_arch as well.

martin-frbg added a commit that referenced this issue Dec 7, 2018
@martin-frbg
Collaborator

martin-frbg commented Dec 7, 2018

Sorry, I had mis-spelled "SOURcES" in the CMakeLists.txt earlier (coding on the phone I use for arm64 testing). DYNAMIC_ARCH build should work now with both fixes.
Not sure if autodetection is expected to work when cross-compiling with CMAKE_SYSTEM_PROCESSOR set to aarch64 (and actually arm64 might be a better choice here), but TARGET=ARMV8 should work I think.

@ayounes-nviso
Author

ayounes-nviso commented Dec 10, 2018

Hello, I am sorry but this may have uncovered another bug... I get this cmake error for every architecture (ARMV8, CORTEXA53, etc.):

CMake Error at cmake/prebuild.cmake:166 (file):
  file RENAME failed to rename

    /home/ayounes/openblas_project-build/config_kernel.h.tmp

  to

    /home/ayounes/openblas_project-build/kernel_config/ARMV8/config_kernel.h

  because: No such file or directory

Call Stack (most recent call first):
  cmake/system.cmake:128 (include)
  kernel/CMakeLists.txt:10 (include)
  kernel/CMakeLists.txt:537 (build_core)

@ayounes-nviso
Author

This should fix it:

diff --git a/cmake/prebuild.cmake b/cmake/prebuild.cmake
index f29bc3a7..d9199bec 100644
--- a/cmake/prebuild.cmake
+++ b/cmake/prebuild.cmake
@@ -163,6 +163,7 @@ if (DEFINED CORE AND CMAKE_CROSSCOMPILING AND NOT (${HOST_OS} STREQUAL "WINDOWSS
   file(APPEND ${TARGET_CONF_TEMP}
     "#define GEMM_MULTITHREAD_THRESHOLD\t${GEMM_MULTITHREAD_THRESHOLD}\n")
   # Move to where gen_config_h would place it
+  file(MAKE_DIRECTORY ${TARGET_CONF_DIR})
   file(RENAME ${TARGET_CONF_TEMP} "${TARGET_CONF_DIR}/${TARGET_CONF}")  
 
 else(NOT CMAKE_CROSSCOMPILING)
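
The "No such file or directory" in the cmake error matches what a rename reports when the destination's parent directory is missing, which is why creating the directory first resolves it. The same ordering can be sketched with the POSIX calls directly. This is an illustration only, not part of the patch: `move_into_dir` is a hypothetical helper, and unlike cmake's `file(MAKE_DIRECTORY)` it creates a single directory level rather than any missing parents.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: move src to dst_dir/name, creating dst_dir first.
   Only one directory level is created. Returns 0 on success, -1 on error. */
int move_into_dir(const char *src, const char *dst_dir, const char *name) {
    char dst[4096];
    int n = snprintf(dst, sizeof dst, "%s/%s", dst_dir, name);
    if (n < 0 || (size_t)n >= sizeof dst)
        return -1;                      /* path too long for the buffer */

    /* An already existing directory is fine; any other failure is not. */
    if (mkdir(dst_dir, 0755) != 0 && errno != EEXIST)
        return -1;

    /* Without the mkdir above, rename() fails with ENOENT when dst_dir
       does not exist. */
    return rename(src, dst) == 0 ? 0 : -1;
}
```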

@ayounes-nviso
Author

The compilation then goes on and fails at CORTEXA72:

[1346/4436] Building ASM object kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o
FAILED: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc  -I/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src -Ikernel_config/CORTEXA72 -fPIC -DDYNAMIC_ARCH -DNO_LAPACK -DNO_LAPACKE -DMAX_CPU_NUMBER=64 -DMAX_PARALLEL_NUMBER=1 -DNO_AFFINITY -DVERSION="\"0.3.5.dev\"" -O3 -DNDEBUG -fPIC   -DBUILD_KERNEL -DTABLE_NAME=gotoblas_CORTEXA72 -DTS=_CORTEXA72 -MD -MT kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o -MF kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o.d -o kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o -c kernel/CMakeFiles/dgemm_kernel_CORTEXA72.S
kernel/CMakeFiles/dgemm_kernel_CORTEXA72.S:8:97: fatal error: /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S: No such file or directory
 #include "/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S"

@martin-frbg
Collaborator

Not sure I understand why TARGET_CONF_DIR would not exist... The dynamic_arch compilation passes for me so it is probably something about cross-compilation not picking suitable defaults.

@martin-frbg
Collaborator

This should hopefully be fixed now - when the ARMV8 targets were rearranged a few weeks ago, the specifics of the now separate CortexA7x targets had not been added to the part of prebuild.cmake that handles cross-compilation.

@ayounes-nviso
Author

I am sorry but, while the TARGET_CONF_DIR issue is fixed, I still have the compilation error:

FAILED: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc -I/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src -Ikernel_config/CORTEXA72 -fPIC -DDYNAMIC_ARCH -DNO_LAPACK -DNO_LAPACKE -DMAX_CPU_NUMBER=64 -DMAX_PARALLEL_NUMBER=1 -DNO_AFFINITY -DVERSION=""0.3.6.dev"" -O3 -DNDEBUG -fPIC -DBUILD_KERNEL -DTABLE_NAME=gotoblas_CORTEXA72 -DTS=_CORTEXA72 -MD -MT kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o -MF kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o.d -o kernel/CMakeFiles/kernel_CORTEXA72.dir/CMakeFiles/dgemm_kernel_CORTEXA72.S.o -c kernel/CMakeFiles/dgemm_kernel_CORTEXA72.S
kernel/CMakeFiles/dgemm_kernel_CORTEXA72.S:8:97: fatal error: /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S: No such file or directory
#include "/home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/dgemm_kernel_2x2.S"

@martin-frbg
Collaborator

Hmm. That is not supposed to happen after #1930 (CortexA72 should get DGEMM_UNROLL_M 8, DGEMM_UNROLL_N 4 so should go looking for dgemm_kernel_8x4.S). I now see that my patch as committed had a few quotes missing on target names following A73, which may have led to unexpected behaviour. Will fix this in a few minutes.
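
The relationship described here, where the unroll factors decide which kernel source is looked up, can be sketched as follows. This assumes the file name is formed as `dgemm_kernel_<M>x<N>.S`, as the names in this thread suggest; `dgemm_kernel_name` is a hypothetical helper and not part of OpenBLAS.

```c
#include <stdio.h>

/* Hypothetical helper: write "dgemm_kernel_<M>x<N>.S" into buf.
   Returns 0 on success, -1 if the name does not fit or formatting fails. */
int dgemm_kernel_name(char *buf, size_t len, int unroll_m, int unroll_n) {
    int n = snprintf(buf, len, "dgemm_kernel_%dx%d.S", unroll_m, unroll_n);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With unroll factors 8 and 4 this yields `dgemm_kernel_8x4.S`, while a fallback to 2 and 2 yields `dgemm_kernel_2x2.S`, the file the failing build reports as missing in kernel/arm64.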

martin-frbg reopened this Jan 2, 2019
@ayounes-nviso
Author

Just tried latest commit, still the same compilation error, sorry :-(

@martin-frbg
Collaborator

I must be testing something other than your actual problem then, sorry. Which arguments are you giving to cmake?

@ayounes-nviso
Author

ayounes-nviso commented Jan 3, 2019

Here are my cmake options:

    -DDYNAMIC_ARCH=ON 
    -DUSE_THREAD=OFF
    -DNUM_THREADS=64
    -DBUILD_WITHOUT_LAPACK=ON
    -DBUILD_RELAPACK=OFF
    -DNOFORTRAN=ON
    -DBUILD_SHARED_LIBS=OFF
    -DCMAKE_INSTALL_PREFIX=${CMAKE_BINARY_DIR}/deps
    -DCMAKE_INSTALL_MESSAGE=LAZY
    -DCMAKE_BUILD_TYPE=RELEASE
    -DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=true
    -DCMAKE_TOOLCHAIN_FILE:STRING=${TOOLCHAIN_FILE}

And here is my toolchain file:

set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_C_COMPILER "aarch64-linux-gnu-gcc")
set(CMAKE_CXX_COMPILER "aarch64-linux-gnu-g++")

I can also provide a docker file if you want to try that by yourself. The docker file is using this compiler:

https://releases.linaro.org/components/toolchain/binaries/6.4-2018.05/aarch64-linux-gnu/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz

@brada4
Contributor

brada4 commented Jan 3, 2019

Could you share the cmake version, the cmake logs around the failure, and the emitted Makefile leading to it?
It is a bit of overkill to reverse-engineer your build env if you perfectly know how you put it together.

@ayounes-nviso
Author

cmake-3.12.3
Ninja generator (with -G Ninja, sorry I forgot to mention that)
cmake log:

-- The C compiler identification is GNU 6.4.1
-- The ASM compiler identification is GNU
-- Found assembler: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc
-- Check for working C compiler: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc
-- Check for working C compiler: /opt/gcc-linaro-6.4.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
CMake Warning at CMakeLists.txt:46 (message):
  CMake support is experimental.  It does not yet support all build options
  and may not produce the same Makefiles that OpenBLAS ships with.


-- Targeting the ARMV8 architecture.
-- GEMM multithread threshold set to 4.
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- Targeting the ARMV8 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.ARMV8...
-- Targeting the CORTEXA53 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA53...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.ARMV8...
-- Targeting the CORTEXA57 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA57...
-- Targeting the CORTEXA72 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA72...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA57...
-- Targeting the CORTEXA73 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA73...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA57...
-- Targeting the FALKOR architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.FALKOR...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.CORTEXA57...
-- Targeting the THUNDERX architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.THUNDERX...
-- Targeting the THUNDERX2T99 architecture.
-- GEMM multithread threshold set to 4.
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL...
-- Reading vars from /home/ayounes/devel/nv3dfi/deps-base/pkg-openblas/src/kernel/arm64/KERNEL.THUNDERX2T99...
ARMV8
CORTEXA53
CORTEXA57
CORTEXA72
CORTEXA73
FALKOR
THUNDERX
THUNDERX2T99
-- Generating openblas_config.h in include/openblas
-- Generating cblas.h in include/openblas
-- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE) 
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ayounes/devel/nv3dfi-build/linux_target/src/openblas_project-build

Here are the generated files (with an additional .txt extension so that they can be uploaded here):

build.ninja.txt
CMakeCache.txt
config.h.txt
openblas_config.h.txt
rules.ninja.txt
CMakeOutput.log
TargetDirectories.txt

@martin-frbg
Collaborator

I have reproduced the problem and am working on a fix. Not sure why I did not see this earlier.

@ayounes-nviso
Author

Fixed in #1946, thanks!
