Skip to content

OpenBLAS built with USE_OPENMP=1 for windows 64 bit #1989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
thedude2019 opened this issue Jan 28, 2019 · 62 comments
Closed

OpenBLAS built with USE_OPENMP=1 for windows 64 bit #1989

thedude2019 opened this issue Jan 28, 2019 · 62 comments

Comments

@thedude2019
Copy link

Dear OpenBLAS team,

Can you please send me OpenBLAS latest release built with USE_OPENMP=1 for windows 64 bit (Intel Haswell processor)? I don't have a lot of experience building for windows and was unsuccessful...the standard release on your site works but I want to work with OpenBLAS & OpenMP.

gcc version: gcc (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0.
gfortran version: GNU Fortran (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0.

Thank you!

@brada4
Copy link
Contributor

brada4 commented Jan 29, 2019

It is just adding USE_OPENMP=1 to make command line. It will use gomp , not native omp, so other programs need to use gomp

@thedude2019
Copy link
Author

Hi! maybe you can explain what I'm doing wrong...I did the following:

  1. Installed MSYS-1.0.11.
  2. Installed msysDTK-1.0.1.
  3. Downloaded OpenBLAS Windows x86/x86_64 latest binary package: OpenBLAS 0.3.5 version.
  4. Extracted the binary and I get the folder: xianyi-OpenBLAS-eebc189.
  5. Opened MSYS and move to OpenBLAS folder with: $ cd /d/OpenBLAS/xianyi-OpenBLAS-eebc189.
  6. When I build with $ make command I receive the following error:
    $ make Can't locate object method "new" via package "File::Temp" (perhaps you forgot to load "File::Temp"?) at ./c_check line 15. make: *** [config.h] Error 9 Makefile.system:201: Makefile.conf: No such file or directory C:\msys\1.0\bin\make.exe: *** couldn't commit memory for cygwin heap, Win32 erro r 0

What am I doing wrong?

Thank you!

@brada4
Copy link
Contributor

brada4 commented Jan 29, 2019

You do everything correctly.
Your perl is broken/incomplete. It would generate makefile.conf that is found missing one step later.
If you have some strawbery or activestate windows perl installed and working - try to make it appear before mingw/cygwin perl in path

@brada4
Copy link
Contributor

brada4 commented Jan 29, 2019

The essence of problem is tat perl.exe invoked from make.exe finds different versions of core perl modules because of PERL5LIB vatiable pointing to other perl installation and justly rejects to load those.

@thedude2019
Copy link
Author

What is the easiest way to resolve this issue? don't understand what you mean when you say "try to make it appear before mingw/cygwin perl in path".

@thedude2019
Copy link
Author

This is my PATH (printed with cmd):
PATH=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32 \WindowsPowerShell\v1.0;C:\Program Files\Microsoft Windows Performance Toolkit\ ;C:\Program Files\MiKTeX 2.9\miktex\bin\x64;C:\Program Files (x86)\Calibre2;C: \Program Files (x86)\Notepad++;C:\Windows\System32\WindowsPowerShell\v1.0;C:\W indows\System32\WindowsPowerShell\v1.0;C:\Program Files (x86)\NVIDIA Corporatio n\PhysX\Common;C:\Program Files\Microsoft VS Code\bin;C:\Program Files\mingw-w64 \x86_64-8.1.0-posix-seh-rt_v6-rev0\mingw64\bin;C:\Windows\System32\WindowsPowerS hell\v1.0;C:\msys\1.0\bin;C:\Program Files\mingw-w64\x86_64-8.1.0-posix-seh-rt_ v6-rev0\mingw64\bin;C:\msys\1.0\bin

C:\msys\1.0\bin and C:\Program Files\mingw-w64\x86_64-8.1.0-posix-seh-rt_ v6-rev0\mingw64\bin appear twice...C:\Windows\System32\WindowsPowerShell\v1.0 appears three times...

@brada4
Copy link
Contributor

brada4 commented Jan 30, 2019

What is inside PERL5LIB variable?
@martin-frbg what do you think about reverting partially d1c6469#diff-7cf3cca9d0a67dfeb94e7187f4adba11R4 and permitting script to work with typical broken PERL installations?

@brada4
Copy link
Contributor

brada4 commented Jan 30, 2019

What was done for sf.net libraries:
make HOSTCC=gcc CC=x86_64-w64-mingw32-gcc FC=x86_64-w64-mingw32-gfortran DYNAMIC_ARCH=1 NUM_THREADS=64
Inside Linux virtual machine
Fedora 28/29 or Ubuntu 18.04 will provide these compilers with AVX512 support.
You will need few redist DLLs copied off the building Linux installation
Shortcoming is that DLL will use GOMP provided on that Linux and it will certainly fail to work with very different gcc native to windows.

@martin-frbg
Copy link
Collaborator

@brada4 reverting that is not so easy now that the temporary file is also needed on x86_64 (AVX512 compatibility test). I will check later if conditional loading is possible (tried it once and did not get it to work, but my perl is a bit rusty), but I suspect File::Basename (used in the cross-compile check) will be equally problematic when PERL5LIB points to a conflicting or incomplete module collection.

@brada4
Copy link
Contributor

brada4 commented Jan 30, 2019

@martin-frbg I know it. Though no idea how to detect problem "safely"

@thedude2019
Copy link
Author

What is the MSYS command to print the PERL5LIB variable? this is my perl include path:

$ perl -e "print qq(@inc)"
/usr/lib/perl5/5.6.1/msys /usr/lib/perl5/5.6.1 /usr/lib/perl5/site_perl/5.6.1/ms
ys /usr/lib/perl5/site_perl/5.6.1 /usr/lib/perl5/site_perl .

Thank you!

@thedude2019
Copy link
Author

If I clear the user PATH variable and put C:\msys\1.0\bin before C:\Program Files\mingw-w64\x86_64-8.1.0-posix-seh-rt_ v6-rev0\mingw64\bin in Path system variable I get the same error...my Path system variable is now:

PATH=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\msys\1.0\bin;C:
Windows\System32\WindowsPowerShell\v1.0;C:\Program Files\Microsoft Windows Perf
ormance Toolkit;C:\Program Files\MiKTeX 2.9\miktex\bin\x64;C:\Program Files (x
86)\Calibre2;C:\Program Files (x86)\Notepad++;C:\Program Files (x86)\NVIDIA Co
rporation\PhysX\Common;C:\Program Files\Microsoft VS Code\bin;C:\Program Files\m
ingw-w64\x86_64-8.1.0-posix-seh-rt_v6-rev0\mingw64\bin;

@martin-frbg
Copy link
Collaborator

  1. What does perl -v return - is it 5.6.1 ? (I think we also saw a few cases where PERL5DIR was set correctly for the perl version in the PATH, but the File::Temp module (File/Temp.pm) was not installed).

  2. Can you try with the attached version of the c_check script (needs renaming back to c_check without the .txt ending - github will not let me attach it unless it is named .txt)
    c_check.txt

@thedude2019
Copy link
Author

  1. this is the output of perl -v :
    $ perl -v

This is perl, v5.6.1 built for msys

Copyright 1987-2001, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using man perl' or perldoc perl'. If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.

  1. I renamed the file to c_check, put in on the desktop, moved to desktop with cd /c/Users/Zev/Desktop and this is the output:
    $ c_check
    C Compiler () is something wrong.
    1 at ./c_check line 29.

Thank you!

@martin-frbg
Copy link
Collaborator

martin-frbg commented Jan 30, 2019

Sorry, I guess I should have been more verbose - the idea with the c_check file is to replace the c_check file in the OpenBLAS folder with it and retry the build. (c_check is called by one of the Makefiles and expects a number of arguments, including the name of the C compiler, so running it on its own is not really expected to work)
What I did in the new file is try to work around any problems with wrong or unavailable perl modules)

@thedude2019
Copy link
Author

Replaced the c_check file and this is the new error with make:
$ make
Can't locate object method "new" via package "File::Temp" (perhaps you forgot to
load "File::Temp"?) at ./c_check line 231.
make: *** [config.h] Error 2
Makefile.system:201: Makefile.conf: No such file or directory
C:\msys\1.0\bin\make.exe: *** couldn't commit memory for cygwin heap, Win32 erro
r 0

@martin-frbg
Copy link
Collaborator

Weird. Apparently it managed to load the File::Temp module, but for some reason it does not work.

@martin-frbg
Copy link
Collaborator

martin-frbg commented Jan 30, 2019

Disabled the function that needs the File::Temp module on Cygwin (this is "only" needed on Skylake X
hardware to ensure that the compiler can handle AVX512 commands)...
c_check.txt
(On second thought no longer sure that your MSYS setup identifies as Cygwin, but hopefully it does...)

@thedude2019
Copy link
Author

New error:
$ make
Can't locate object method "new" via package "File::Temp" (perhaps you forgot to
load "File::Temp"?) at ./c_check line 232.
make: *** [config.h] Error 2
Makefile.system:201: Makefile.conf: No such file or directory
C:\msys\1.0\bin\make.exe: *** couldn't commit memory for cygwin heap, Win32 erro
r 0

@martin-frbg
Copy link
Collaborator

Hm. Same error, just pushed down one line by the added "if". My mistake, should have tested for OS_CYGWIN_NT rather than just CYGWIN_NT...
c_check.txt

@thedude2019
Copy link
Author

New error (I think it is the same error):
$ make
Can't locate object method "new" via package "File::Temp" (perhaps you forgot to
load "File::Temp"?) at ./c_check line 232.
make: *** [config.h] Error 2
Makefile.system:201: Makefile.conf: No such file or directory
C:\msys\1.0\bin\make.exe: *** couldn't commit memory for cygwin heap, Win32 erro
r 0

@martin-frbg
Copy link
Collaborator

martin-frbg commented Jan 30, 2019

Disabling the test on "OS_WINNT" (anything Microsoft) now
c_check.txt

@martin-frbg
Copy link
Collaborator

Wait, should have been "WiNNT" in the code...
c_check.txt

@thedude2019
Copy link
Author

No error, finally it is doing something :) if it builds with no errors should I try $ make USE_OPENMP=1?

Thank you!

@martin-frbg
Copy link
Collaborator

Yes, but please do make clean before you try that.

@thedude2019
Copy link
Author

It just finished building, this is the final lines of the output:
OpenBLAS build complete. (BLAS CBLAS LAPACK LAPACKE)

OS ... WINNT
Architecture ... x86_64
BINARY ... 64bit
C compiler ... GCC (command line : gcc)
Fortran compiler ... GFORTRAN (command line : gfortran)
Library Name ... libopenblas_haswellp-r0.3.5.a (Multi threaded; Max num-th
reads is 4)

To install the library, you can run "make PREFIX=/path/to/your/installation inst
all".

Trying again with $ make clean and then $ make USE_OPENMP=1...

@thedude2019
Copy link
Author

Finished building with $ make USE_OPENMP=1, these are the final lines of the output:
OpenBLAS build complete. (BLAS CBLAS LAPACK LAPACKE)

OS ... WINNT
Architecture ... x86_64
BINARY ... 64bit
C compiler ... GCC (command line : gcc)
Fortran compiler ... GFORTRAN (command line : gfortran)
Library Name ... libopenblas_haswellp-r0.3.5.a (Multi threaded; Max num-th
reads is 4)

Use OpenMP in the multithreading. Because of ignoring OPENBLAS_NUM_THREADS and
GOTO_NUM_THREADS flags,
you should use OMP_NUM_THREADS environment variable to control the number of th
reads.

To install the library, you can run "make PREFIX=/path/to/your/installation inst
all".

When I compiled it said that it did not find cblas_mangling.h (this file was not generated with the build, it only generated lapacke_mangling.h, lapacke_mangling_with_flags.h.in & cblas_mangling_with_flags.h.in). I fixed the problem by renaming cblas_mangling_with_flags.h.in to cblas_mangling.h (added cblas_mangling.h to include folder). Everything works now...compiled with:
gcc -L "C:...\lib" -I "C:...\include" "C:...\z1.c" -Ofast -fopenmp -lopenblas -lpthread -lgfortran -o "C:...\z1.exe"

The academy is happy to announce that you have just won the Windows Golden Build award :)

Final question, is OpenBLAS now single threaded with USE_OPENMP=1 build? goto_set_num_threads() & openblas_set_num_threads() are disabled?

Thank you for your help!!!

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2019

The thread management functions are still there, just that one swallows parameter and other returns number one, so that libs are drop-in replacement for eachother.
For simple usage, considering future, you probably do not want to call those functions anyway and at least try MKL and ATLAS instead.

@martin-frbg
Copy link
Collaborator

@thedude2019 the USE_OPENMP=1 just tells it to use OpenMP for thread management, it will still be multithreaded (with the number of threads autodetected from the number of cores in the system you built it on). Only the environment variable for setting the number of threads has changed.
@brada4 I am not sure I understand your suggestion ?

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2019

@martin-frbg suggestion is to stick to generic BLAS API-s only.

@brada4
Copy link
Contributor

brada4 commented Feb 4, 2019

Test case is sufficient to try to profile on Linux either way.
At least BLAS part unless buggy should be 10x or so faster than reference code.
Check 2nd graph from end of page here , regression could be in 2nd column from the right as much as OpenBLAS can influence it.

@thedude2019
Copy link
Author

Can you run a test with your Linux system? I only have a windows system...with the default settings of the latest BLAS & OpenBLAS releases I get a better result with BLAS, LAPACK, & LAPACKE (with OpenMP). I will run another test with openblas_set_num_threads(1) & num_threads(4) with the OpenBLAS USE_OPENMP=1 build.

Thank you!

@brada4
Copy link
Contributor

brada4 commented Feb 4, 2019

2 variables just a tip how you may speed it up, sure i will try your test case

@thedude2019
Copy link
Author

When I add openblas_set_num_threads(1) and compile with:
gcc -L "C:...\xianyi-OpenBLAS-eebc189\lib" -I "C:...\xianyi-OpenBLAS-eebc189\include" "C:...\z1.c" -Ofast -fopenmp -lopenblas -lpthread -lgfortran -o "C:...\z1.exe"
I get the following warning:
warning: implicit declaration of function 'openblas_set_num_threads'; did you mean 'omp_set_num_threads'? [-Wimplicit-function-declaration]
openblas_set_num_threads(1);
^~~~~~~~~~~~~~~~~~~~~~~~
omp_set_num_threads
Any idea what is the warning?

Thank you!

@brada4
Copy link
Contributor

brada4 commented Feb 5, 2019

That is defined in (openblas header) cblas.h, since you do not include the file, I suggested to use external variables.

@brada4
Copy link
Contributor

brada4 commented Feb 5, 2019

computation is dominated (60-70%) by dsymv, there is no difference if blas or caller is threaded or not. About 1% thread management overhead adds if threaded without significant change in total run time.

@thedude2019
Copy link
Author

If I add the header cblas.h to the code I get the same warning...I'm attaching the cblas.h of your latest release built with USE_OPENMP=1 (OpenBLAS 0.3.5).
cblas.txt

So what your saying is that because dsymv is a LAPACK function and OpenBLAS is a BLAS optimization I don't see better results?

Thank you!

@thedude2019
Copy link
Author

I think I understand what you mean, LAPACK calls BLAS (or OpenBLAS) but dsymv is still the bottleneck of the computation.

@brada4
Copy link
Contributor

brada4 commented Feb 5, 2019

... _symv which does not do much computation and essentially bottlenecks against memory bandwidth on modern machines....

@thedude2019
Copy link
Author

I've replaced LAPACKE_dsyev with LAPACKE_dsyev_2stage (doesn't have dsymv) and it finished in 619.5 s instead of 2361.8 s (OpenBLAS with OpenMP). will compare the three again...have you looked at the cblas header of the latest release?

@martin-frbg
Copy link
Collaborator

martin-frbg commented Feb 5, 2019

What you uploaded appears to be the cblas.h of the reference implementation, not the one that comes with OpenBLAS ? (The OpenBLAS source contains both, as it incorporates all of "lapack-netlib" although it does not build or use the BLAS or CBLAS parts - the one that gets installed should be from the toplevel folder, not from lapack-netlib/CBLAS/include)

@thedude2019
Copy link
Author

Yes, replaced the build cblas.h with the package cblas.h and also added all the header files from the package to the include folder and I get a very long error output...attached the error.

error.txt

Thank you!

@martin-frbg
Copy link
Collaborator

and also added all the header files from the package to the include folder

That is your problem right there - most of these headers are internal and not meant to be included from other programs. Only cblas.h, f77blas.h, lapacke.h, lapacke_config.h, lapacke_mangling.h, lapacke_utils.h and openblas_config.h should get installed.

@thedude2019
Copy link
Author

After I renamed openblas_config_template.h to openblas_config.h (it also compiles after renaming config.h to openblas_config.h) it compiles without errors with the files you mentioned. I also had to add cblas_mangling.h to the include folder as mentioned previously.

Which file should I rename to openblas_config.h, openblas_config_template.h or config.h?

Thank you!

@martin-frbg
Copy link
Collaborator

Normally that would happen in the install step. openblas_config.h combines both config.h and openblas_config_template.h (with a version string added and a header guard against double inclusion around it, i.e.

#ifndef OPENBLAS_CONFIG_H
#define OPENBLAS_CONFIG_H
#define OPENBLAS_VERSION "OpenBLAS 0.3.5"
<contents of config.h>
<contents of openblas_config_template.h>
#endif

@martin-frbg
Copy link
Collaborator

And the cblas_mangling.h is only required by the (unused) netlib version of cblas.h, not by the OpenBLAS cblas.h so you would not normally need to copy it to the include folder.

@thedude2019
Copy link
Author

I made a mistake...I can't compile with the package cblas.h, I can only compile with lapack-netlib cblas.h & cblas_mangling.h...the build does not generate openblas_config.h and if I paste config.h & openblas_config_template.h in one file and rename to openblas_config.h (and compile with package cblas.h) there is still a header error. Attached config.h & openblas_config_template.h build files.
config.txt
openblas_config_template.txt

Thank you!

@thedude2019
Copy link
Author

Finished to compare LAPACKE_dsyev_2stage with the three builds:

  1. 592.8 s - OpenBLAS regular build (OpenBLAS 0.3.5).
  2. 590.3 s - OpenBLAS build with USE_OPENMP=1 (OpenBLAS 0.3.5).
  3. 1971.3 s - BLAS, LAPACK & LAPACKE (LAPACK 3.7.0).

OpenBLAS is definitely faster...is it crucial to work with the package cblas.h? results with lapack-netlib cblas.h & cblas_mangling.h (without openblas_config.h) are very good.

@martin-frbg
Copy link
Collaborator

The only relevant difference in the OpenBLAS cblas.h is the addition of the prototypes for the openblas-specific utility functions like openblas_set_num_threads(), if you do not use them your current setup should be sufficient. (Not sure why copying the two files together to create openblas_config.h did not work for you though, perhaps you did not add the OPENBLAS_VERSION line ?)

@thedude2019
Copy link
Author

Can you send me the combined openblas_config.h file so I can test? this is what I put in the include folder:
openblas_config.txt

Thank you!

@martin-frbg
Copy link
Collaborator

Ah, I forgot that the constants from the initial config.h get "OPENBLAS_" prepended when copied to openblas_config.h
openblas_config.txt

@thedude2019
Copy link
Author

I get the same error:
In file included from C:...\z1.c:9:
C:...\xianyi-OpenBLAS-eebc189\include/cblas.h:5:10: fatal error: common.h: No such file or directory
#include "common.h"
^~~~~~~~~~
compilation terminated.

@martin-frbg
Copy link
Collaborator

martin-frbg commented Feb 5, 2019

That should be openblas_config.h instead of common.h in the cblas.h (normally converted in the make install step).
BTW the f77blas.h is generated from common_interface.h
f77blas.txt

@thedude2019
Copy link
Author

Replaced f77blas.h and It gives the same error...

@martin-frbg
Copy link
Collaborator

Err, the f77blas.h was just mentioned in passing as it is the last of the header files that are normally generated rather than copied.
You need to change the "common.h" in cblas.h to "openblas_config.h" to make the error go away.

@thedude2019
Copy link
Author

Compiles without errors with your openblas_config.txt & after changing "common.h" in cblas.h to "openblas_config.h" :)

Thank you both for your help!!!

@brada4
Copy link
Contributor

brada4 commented Feb 5, 2019

I've dropped a line in general threading issue so that if there is a measurable improvement reached with _symv it can be verified in this context of deep lapack calls too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants