Target Make #1204
Comments
For question 1), please see the similar issue #1203 from yesterday. Kaby Lake CPUs (which were just starting to become available when 0.2.19 was released) are best supported by the HASWELL target.
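A rough way to make that kind of "close enough" guess automatically is to look at the CPU feature flags. The sketch below is only a heuristic and not anything OpenBLAS itself ships: it assumes the HASWELL kernels need AVX2 and FMA (both appear in the `lscpu` flags from the report) and falls back to ATOM otherwise, skipping the intermediate targets. The function names `suggest_target` and `read_cpu_flags` are made up for illustration.

```python
def suggest_target(flags, fallback="ATOM"):
    """Suggest an OpenBLAS TARGET from a whitespace-separated CPU flag string.

    Heuristic (an assumption, not OpenBLAS's own detection logic):
    CPUs advertising both AVX2 and FMA get HASWELL, everything else
    gets the conservative fallback.
    """
    present = set(flags.split())
    if {"avx2", "fma"} <= present:
        return "HASWELL"
    return fallback


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flags of the first CPU listed on Linux, or "" if unavailable."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    print(suggest_target(read_cpu_flags()))
```

On a machine where `/proc/cpuinfo` is missing (non-Linux hosts), `read_cpu_flags` returns an empty string and the suggestion degrades to the fallback target.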
What's the main difference between ATOM and NEHALEM? When would one be used over the other?
Also, when you refer to taking a git snapshot, do you mean the current "develop" version, or another branch? We need a stable branch for our software, so I'd be OK with taking "develop" as long as it's deemed "stable" (or another branch that is considered stable). Thanks
You use sagemath, genuine 0.2.19
I'm one of the developers for sagemath and I'm trying to improve the linking with regard to openblas =) Hence the questions above. We already have an implementation where a target can be specified, but I'm trying to make it so that when a target isn't specified and our attempt to make fails, we retry with a specified target. Additionally, I want to update our version of openblas if possible, but we'd prefer a stable version, hence my asking whether there is a stable version other than the master version (which hasn't been updated for a while). As to documentation... we're working on it. We're open source so it can be hard to get decent documentation, but we try our best =)
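The retry idea described above could look roughly like the following. This is a minimal sketch, not Sage's actual build script: `run_make` and `build_with_fallback` are hypothetical names, and the only make variables used are `USE_THREAD` and `TARGET`, which appear elsewhere in this thread. A real script may also need to clean the build tree between attempts.

```python
import subprocess


def run_make(extra_args, cwd="."):
    """Run make with the given variable assignments; return True on success."""
    result = subprocess.run(["make", *extra_args], cwd=cwd)
    return result.returncode == 0


def build_with_fallback(fallback_targets, runner=run_make,
                        base_args=("USE_THREAD=0",)):
    """Try auto-detection first, then each explicit TARGET in order.

    Returns "auto" if the plain build worked, otherwise the first
    TARGET that built. Raises RuntimeError if every attempt fails.
    """
    if runner(list(base_args)):
        return "auto"
    for target in fallback_targets:
        if runner([*base_args, f"TARGET={target}"]):
            return target
    raise RuntimeError("OpenBLAS build failed for all targets: "
                       + ", ".join(fallback_targets))
```

Calling `build_with_fallback(["HASWELL", "ATOM"])` would attempt the fast target before the conservative one; passing a different `runner` makes the logic testable without invoking make.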
Probably you need to build with DYNAMIC_ARCH=1 and port at least 2 cpuid files (cpuid_x86.c, and dynamic.c aliasing ZEN to HASWELL) to somewhat support newer CPUs (i.e. make it p2 where you have p0). Also, a top-level option to use -lblas (debian/ubuntu) or -lopenblas (suse/fedora/epel) will not hurt anyone.
If I understand correctly: instead of specifying the TARGET, we can just use DYNAMIC_ARCH=1 to force finding a target? And the two cpuid files that you suggested (cpuid_x86.c and drivers/other/dynamic.c) would need to be imported into 0.2.19 for newer CPUs to be properly handled? I don't understand what you mean by the p2 vs p0 thing in reference to the two files. What do we need to make p2? (I'm also confused about the top-level options you're talking about. Do you mean when I'm actually building OpenBLAS, or in some other instance?)
DYNAMIC_ARCH=1 will enable runtime core detection (compiling all possible kernels)
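For a build script that has to choose between a fixed TARGET and DYNAMIC_ARCH=1, the make command line can be assembled in one place. The helper below is a hypothetical sketch (the name `openblas_make_args` is invented here); it only emits `USE_THREAD`, `DYNAMIC_ARCH` and `TARGET`, the variables mentioned in this thread.

```python
def openblas_make_args(target=None, dynamic_arch=False, use_thread=False):
    """Build the argv for an OpenBLAS make invocation.

    target:       explicit TARGET name such as "HASWELL", or None to let
                  the Makefile auto-detect the CPU.
    dynamic_arch: add DYNAMIC_ARCH=1 so kernels are selected at runtime.
    use_thread:   maps to USE_THREAD=1 or USE_THREAD=0.
    """
    args = ["make", f"USE_THREAD={int(use_thread)}"]
    if dynamic_arch:
        args.append("DYNAMIC_ARCH=1")
    if target:
        args.append(f"TARGET={target}")
    return args


if __name__ == "__main__":
    print(" ".join(openblas_make_args(dynamic_arch=True)))
```

Keeping the argument list as data rather than a shell string makes it easy to log exactly which variant of the build was attempted.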
Or... not doing general revolution just change cpuid_x86.c replacing
I'm not familiar enough with the rest of the sage build to remove the march=native component quite yet. I know we are using it in other places, so even though openblas might not need it anymore, others might, and I'm hesitant to remove it willy-nilly for now. I'll see about removing it at least in the openblas case. So from what I understand, you're now saying I just need to update the cpuid_x86.c file with the latest one and continue to use HASWELL as the target for "default" builds?
To support people with their own builds you need to backport cpuid_x86.c to v0.2.19, namely rebranding all new AMD ZEN series as HASWELL. It is not absolutely imperative to remove march=native; it is just that such a build will not work on a slightly different or slightly older CPU than the build CPU. DYNAMIC_ARCH and dynamic.c would be necessary to have a universal OpenBLAS for a universal build (but that is not usable while the rest of the code is built with march=native).
Okle dokle. I think that helps a lot. I'm gonna leave the cpuid_x86.c file as is in our current version, as we are already using v0.2.19 with minimal patches implemented. I think in the end I'll be suggesting we just resort to ATOM for the target if the initial "auto target" doesn't work properly for whatever reason. Thanks for all your help.
Remember what I said about system -lblas and -lopenblas (if you are very handy with dlopen you may try an intelligent wrapper for BLAS). ATOM will be super slow for haswell/zen really (like 3-5x). Take this to have optimal user builds (download current file version, then
Yup. I made note of the -lblas / -lopenblas stuff. Basically I'm making ATOM the default, with the ability for a user to use HASWELL if they deem it necessary. It would be near impossible for us to know whether a random user's CPU is "new" enough for HASWELL or needs to revert to ATOM, so it's easier to just push everyone to ATOM even if it's slower and allow them to switch to HASWELL if they desire. As for changing the file: the problem in our case is that the inclusion of the program is automated. I'll need to make a patch file in order to update the file, so I'll be doing that in the coming days to get it up and running, but I wouldn't be able to just upload the current file as is, unfortunately.
Regarding the patch: probably ask the ones who added p0 after the openblas version number. It must be there for a practical reason.
So I have two questions. I'll start with some basic info and the errors I'm getting and then I'll ask the questions at the very bottom.
Makefile:123: *** OpenBLAS: Detecting CPU failed. Please set TARGET explicitly, e.g. make TARGET=your_cpu_target. Please read README for the detail.. Stop.
make[3]: Leaving directory '/sage/sage/local/var/tmp/sage/build/openblas-0.2.19.p0/src'
Error building OpenBLAS
Here is some info from the log file itself:
`
Host system:
Linux aram-UX330UAK 4.10.0-22-generic #24-Ubuntu SMP Mon May 22 17:43:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
C compiler: gcc
C compiler version:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/sage/sage/local/libexec/gcc/x86_64-unknown-linux-gnu/5.4.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../src/configure --prefix=/sage/sage/local --with-local-prefix=/sage/sage/local --with-gmp=/sage/sage/local --with-mpfr=/sage/sage/local --with-mpc=/sage/sage/local --with-system-zlib --disable-multilib --disable-nls --enable-languages=c,c++,fortran --disable-libitm
Thread model: posix
gcc version 5.4.0 (GCC)
Building OpenBLAS: make USE_THREAD=0
make[3]: Entering directory '/sage/sage/local/var/tmp/sage/build/openblas-0.2.19.p0/src'
`
Also, here is the cpu info
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Stepping: 9
CPU MHz: 607.824
CPU max MHz: 3500.0000
CPU min MHz: 400.0000
BogoMIPS: 5808.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
Question 1) I'm able to get the make to work by doing TARGET=ATOM but I'm not sure that's what I'm supposed to be setting the target to. What is the recommended target for my cpu?
Question 2) We're using openblas in a software that gets made on multiple computers and we've noticed that occasionally it won't make due to target failures. Is there any way for us to find what the target should be? Or some way we can make a "close enough" guess so that the number of issues raised is minimized? We've had at least 5 instances of this problem over the last year brought up to us which means there are likely many more that have not been mentioned.
Thanks.