Target Make #1204

Closed
thecaligarmo opened this issue Jun 14, 2017 · 15 comments
Comments

@thecaligarmo

So I have two questions. I'll start with some basic info and the errors I'm getting and then I'll ask the questions at the very bottom.

Makefile:123: *** OpenBLAS: Detecting CPU failed. Please set TARGET explicitly, e.g. make TARGET=your_cpu_target. Please read README for the detail..  Stop.
make[3]: Leaving directory '/sage/sage/local/var/tmp/sage/build/openblas-0.2.19.p0/src'
Error building OpenBLAS

Here is some info from the log file itself:
`


Host system:
Linux aram-UX330UAK 4.10.0-22-generic #24-Ubuntu SMP Mon May 22 17:43:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


C compiler: gcc
C compiler version:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/sage/sage/local/libexec/gcc/x86_64-unknown-linux-gnu/5.4.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../src/configure --prefix=/sage/sage/local --with-local-prefix=/sage/sage/local --with-gmp=/sage/sage/local --with-mpfr=/sage/sage/local --with-mpc=/sage/sage/local --with-system-zlib --disable-multilib --disable-nls --enable-languages=c,c++,fortran --disable-libitm
Thread model: posix
gcc version 5.4.0 (GCC)


Building OpenBLAS: make USE_THREAD=0
make[3]: Entering directory '/sage/sage/local/var/tmp/sage/build/openblas-0.2.19.p0/src'
`

Also, here is the cpu info
$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 142
Model name:            Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Stepping:              9
CPU MHz:               607.824
CPU max MHz:           3500.0000
CPU min MHz:           400.0000
BogoMIPS:              5808.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              4096K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp

Question 1) I'm able to get the make to work by doing TARGET=ATOM but I'm not sure that's what I'm supposed to be setting the target to. What is the recommended target for my cpu?

Question 2) We're using openblas in a software that gets made on multiple computers and we've noticed that occasionally it won't make due to target failures. Is there any way for us to find what the target should be? Or some way we can make a "close enough" guess so that the number of issues raised is minimized? We've had at least 5 instances of this problem over the last year brought up to us which means there are likely many more that have not been mentioned.

Thanks.

@martin-frbg
Collaborator

For question 1), please see the similar issue #1203 from yesterday: Kaby Lake CPUs (which were just starting to become available when 0.2.19 was released) are best supported by the HASWELL target.
For question 2), the best solution would probably be a 0.2.20 release, but it is unclear at the moment when the project leader will resurface to create it. The next best option is to use a current git snapshot (which already has the newer CPU model numbers and several other improvements over 0.2.19). In general, for any reasonably powerful Intel CPU (and the recently released AMD Ryzen), HASWELL should be a suitable default if autodetection fails, while ATOM or NEHALEM are expected to work on low-end hardware.
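
A minimal sketch of the fallback rule above, for build scripts that need a guess when autodetection fails. The function name `guess_target` is hypothetical, and treating "has both `avx2` and `fma` in its flag list" as the test for a reasonably powerful CPU is an assumption of this sketch, not something OpenBLAS does itself. Anything else falls back to ATOM, the target that is already known to build in this thread.

```shell
# guess_target FLAGS
# Print a fallback OpenBLAS TARGET for a space-separated CPU flag list
# (the format of the "flags" line in /proc/cpuinfo or "Flags" in lscpu).
# Assumption: avx2 + fma is taken to mean the HASWELL kernels can run.
guess_target() {
    flags=" $1 "
    case "$flags" in
        *" avx2 "*)
            case "$flags" in
                *" fma "*) echo HASWELL; return 0 ;;
            esac ;;
    esac
    # Conservative default for everything else.
    echo ATOM
}

# Possible use on Linux (not executed here):
#   flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   make TARGET="$(guess_target "$flags")"
```

For the flag list in the first post, which contains both `fma` and `avx2`, this picks HASWELL.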

@thecaligarmo
Author

What's the main difference between ATOM and NEHALEM? When would one be used over another?

@thecaligarmo
Author

Also, when you refer to taking a git snapshot, are you referring to using the current "develop" version? Or are you talking of another branch? We need a stable branch for our software, so I'd be ok with taking "develop" as long as it's deemed "stable" (or maybe another branch which is considered stable)

Thanks

@brada4
Contributor

brada4 commented Jun 14, 2017

You use sagemath, genuine 0.2.19
You should ask their forums for improvement (like linking to recent libopenblas.so.0)
Their documentation looks abandoned at best.

@thecaligarmo
Author

I'm one of the developers for sagemath and I'm trying to improve the linking in regards to openblas =) Hence the questions above. We already have an implementation where a target can be specified, but I'm trying to make it so that when a target isn't specified and our attempt to build fails, we retry with a specified target.

Additionally, I want to update our version of openblas if at all possible, but we'd prefer a stable version, hence my asking whether there is a stable version other than master (which hasn't been updated for a while).

As to documentation... we're working on it. We're open source so it can be hard to get decent documentation, but we try our best =)
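
The retry idea described above (build without a target first, then retry with one if that fails) could be sketched roughly as follows. The function name `build_with_fallback` and the `MAKE` override variable are hypothetical and only illustrate the control flow; this is not how the Sage build is actually written. `TARGET=` itself is the option named in the OpenBLAS error message.

```shell
# build_with_fallback FALLBACK_TARGET [make arguments...]
# First try a plain build so OpenBLAS can autodetect the CPU. If that
# fails for any reason, retry once with TARGET pinned to the fallback.
# MAKE is a hypothetical override so the make program can be swapped out.
build_with_fallback() {
    fallback="$1"
    shift
    if "${MAKE:-make}" "$@"; then
        return 0
    fi
    echo "Build failed; retrying with TARGET=$fallback" >&2
    "${MAKE:-make}" "$@" "TARGET=$fallback"
}

# Possible use (not executed here):
#   build_with_fallback HASWELL USE_THREAD=0
```

Note that this retries on any failure, not only on "Detecting CPU failed", so a failed first attempt should be cleaned up or at least logged before relying on the second result.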

@brada4
Contributor

brada4 commented Jun 14, 2017

You probably need to build with DYNAMIC_ARCH=1 and port at least two CPU-detection files (cpuid_x86.c and dynamic.c, aliasing ZEN to HASWELL) to get some support for newer CPUs (i.e. make it p2 where you have p0). Also, a top-level option to use -lblas (debian/ubuntu) or -lopenblas (suse/fedora/epel) will not hurt anyone.

@thecaligarmo
Author

If I understand correctly: instead of specifying the TARGET, we can just use DYNAMIC_ARCH=1 to force a target to be found? And the two files that you suggested (cpuid_x86.c and drivers/other/dynamic.c) would need to be imported into 0.2.19 for newer CPUs to be handled properly? I don't understand what you mean by p2 vs p0 in reference to the two files. What do we need to make p2? (I'm also confused about the top-level options you're talking about. Do you mean when I'm actually building OpenBLAS, or at some other point?)

@brada4
Contributor

brada4 commented Jun 14, 2017

DYNAMIC_ARCH=1 will enable runtime core detection (compiling all possible kernels).
I see the directory name 0.2.19.p0, so you can apply your own patch to improve CPUID runtime detection (ZEN is new after 0.2.19, and it is effectively a copy of HASWELL for now).
You can probably remove -march=native from the openblas build: it has almost no effect, as all processing is done by assembly kernels, where C compiler options have no effect.

@brada4
Contributor

brada4 commented Jun 14, 2017

Or... without starting a general revolution, just change cpuid_x86.c, replacing *_ZEN with *_HASWELL, and put the file in place in the source tree. That will get you (and other people too) past the error in the first post.
DYNAMIC_ARCH is not a good fit where you build almost everything else with -march=native.
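
To make the trade-off above concrete, here is a small sketch that picks between the two OpenBLAS make options discussed so far. The function name `choose_arch_option` and the idea of inspecting the CFLAGS string are assumptions for illustration only; `DYNAMIC_ARCH=1` and `TARGET=` are the options named in this thread.

```shell
# choose_arch_option CFLAGS FALLBACK_TARGET
# If the rest of the build uses -march=native, the result is tied to the
# build machine anyway, so print a fixed TARGET. Otherwise print
# DYNAMIC_ARCH=1 so the kernel is selected at run time.
choose_arch_option() {
    case " $1 " in
        *" -march=native "*) printf 'TARGET=%s\n' "$2" ;;
        *) printf 'DYNAMIC_ARCH=1\n' ;;
    esac
}

# Possible use (not executed here):
#   make USE_THREAD=0 "$(choose_arch_option "$CFLAGS" HASWELL)"
```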

@thecaligarmo
Author

I don't know the rest of the sage build well enough to remove the -march=native component quite yet. I know we are using it in other places, so even though openblas might not need it, other packages might. I'm hesitant to remove it willy-nilly for now, but I'll see about removing it at least in the openblas case.

So from what I understand you're now stating I just need to update the cpuid_x86.c file with the latest one and continue to use HASWELL for the target for "default" builds?
Although this still doesn't answer my two previous questions:

  1. what's the difference between ATOM and NEHALEM and when should one be used over another?
  2. Other than master, is there a stable version of openblas that is more up to date? I know it was mentioned that the project leader seems to not be around (or something?) so not sure what the status is of the package at this point.

@brada4
Contributor

brada4 commented Jun 15, 2017

To support people with their own builds you need to backport cpuid_x86.c to v0.2.19, namely rebranding all new AMD ZEN series as HASWELL. It is not absolutely imperative to remove -march=native; it is just that such a build will not work on a CPU that is slightly different from, or a bit older than, the build CPU.

DYNAMIC_ARCH and dynamic.c would be necessary to have universal OpenBLAS for universal build (but not usable while rest of code is built with march=native)

  1. While newer and with the same instruction set, ATOM is simpler than NEHALEM; each will run the other's code, but with some performance penalty.
  2. The develop branch of this repository is the best version. I told you to cherry-pick one file to help manual builds on last year's CPUs. See "New release" #1118 regarding new releases.

@thecaligarmo
Author

Okie dokie. I think that helps a lot. I'm going to leave the cpuid_x86.c file as it is in our current version, as we are already using v0.2.19 with minimal patches implemented. I think in the end I'll be suggesting we just fall back to ATOM as the target if the initial "auto target" doesn't work properly for whatever reason. Thanks for all your help.

@brada4
Contributor

brada4 commented Jun 16, 2017

Remember what I said about the system -lblas and -lopenblas (if you are very handy with dlopen, you may try an intelligent wrapper for BLAS).

ATOM will be super slow for haswell/zen really (like 3-5x)

Take this to get optimal user builds (download the current version of the file, then run sed -i s/_ZEN/_HASWELL/g on it).
It should support all *trail and *zen CPUs to date with the Haswell kernels (like the one in your first post).
cpuid_x86.c.TXT

@thecaligarmo
Author

Yup. I made note of the -lblas / -lopenblas stuff. Basically I'm making ATOM the default, with the ability for a user to switch to HASWELL if they deem it necessary. It would be nearly impossible for us to know whether a random user's CPU is "new" enough for HASWELL or needs to fall back to ATOM, so it's easier to push everyone to ATOM, even if it's slower, and let them switch to HASWELL if they want.
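
As a rough illustration of "ATOM by default, HASWELL on request", something along these lines would do. The environment variable name `OPENBLAS_TARGET` and the function name `resolve_target` are made up for this sketch and are not an existing Sage or OpenBLAS setting.

```shell
# resolve_target
# Print the TARGET to use when autodetection has failed: the value of the
# hypothetical OPENBLAS_TARGET variable if the user set it, else ATOM.
resolve_target() {
    printf '%s\n' "${OPENBLAS_TARGET:-ATOM}"
}

# Possible use (not executed here):
#   make USE_THREAD=0 TARGET="$(resolve_target)"
# and a user on a newer CPU would run the build with
#   OPENBLAS_TARGET=HASWELL
```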

As for changing the file: the problem in our case is that the inclusion of the program is automated. I'll need to make a patch file in order to update the file, so I'll be doing that in the coming days to get it up and running, but unfortunately I can't just upload the current file as is.
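
One way to turn the suggested `_ZEN` to `_HASWELL` substitution into a patch file is sketched below. The function name `make_zen_alias_patch` and the file names in the usage comment are placeholders. The sketch has not been checked against the real cpuid_x86.c, so the resulting diff should be reviewed (and the patched tree compiled) before shipping it.

```shell
# make_zen_alias_patch FILE PATCH
# Write a unified diff to PATCH that renames every *_ZEN identifier in
# FILE to *_HASWELL. FILE itself is left untouched.
make_zen_alias_patch() {
    src="$1"
    patch_out="$2"
    tmp=$(mktemp) || return 1
    sed 's/_ZEN/_HASWELL/g' "$src" > "$tmp"
    # diff exits with status 1 when the files differ, which is the
    # expected case here; only a status above 1 is a real error.
    status=0
    diff -u -L "a/$src" -L "b/$src" "$src" "$tmp" > "$patch_out" || status=$?
    rm -f "$tmp"
    [ "$status" -le 1 ]
}

# Possible use from the top of the OpenBLAS source tree (not executed here):
#   make_zen_alias_patch cpuid_x86.c zen-as-haswell.patch
```

The `-L` labels only make the diff header show the source path instead of the temporary file name, so the patch can be applied with `patch -p1`.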

@brada4
Contributor

brada4 commented Jun 16, 2017

Regarding the patch: probably ask whoever added p0 after the openblas version number. It must be there for a practical reason.
If you link with -lblas, it is netlib on redhat/fedora and user-selectable on debian/suse/ubuntu.
