Skip to content

Honor cgroup/cpuset limits when enumerating cpus #1249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 25, 2017
Merged

Honor cgroup/cpuset limits when enumerating cpus #1249

merged 1 commit into from
Jul 25, 2017

Conversation

martin-frbg
Copy link
Collaborator

Fix for #1155 redone, with ifdefs reworked to build on non-glibc systems as well

@martin-frbg martin-frbg merged commit ae93532 into OpenMathLib:develop Jul 25, 2017
@jirutka
Copy link
Contributor

jirutka commented Jul 27, 2017

No, it’s not fixed at all, memory.c is the same as before.

@jirutka
Copy link
Contributor

jirutka commented Jul 28, 2017

Hm, I’m not sure now, develop branch passed on Alpine on Travis #1255. Maybe I’ve made some mistake when backporting the fix. I’ll look at it later. (EDIT: No, it’s really broken, see next comment)

@martin-frbg
Copy link
Collaborator Author

If you do not even want the cgroups functionality, you can checkout master instead of develop. I simply reverted the original PR there with the understanding that xianyi will release 0.2.21 as soon as possible to resolve the issue.

@jirutka
Copy link
Contributor

jirutka commented Jul 28, 2017

I was actually right, it’s still broken. However, it fails only with specific flags: BINARY=64 USE_OPENMP=0 NO_LAPACK=0 NO_AFFINITY=1 TARGET=core2. I guess that NO_AFFINITY=1 is the one that reveals this error. See https://travis-ci.org/xianyi/OpenBLAS/jobs/258627266.

@jirutka
Copy link
Contributor

jirutka commented Jul 28, 2017

I didn’t say that I don’t want cgroups functionality, musl of course supports cgroups.

All I want is to update Alpine’s OpenBLAS package from 0.2.19 to the latest version (currently 0.2.20). I can’t, due to 29fc429 that breaks compatibility with non-glibc systems. I’ve tried to pick v0.2.20 and port changes introduced in this PR that is supposed to rework the previous faulty commit, but it has exactly the same problem as the previous one.

After I removed irrelevant changes in whitespaces, this is what I got:

--- a/driver/others/init.c
+++ b/driver/others/init.c
@@ -837,8 +837,12 @@ void gotoblas_affinity_init(void) {
     //returns the number of processors which are currently online

     nums = sysconf(_SC_NPROCESSORS_CONF);
-
-#if !defined(__GLIBC_PREREQ) || !__GLIBC_PREREQ(2, 3)
+
+#if !defined(__GLIBC_PREREQ)
+    common->num_procs = nums;
+#else
+
+#if !__GLIBC_PREREQ(2, 3)
     common->num_procs = nums;
 #elif __GLIBC_PREREQ(2, 7)
     cpusetp = CPU_ALLOC(nums);
@@ -870,8 +874,8 @@ void gotoblas_affinity_init(void) {
     common->num_procs = CPU_COUNT(sizeof(cpu_set_t),cpusetp);
 #endif

-#endif
-
+#endif
+#endif
     if(common -> num_procs > MAX_CPUS) {
       fprintf(stderr, "\nOpenBLAS Warning : The number of CPU/Cores(%d) is beyond the limit(%d). Terminated.\n", common->num_procs, MAX_CPUS);
       exit(1);

It does not fix memory.c.

I didn’t want to revert 29fc429 completely in the package, because it introduces larger change.

@martin-frbg
Copy link
Collaborator Author

If you pull from master instead of develop, you should currently get a 0.2.20 with just the (broken) cgroups patch reverted - that PR was not intended for 0.2.20 anyway as I had done only (too) limited testing. Difference to current develop besides the (apparently still broken) cgroups patch would be the L1 cache detection fix for recent Intel chips. I'll see if I can do (and test!) the fix for memory.c on develop this weekend - past week has been quite stressful for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants