Skip to content

Fix and document the two HUGETLB options for buffer allocation in Makefile.rule #4662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 7, 2024

Conversation

martin-frbg
Copy link
Collaborator

prompted by a recent question - these are undocumented holdovers from GotoBLAS

@XiWeiGu
Copy link
Contributor

XiWeiGu commented Apr 25, 2024

Regarding the option HUGETLB_ALLOCATION, setting it to 1 will define ALLOC_HUGETLB. However, this macro does not change the form of memory allocation. I feel that this option seems to have no effect.

@martin-frbg
Copy link
Collaborator Author

I won't merge this without further testing, this is here so I won't forget.
ALLOC_HUGETLB ends up somewhere in the ALLOC_SHM section of memory.c but it could be that already the surrounding code is never used. Sadly the old GotoBLAS had many plausible-looking hooks for features that were never implemented

@martin-frbg
Copy link
Collaborator Author

Seems one also needs to define ALLOC_SHM, but after that the new code path is actually taken.

@XiWeiGu
Copy link
Contributor

XiWeiGu commented Apr 26, 2024

Seems one also needs to define ALLOC_SHM, but after that the new code path is actually taken.

Indeed, it is necessary to define ALLOC_SHM, but the new path does not enable large page memory allocation. This is because it prioritizes alloc_shm over alloc_hugetlb, and alloc_shm does not have the flag for large pages. It seems a bit confusing (perhaps also due to my misunderstanding of the original intent).

@XiWeiGu
Copy link
Contributor

XiWeiGu commented Apr 26, 2024

Commit ebb9eba9875d68ec7efe3f9e362f0a65384a8277 made the following changes:

@@ -917,12 +917,13 @@ void *blas_memory_alloc(int procpos){
 #ifdef ALLOC_DEVICEDRIVER
     alloc_devicedirver,
 #endif
-#if defined OS_LINUX  || defined OS_AIX  || defined __sun__  || defined OS_WINDOWS
-    alloc_hugetlb,
-#endif
+/* Hugetlb implicitly assumes ALLOC_SHM */
 #ifdef ALLOC_SHM
     alloc_shm,
 #endif
+#if ((defined ALLOC_SHM) && (defined OS_LINUX  || defined OS_AIX  || defined __sun__  || defined OS_WINDOWS))
+    alloc_hugetlb,
+#endif
 #ifdef ALLOC_MMAP
     alloc_mmap,
 #endif

Using ALLOC_SHM to control alloc_shm. alloc_hugetlb also internally uses shmget for large page allocation, So it also uses ALLOC_SHM to control, which leads to never executing alloc_hugetlb (because ALLOC_SHM must be defined, and according to priority, it enters alloc_shm).

@martin-frbg
Copy link
Collaborator Author

Yes, and I'm getting the impression that the PR566 was bogus, there should be no dependency on ALLOC_SHM

@martin-frbg martin-frbg changed the title Document the two HUGETLB options for buffer allocation in Makefile.rule Fix and document the two HUGETLB options for buffer allocation in Makefile.rule May 4, 2024
@martin-frbg martin-frbg added this to the 0.3.28 milestone May 7, 2024
@martin-frbg martin-frbg merged commit 8735b54 into OpenMathLib:develop May 7, 2024
40 of 74 checks passed
@XiWeiGu
Copy link
Contributor

XiWeiGu commented May 7, 2024

Hello, this PR removes the test for compiler options -march=loongarch64 on loongarch64, which will cause compilation failure when using clang.

@XiWeiGu
Copy link
Contributor

XiWeiGu commented May 7, 2024

Another issue is that when COMPILE_TLS is defined, the alloc_hugetlb interface is still controlled by the ALLOC_SHM macro, which prevents the use of large page functionality. (Please correct me if I'm mistaken.)

@martin-frbg
Copy link
Collaborator Author

Oops, sorry, missed that difference in the merge conflict resolution. Restoring in #4679. I will look into cleaning up the USE_TLS side of memory.c later today. (And maybe memory.o should be conditionally compiled from either of two separate C files instead of being the giant Frankenstein monster I created when the TLS PRs landed)

@XiWeiGu
Copy link
Contributor

XiWeiGu commented May 7, 2024

While I'm not intimately familiar with the structure of memory.c, it's clear that it's currently quite bloated with a lot of redundant code. Splitting it into two files seems like a good choice based on what I can see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants