NUMA bindings support for Shm Segments #161

Closed
wants to merge 4 commits into from

Contributor

@vinser52 vinser52 commented Sep 13, 2022

This set of changes allows binding particular memory segments to the specified NUMA node(s).

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 13, 2022
Contributor

@haowu14 haowu14 left a comment

Thank you for making this PR. It looks good overall. Mostly just coding style nits.

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch 2 times, most recently from 79bbf1a to 21b8cbb Compare September 22, 2022 19:19
return retAddr;
}

void PosixShmSegment::unMap(void* addr) const {
detail::munmapImpl(addr, getSize());
}

static void forcePageAllocation(void* addr, size_t size, size_t pageSize) {
for(volatile char* curAddr = (char*)addr; curAddr < (char*)addr+size; curAddr += pageSize) {
Contributor

please use reinterpret_cast instead of c-style casting.
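
As a rough illustration of the requested change, the loop could be written with reinterpret_cast as below. This is only a sketch: the loop body is not shown in the excerpt above, so the read-and-write-back touch, the returned page count, and the assert guard are assumptions added for illustration, not the code in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: touches one byte per page so the kernel has to back each page
// with physical memory. The return value (number of pages touched) is an
// illustrative addition; the function in the PR returns void.
static std::size_t forcePageAllocation(void* addr,
                                       std::size_t size,
                                       std::size_t pageSize) {
  assert(pageSize > 0 && "a zero page size would never advance the loop");
  // reinterpret_cast replaces the C-style (char*) casts flagged in review.
  volatile char* const base = reinterpret_cast<volatile char*>(addr);
  std::size_t touched = 0;
  for (std::size_t offset = 0; offset < size; offset += pageSize) {
    // Read the byte and write it back: the access faults the page in without
    // changing its contents, and volatile keeps the compiler from eliding it.
    base[offset] = base[offset];
    ++touched;
  }
  return touched;
}
```

Indexing from a single base pointer also avoids advancing a pointer past the end of the mapping on the last iteration.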

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 21b8cbb to 682cc63 Compare September 29, 2022 12:16
@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 682cc63 to 9a6bbb1 Compare October 24, 2022 16:58
return;
}

switch (errno) {
Contributor

Do we need a wrapper that handles the different error cases here just to compose the error message? What would strerror() say here?

Also, it seems that we don't use any flag other than MPOL_BIND.

Contributor Author

strerror() is context independent and too abstract.

Regarding the flag variable: today it is 0 when we call mbindImpl() (MPOL_BIND is the mode parameter, not a flag), but it is passed to mbindImpl() as a parameter. What if we set flag to some other value in the future?

If we assume that flag is always 0, like in current implementation, the switch block could be simplified to the following:

switch (errno) {
  case EFAULT:
    util::throwSystemError(errno);
    break;
  case EINVAL:
    util::throwSystemError(errno, "Invalid parameters when bind segment to NUMA node(s)");
    break;
  case ENOMEM:
    util::throwSystemError(errno, "Could not bind memory. Insufficient kernel memory was available");
    break;
  default:
    XDCHECK(false);
    util::throwSystemError(errno, "Invalid errno");
}

Should we simplify the implementation for now?

Contributor

This is just an error path, which is unlikely or unexpected to be hit. Would printing errno along with strerror() be enough for debugging?

Contributor Author

OK, I agree. I have replaced this switch block with strerror().
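
For reference, a generic errno-plus-strerror() report of the kind suggested here might look like the sketch below. The helper names describeErrno and throwErrno are hypothetical and use only the standard library; the PR itself goes through CacheLib's util::throwSystemError, and the exact call after this change is not shown in this conversation.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <system_error>

// Hypothetical helper: formats "<context>: errno=<n> (<strerror text>)".
// Note that std::strerror is not required to be thread-safe.
inline std::string describeErrno(int err, const std::string& context) {
  return context + ": errno=" + std::to_string(err) + " (" +
         std::strerror(err) + ")";
}

// Hypothetical helper: throws std::system_error carrying the errno value, so
// callers can still inspect the numeric code via e.code().value().
[[noreturn]] inline void throwErrno(int err, const std::string& context) {
  throw std::system_error(err, std::generic_category(), context);
}
```

A call site would copy errno into a local variable immediately after the failing call and pass that copy on, since any later library call may overwrite errno.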

Contributor Author

Done

if(opts_.memBindNumaNodes.empty()) return;

struct bitmask *oldNodeMask = numa_allocate_nodemask();
auto guard = folly::makeGuard([&] { numa_bitmask_free(oldNodeMask); });
Contributor

You've already added NumaBitMask, which wraps struct bitmask. Why not use it here instead of this manual cleanup guard? Also, I think you can define a conversion operator to struct bitmask* and pass the NumaBitMask instead of the raw struct bitmask*.

Contributor Author

Done
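
The suggestion above amounts to an RAII owner with a conversion operator. A minimal sketch of that pattern follows. To keep it buildable without libnuma, Bitmask, allocateMask, and freeMask are stand-ins for struct bitmask, numa_allocate_nodemask(), and numa_bitmask_free(), and the class name NumaBitMaskSketch is hypothetical; the real NumaBitMask interface is not shown in this conversation.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for libnuma's `struct bitmask` and its allocate/free functions
// (assumptions for illustration only).
struct Bitmask {
  unsigned long bits = 0;
};
static int gLiveMasks = 0;  // counts outstanding allocations for the demo
static Bitmask* allocateMask() {
  ++gLiveMasks;
  return new Bitmask{};
}
static void freeMask(Bitmask* mask) {
  --gLiveMasks;
  delete mask;
}

// Hypothetical RAII owner: frees the mask in its destructor, so no separate
// scope guard is needed at the call site.
class NumaBitMaskSketch {
 public:
  NumaBitMaskSketch() : mask_(allocateMask()) {}
  ~NumaBitMaskSketch() {
    if (mask_ != nullptr) {
      freeMask(mask_);
    }
  }
  NumaBitMaskSketch(const NumaBitMaskSketch&) = delete;
  NumaBitMaskSketch& operator=(const NumaBitMaskSketch&) = delete;
  NumaBitMaskSketch(NumaBitMaskSketch&& other) noexcept
      : mask_(std::exchange(other.mask_, nullptr)) {}

  void setBit(unsigned n) {
    assert(n < sizeof(unsigned long) * 8);
    mask_->bits |= 1UL << n;
  }

  // Explicit conversion, usable as static_cast<Bitmask*>(owner) when calling
  // C APIs that expect the raw pointer.
  explicit operator Bitmask*() const { return mask_; }

 private:
  Bitmask* mask_;
};

// Stand-in for a C API that takes the raw pointer.
static unsigned long readBits(const Bitmask* mask) { return mask->bits; }
```

Making the conversion explicit keeps the raw pointer from leaking out by accident, at the cost of a static_cast at each C call site.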

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 9a6bbb1 to edd034d Compare November 4, 2022 13:32
Contributor

@jaesoo-fb jaesoo-fb left a comment

Only minor issues.

@jaesoo-fb
Contributor

Nit: I think this will be fixed automatically when we pull the changes in, but could you remove the trailing whitespace and tabs?

$ git diff $(git merge-base --fork-point intern/main) --check
cachelib/cachebench/test_configs/simple_tiers_test.json:15: trailing whitespace.
+
cachelib/cachebench/test_configs/simple_tiers_test.json:23: trailing whitespace.
+
cachelib/cachebench/test_configs/simple_tiers_test.json:26: trailing whitespace.
+
cachelib/cachebench/test_configs/simple_tiers_test.json:29: trailing whitespace.
+
cachelib/cachebench/util/CacheConfig.cpp:86: trailing whitespace.
+
cachelib/shm/ShmCommon.h:103: trailing whitespace.
+
cachelib/shm/SysVShmSegment.cpp:193: trailing whitespace.
+  struct bitmask *nodesMask = static_cast<struct bitmask*>(memBindNumaNodes);
cachelib/shm/SysVShmSegment.cpp:194: trailing whitespace.
+
examples/single_tier_cache/main.cpp:83: space before tab in indent.
+       auto res = put(key, value);

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from edd034d to 24996c4 Compare November 21, 2022 15:05
Contributor

@jaesoo-fb jaesoo-fb left a comment

Only minor issues

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 24996c4 to 1fbc002 Compare November 30, 2022 17:44
Contributor

@jaesoo-fb jaesoo-fb left a comment

LGTM. Could you resolve Hao's comments?

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 1fbc002 to bda5e07 Compare December 2, 2022 17:42
@facebook-github-bot
Contributor

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jaesoo-fb
Contributor

@vinser52 Hi. There are some merge conflicts encountered while importing this (cachebench/util/CacheConfig.cpp). Could you do a rebase on top of main?

Contributor

@jaesoo-fb jaesoo-fb left a comment

While you are doing the rebase, could you add a space after for and if?

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from bda5e07 to 3b3a733 Compare December 15, 2022 12:40
@facebook-github-bot
Contributor

@vinser52 has updated the pull request. You must reimport the pull request before landing.

@vinser52
Contributor Author

> @vinser52 Hi. There are some merge conflicts encountered while importing this (cachebench/util/CacheConfig.cpp). Could you do a rebase on top of main?

Done

@facebook-github-bot
Contributor

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from 3b3a733 to e041015 Compare December 16, 2022 11:05
@facebook-github-bot
Contributor

@vinser52 has updated the pull request. You must reimport the pull request before landing.

@vinser52
Contributor Author

Hi @jaesoo-fb,

I just fixed one minor issue after the merge. Please re-import this PR. Sorry for the inconvenience.

@facebook-github-bot
Contributor

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jaesoo-fb
Contributor

Somehow, I cannot clearly see what has changed, and I also cannot check out the intermediate version 3b3a733. But compared to the rebased version of bda5e07, it seems that the change you mentioned is the one below.

diff --git a/cachelib/cachebench/util/CacheConfig.cpp b/cachelib/cachebench/util/CacheConfig.cpp
index e08198d0..a3a742d2 100644
--- a/cachelib/cachebench/util/CacheConfig.cpp
+++ b/cachelib/cachebench/util/CacheConfig.cpp
@@ -102,7 +102,7 @@ CacheConfig::CacheConfig(const folly::dynamic& configJson) {
   // if you added new fields to the configuration, update the JSONSetVal
   // to make them available for the json configs and increment the size
   // below
-  checkCorrectSize<CacheConfig, 696>();
+  checkCorrectSize<CacheConfig, 728>();

   if (numPools != poolSizes.size()) {
     throw std::invalid_argument(folly::sformat(

Can you confirm?

Also, can you remove this trailing whitespace?

$ git diff 519f664f --check
cachelib/cachebench/util/CacheConfig.h:49: trailing whitespace.
+

@vinser52 vinser52 force-pushed the upstream_numa_bindings branch from e041015 to b5422cf Compare December 16, 2022 23:22
@facebook-github-bot
Contributor

@vinser52 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@jaesoo-fb has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@vinser52
Contributor Author

> Somehow, I cannot clearly see what has changed, and I also cannot check out the intermediate version 3b3a733. But compared to the rebased version of bda5e07, it seems that the change you mentioned is the one below.
>
> diff --git a/cachelib/cachebench/util/CacheConfig.cpp b/cachelib/cachebench/util/CacheConfig.cpp
> index e08198d0..a3a742d2 100644
> --- a/cachelib/cachebench/util/CacheConfig.cpp
> +++ b/cachelib/cachebench/util/CacheConfig.cpp
> @@ -102,7 +102,7 @@ CacheConfig::CacheConfig(const folly::dynamic& configJson) {
>    // if you added new fields to the configuration, update the JSONSetVal
>    // to make them available for the json configs and increment the size
>    // below
> -  checkCorrectSize<CacheConfig, 696>();
> +  checkCorrectSize<CacheConfig, 728>();
>
>    if (numPools != poolSizes.size()) {
>      throw std::invalid_argument(folly::sformat(
>
> Can you confirm?
>
> Also, can you remove this trailing whitespace?
>
> $ git diff 519f664f --check
> cachelib/cachebench/util/CacheConfig.h:49: trailing whitespace.
> +

Yeah, I confirm. After the rebase, checkCorrectSize<CacheConfig, 696>(); was incorrect. I believe you cannot see the intermediate version because I squash every time I address a comment from your side (to avoid a lot of small commits). Is that a bad practice?

Trailing whitespace in cachelib/cachebench/util/CacheConfig.h:49 is removed.
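
The checkCorrectSize<CacheConfig, N>() call in the diff above acts as a guard that has to be updated whenever fields are added to the config struct, which is why the rebase changed the expected value. One way such a guard can work is a compile-time sizeof comparison, sketched below. The implementation and the DemoConfig struct are assumptions for illustration; CacheLib's actual helper is not shown in this conversation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical re-creation of a size guard: compilation fails when the struct
// size no longer matches the expected value, which prompts whoever added a
// field to revisit the code that must stay in sync with the struct.
template <typename T, std::size_t ExpectedSize>
constexpr bool checkCorrectSize() {
  static_assert(sizeof(T) == ExpectedSize,
                "struct size changed; update dependent code and this value");
  return true;
}

// Illustrative config with two 8-byte fields, so sizeof(DemoConfig) == 16.
struct DemoConfig {
  std::uint64_t cacheSizeMB;
  std::uint64_t numPools;
};
```

With this sketch, adding a third std::uint64_t field to DemoConfig would make checkCorrectSize<DemoConfig, 16>() fail to compile until the expected size is raised to 24.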

@jaesoo-fb
Contributor

> Yeah, I confirm. After the rebase, checkCorrectSize<CacheConfig, 696>(); was incorrect. I believe you cannot see the intermediate version because I squash every time I address a comment from your side (to avoid a lot of small commits). Is that a bad practice?

No, I don't think squashing the changes is the problem; I expected the Git repository to keep the old commit accessible until it is garbage-collected. Anyway, no worries about this.

@facebook-github-bot
Contributor

@jaesoo-fb merged this pull request in 26e02bf.

facebook-github-bot pushed a commit that referenced this pull request Feb 27, 2023
Summary:
Fix OSS builds by adding numa deps to build files. Currently some fail on missing `numa.h`.

Context: #161 added the dependencies to the CentOS, Debian, and Ubuntu 18 build files. The PR was opened in Sep 2022 but only landed in Dec 2022, so it probably missed the Fedora, Rocky, and Arch build files, which were added between those dates. Running those build actions on PRs would have caught this (currently, they only run on a schedule).

Pull Request resolved: #197

Test Plan: GitHub Actions builds (ideally, #198 would be landed first). I've checked that those packages exist for the respective repositories but didn't run them myself.

Reviewed By: jaesoo-fb

Differential Revision: D43587970

Pulled By: jiayuebao

fbshipit-source-id: 8c59e48528042350e576a45ffc3bf2520699f5a9
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
5 participants