-
Notifications
You must be signed in to change notification settings - Fork 9.1k
YARN-10863: CGroupElasticMemoryController is not work #3781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
🎊 +1 overall
This message was automatically generated. |
@toddlipcon @phunt @eddyxu can somebody help to code review? |
if ((strictMemoryEnforcement && !elasticMemoryEnforcement) || | ||
(!strictMemoryEnforcement && elasticMemoryEnforcement)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case of strictMemoryEnforcement && elasticMemoryEnforcement
not matching this condition (and thus continuing with the checkLimit). Should this condition just be strictMemoryEnforcement || elasticMemoryEnforcement
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your review, according to the checkLimit notes, strictMemoryEnforcement && elasticMemoryEnforcement condition should fall back to the polling-based mechanism.
// When cgroup-based strict memory enforcement is used alone without // elastic memory control, the oom-kill would take care of it. // However, when elastic memory control is also enabled, the oom killer // would be disabled at the root yarn container cgroup level (all child // cgroups would inherit that setting). Hence, we fall back to the // polling-based mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kimahriman please let me know if there is anything to improve, I really want to contribute to the community,thanks very much。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if the elastic memory control is enabled, it basically disables the CGroup based strict enforcement for some reason, so it has to use the polling mechanism here for that instead. Is that right? I'm trying to find where or why that would be true, but I guess that is what that comment is suggesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kimahriman thanks for your comment, yes, elastic memory control will disabled strict enforcement, so when both elastic memory control and strict enforcement are enabled, containers won’t be killed by the oom killer when they go over their memory limit, it's better falls back to the polling-based mechanism
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @luoge457 - The fix itself looks good, but I think the current comment does not apply to the new condition !strictMemoryEnforcement && elasticMemoryEnforcement
. Would you update the comment as follows?
if (strictMemoryEnforcement && !elasticMemoryEnforcement) {
// When cgroup-based strict memory enforcement is used alone without
// elastic memory control, the oom-kill would take care of it.
// However, when elastic memory control is also enabled, the oom killer
// would be disabled at the root yarn container cgroup level (all child
// cgroups would inherit that setting). Hence, we fall back to the
// polling-based mechanism.
return;
}
if (!strictMemoryEnforcement && elasticMemoryEnforcement) {
// TODO: add your comment
return;
}
Sorry for the late response.
@aajisaka would you please take a look? thanks |
@szilard-nemeth would you please take a look? thanks |
Description of PR
https://issues.apache.org/jira/browse/YARN-10863
How was this patch tested?
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?