@luoge457 luoge457 commented Dec 10, 2021

Description of PR

https://issues.apache.org/jira/browse/YARN-10863

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 50s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 35m 33s | | trunk passed |
| +1 💚 | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | compile | 1m 27s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 💚 | checkstyle | 0m 34s | | trunk passed |
| +1 💚 | mvnsite | 0m 44s | | trunk passed |
| +1 💚 | javadoc | 0m 47s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 💚 | spotbugs | 1m 30s | | trunk passed |
| +1 💚 | shadedclient | 23m 41s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 38s | | the patch passed |
| +1 💚 | compile | 1m 24s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javac | 1m 24s | | the patch passed |
| +1 💚 | compile | 1m 19s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 💚 | javac | 1m 19s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 24s | | the patch passed |
| +1 💚 | mvnsite | 0m 37s | | the patch passed |
| +1 💚 | shellcheck | 0m 1s | | No new issues. |
| +1 💚 | javadoc | 0m 30s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 💚 | spotbugs | 1m 27s | | the patch passed |
| +1 💚 | shadedclient | 23m 15s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 22m 57s | | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 💚 | asflicense | 0m 29s | | The patch does not generate ASF License warnings. |
| | | 120m 47s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3781/1/artifact/out/Dockerfile |
| GITHUB PR | #3781 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell shellcheck shelldocs |
| uname | Linux f8c18c2dd6bf 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / b2d17f0 |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3781/1/testReport/ |
| Max. process+thread count | 519 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3781/1/console |
| versions | git=2.25.1 maven=3.6.3 shellcheck=0.7.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@luoge457
Author

luoge457 commented Jan 7, 2022

@toddlipcon @phunt @eddyxu could one of you help with a code review?

Comment on lines +745 to +746
if ((strictMemoryEnforcement && !elasticMemoryEnforcement) ||
(!strictMemoryEnforcement && elasticMemoryEnforcement)) {
Contributor

What's the use case for strictMemoryEnforcement && elasticMemoryEnforcement not matching this condition (and thus continuing with the checkLimit)? Should this condition just be strictMemoryEnforcement || elasticMemoryEnforcement?

Author

Thanks for your review. According to the comment in checkLimit, when both strictMemoryEnforcement and elasticMemoryEnforcement are enabled, the check should fall back to the polling-based mechanism:

// When cgroup-based strict memory enforcement is used alone without
// elastic memory control, the oom-kill would take care of it.
// However, when elastic memory control is also enabled, the oom killer
// would be disabled at the root yarn container cgroup level (all child
// cgroups would inherit that setting). Hence, we fall back to the
// polling-based mechanism.
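The difference between the condition in the diff and the alternative raised in review can be seen by enumerating all four flag combinations. The following is a minimal sketch, not the NodeManager code: the class `EnforcementCondition` and its method names are hypothetical, and only the two boolean flag names are taken from the diff.

```java
// Hypothetical helper class for illustration; only the two flag names
// (strictMemoryEnforcement, elasticMemoryEnforcement) come from the PR diff.
final class EnforcementCondition {

  // Condition as written in the PR: exactly one of the two flags is set.
  static boolean proposed(boolean strictMemoryEnforcement,
                          boolean elasticMemoryEnforcement) {
    return (strictMemoryEnforcement && !elasticMemoryEnforcement)
        || (!strictMemoryEnforcement && elasticMemoryEnforcement);
  }

  // Alternative raised in review: at least one of the two flags is set.
  static boolean alternative(boolean strictMemoryEnforcement,
                             boolean elasticMemoryEnforcement) {
    return strictMemoryEnforcement || elasticMemoryEnforcement;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    // Print the truth table for both conditions.
    for (boolean strict : values) {
      for (boolean elastic : values) {
        System.out.printf(
            "strict=%b elastic=%b proposed=%b alternative=%b%n",
            strict, elastic,
            proposed(strict, elastic), alternative(strict, elastic));
      }
    }
  }
}
```

The two conditions differ only when both flags are true: the proposed one is false there, so the early return is not taken and the polling-based check continues. For booleans, the proposed condition is equivalent to `strictMemoryEnforcement != elasticMemoryEnforcement`.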

Author

@Kimahriman please let me know if there is anything to improve. I really want to contribute to the community. Thanks very much.

Contributor

So if the elastic memory control is enabled, it basically disables the CGroup based strict enforcement for some reason, so it has to use the polling mechanism here for that instead. Is that right? I'm trying to find where or why that would be true, but I guess that is what that comment is suggesting.

Author

@Kimahriman thanks for your comment. Yes, elastic memory control disables strict enforcement, so when both elastic memory control and strict enforcement are enabled, containers won't be killed by the OOM killer when they go over their memory limit. In that case it is better to fall back to the polling-based mechanism.

Member

Hi @luoge457 - The fix itself looks good, but I think the current comment does not apply to the new condition !strictMemoryEnforcement && elasticMemoryEnforcement. Would you update the comment as follows?

if (strictMemoryEnforcement && !elasticMemoryEnforcement) {
  // When cgroup-based strict memory enforcement is used alone without
  // elastic memory control, the oom-kill would take care of it.
  // However, when elastic memory control is also enabled, the oom killer
  // would be disabled at the root yarn container cgroup level (all child
  // cgroups would inherit that setting). Hence, we fall back to the
  // polling-based mechanism.
  return;
}
if (!strictMemoryEnforcement && elasticMemoryEnforcement) {
  // TODO: add your comment
  return;
}

Sorry for the late response.
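The suggested split into two separate early returns can be sketched as a small decision function. This is only an illustration under stated assumptions: the class `PollingCheckSketch`, the `Outcome` enum, and `decide` are hypothetical names, and the real method would return early or continue to the limit check rather than return a value.

```java
// Hypothetical sketch; none of these names exist in the Hadoop code base.
final class PollingCheckSketch {

  // Which branch the check takes for a given pair of flags.
  enum Outcome { SKIP_STRICT_ONLY, SKIP_ELASTIC_ONLY, POLL }

  static Outcome decide(boolean strictMemoryEnforcement,
                        boolean elasticMemoryEnforcement) {
    if (strictMemoryEnforcement && !elasticMemoryEnforcement) {
      // Strict enforcement alone: per the existing code comment,
      // the oom-kill takes care of it, so polling is skipped.
      return Outcome.SKIP_STRICT_ONLY;
    }
    if (!strictMemoryEnforcement && elasticMemoryEnforcement) {
      // Elastic memory control alone: the new case added by the PR.
      // The explanatory comment is left as a TODO in the review.
      return Outcome.SKIP_ELASTIC_ONLY;
    }
    // Both flags set, or neither: continue with the polling-based check.
    return Outcome.POLL;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    for (boolean strict : values) {
      for (boolean elastic : values) {
        System.out.printf("strict=%b elastic=%b -> %s%n",
            strict, elastic, decide(strict, elastic));
      }
    }
  }
}
```

Keeping the two branches separate leaves the skip behaviour of the combined condition unchanged while giving each case room for its own comment.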

@luoge457
Author

@aajisaka would you please take a look? Thanks.

@luoge457
Author

@szilard-nemeth would you please take a look? Thanks.
