-
Notifications
You must be signed in to change notification settings - Fork 9.1k
YARN-11015. Decouple queue capacity with ability to run OPPORTUNISTIC container #3779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Adds queueing policies `BY_RESOURCES` and `BY_QUEUE_LEN` at the NM * If `BY_RESOURCES` is specified, the NM will queue as long as it has enough resources to run all pending + running containers, otherwise, it will reject the OPPORTUNISTIC container * If BY_QUEUE_LEN is specified, the NM will only accept as many containers as its queue capacity is configured * Restructure `TestContainerSchedulerQueueing` to accommodate different queueing policies at the NM
🎊 +1 overall
This message was automatically generated. |
queuePolicy = YarnConfiguration | ||
.DEFAULT_NM_OPPORTUNISTIC_CONTAINERS_QUEUE_POLICY; | ||
} else { | ||
queuePolicy = context.getConf().get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Configuration#getEnum() could be an option
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
Show resolved
Hide resolved
...server/nodemanager/containermanager/scheduler/AllocationBasedResourceUtilizationTracker.java
Show resolved
Hide resolved
(int) (pMemBytes >> 20) > | ||
(int) (getContainersMonitor() | ||
.getPmemAllocatedForContainers() >> 20)) { | ||
(int) convertBytesToMB(pMemBytes) > |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to cast?
...org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
Show resolved
Hide resolved
|
||
@Override | ||
public void preTransition(ContainerImpl op, | ||
org.apache.hadoop.yarn.server.nodemanager.containermanager.container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didnt you import this? ContainerState should be correct.
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
} | ||
|
||
private static boolean isSuccessfulRun(final ContainerStatus containerStatus) { | ||
return containerStatus.getContainerSubState() == ContainerSubState.RUNNING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split this?
if (containerStatus.getContainerSubState() == ContainerSubState.RUNNING) {
return true;
}
...
|
||
// Wait for the OContainer to get killed | ||
BaseContainerManagerTest.waitForNMContainerState(containerManager, | ||
createContainerId(1), ContainerState.DONE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line instead of dropping the 40
@goiri thanks for the review! I've updated the PR to address your comments, please have another look. |
🎊 +1 overall
This message was automatically generated. |
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
} | ||
|
||
private static OpportunisticContainersQueuePolicy | ||
getOppContainersQueuePolicyFromConf(final Context context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getOppContainersQueuePolicyFromConf(final Context context) {: 'getOppContainersQueuePolicyFromConf' has incorrect indentation level 2, expected level should be 6. [Indentation]
|
||
// Wait for the OContainer to get killed | ||
BaseContainerManagerTest.waitForNMContainerState(containerManager, | ||
createContainerId(1), ContainerState.DONE,40); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
createContainerId(1), ContainerState.DONE,40);:50: ',' is not followed by whitespace. [WhitespaceAfter]
@goiri updated to fix the checkstyle violations, thanks for the review! |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@afchung could you please provide details on the testing that was done apart from the unit tests? It would be great if you could add the details to the How was this patch tested
section in description of this PR.
@GauthamBanasandra thanks for taking a look! In addition to unit tests, we also deployed the change to a production cluster. I've added more details on what we observed in our deployment to the description of the PR. Thanks! |
Thanks @afchung. Please allow me 2-3 days for reviewing this PR. |
*/ | ||
BY_QUEUE_LEN, | ||
/** | ||
* Determines wheether or not to run a container based on the amount of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo "wheether"
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
if (subState == ContainerSubState.DONE) { | ||
return state == | ||
org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if
check here is redundant, since the condition is already satisfied by switch-case
. You can just have
case DONE:
return state == org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, the subState
can be COMPLETING
, in which case we should not check state == org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for clarifying. It seems a bit convoluted at first 😅. How about this?
case RUNNING:
case COMPLETING:
return true;
case DONE:
return state == org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE;
default:
return false;
...r/nodemanager/containermanager/scheduler/TestContainerSchedulerOppContainersByResources.java
Show resolved
Hide resolved
...r/nodemanager/containermanager/scheduler/TestContainerSchedulerOppContainersByResources.java
Show resolved
Hide resolved
@Test | ||
public void testOpportunisticRunsWhenResourcesAvailable() throws Exception { | ||
containerManager.start(); | ||
List<StartContainerRequest> list = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename the variable with a more specific one. The name list
is quite generic.
|
||
/** | ||
* Tests that newly arrived containers after the resources are filled up | ||
* get killed and never get killed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please the part get killed and never get killed
a little more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this was a typo.
@Test | ||
public void testKillOpportunisticWhenNoResourcesAvailable() throws Exception { | ||
containerManager.start(); | ||
List<StartContainerRequest> list = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a more specific variable name here.
*/ | ||
@Test | ||
public void testOpportunisticRunsWhenResourcesAvailable() throws Exception { | ||
containerManager.start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is repeated in both the tests and can be moved to @before (the test setup method in junit).
...r/nodemanager/containermanager/scheduler/TestContainerSchedulerOppContainersByResources.java
Show resolved
Hide resolved
@GauthamBanasandra thanks for the review! I've updated the PR to address your comments, please have another pass. |
🎊 +1 overall
This message was automatically generated. |
return this.utilizationTracker.hasResourcesAvailable(container); | ||
} | ||
|
||
private boolean resourceAvailableToQueueOppContainer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we are doing this check after localization is done. Shouldn't this check be done at the time of container start to avoid wasting NM resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC I didn't do that since it's much more complicated and requires the ContainerScheduler
to expose internal state.
Could you point me to where this can be done cleanly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked code and looks like it is same case with existing code also. I think there is scope of improvement there and that can be handle as separate jira. @goiri what do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this JIRA we can leave it as is for consistency with the rest.
We can open a JIRA to move it everywhere (it might have side effects).
...a/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerSchedulerTest.java
Show resolved
Hide resolved
@minni31 thanks for the review! Have replied to your comments, please have a look. |
if (subState == ContainerSubState.DONE) { | ||
return state == | ||
org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for clarifying. It seems a bit convoluted at first 😅. How about this?
case RUNNING:
case COMPLETING:
return true;
case DONE:
return state == org.apache.hadoop.yarn.api.records.ContainerState.COMPLETE;
default:
return false;
@GauthamBanasandra have updated the PR to address your latest comments. Please have a look. |
@afchung I think the build failure is temporary and unrelated. I've retriggered the CI run now. |
It has got past the failed stage now. |
🎊 +1 overall
This message was automatically generated. |
return this.utilizationTracker.hasResourcesAvailable(container); | ||
} | ||
|
||
private boolean resourceAvailableToQueueOppContainer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this JIRA we can leave it as is for consistency with the rest.
We can open a JIRA to move it everywhere (it might have side effects).
Description of PR
BY_RESOURCES
andBY_QUEUE_LEN
at the NMBY_RESOURCES
is specified, the NM will queue as long as it hasenough resources to run all pending + running containers, otherwise, it
will reject the OPPORTUNISTIC container
BY_QUEUE_LEN
is specified, the NM will only accept as manycontainers as its queue capacity is configured
TestContainerSchedulerQueueing
to accommodate differentqueueing policies at the NM
How was this patch tested?
BY_QUEUE_LEN
, the behavior is the same as before and forBY_RESOURCES
, we observe that the NM does not accept more containers than its resources can handle