mbelenfa
No description provided.

@mbelenfa mbelenfa closed this Jun 22, 2017
shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
…RM fail-over

1. Improved our container handling logic to be resilient to phantom notifications.
2. Added a new metric to Samza's ContainerProcessManager module that tracks the number of such invalid notifications.
3. Added a couple of tests that simulate the exact scenario we encountered during the cluster upgrade (container starts -> container fails -> legitimate notification for the failure -> container restarts -> RM fail-over -> phantom notification with a different exit code).
4. As an aside, many tests in ContainerProcessManager relied on Thread.sleep to ensure that threads ran in a certain order. Removed this non-determinism and made the tests predictable.
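
For illustration only, the idea in items 1 and 2 could be sketched roughly as below. This is not the code from the commit: the class `PhantomNotificationSketch` and all of its method names are hypothetical, and the assumption is that a "phantom" notification is a completion event for a container the manager no longer tracks as running. Such events are counted and otherwise ignored.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and are not Samza's actual API.
class PhantomNotificationSketch {
    // Containers this manager currently believes are running.
    private final Set<String> runningContainers = new HashSet<>();
    // Counts completion notifications that match no running container.
    private final AtomicInteger invalidNotifications = new AtomicInteger();

    synchronized void onContainerStarted(String containerId) {
        runningContainers.add(containerId);
    }

    // Returns true if the notification was acted upon, false if it was
    // ignored as a phantom (unknown or already-completed container).
    synchronized boolean onContainerCompleted(String containerId, int exitCode) {
        if (!runningContainers.remove(containerId)) {
            // Assumed policy: count the event, then drop it without side effects.
            invalidNotifications.incrementAndGet();
            return false;
        }
        // A real manager would inspect exitCode here and decide whether
        // to request a replacement container.
        return true;
    }

    int invalidNotificationCount() {
        return invalidNotifications.get();
    }

    public static void main(String[] args) {
        PhantomNotificationSketch manager = new PhantomNotificationSketch();
        manager.onContainerStarted("container_1");
        // Legitimate failure notification for a running container.
        System.out.println(manager.onContainerCompleted("container_1", 1));   // prints "true"
        // The replacement container starts under a new id.
        manager.onContainerStarted("container_2");
        // After an RM fail-over, the old failure is reported again with a different exit code.
        System.out.println(manager.onContainerCompleted("container_1", 137)); // prints "false"
        System.out.println(manager.invalidNotificationCount());               // prints "1"
    }
}
```

Because the handler returns a result instead of relying on timing, a test of this sketch can replay the sequence from item 3 directly, with no Thread.sleep.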

Author: Jagadish Venkatraman <[email protected]>

Reviewers: Jake Maes <[email protected]>

Closes apache#243 from vjagadish1989/am-bug
shanthoosh added a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
Changes
* Fix checkstyle errors from apache#243
* Fix failure after bad merge in apache#244

Author: Shanthoosh Venkataraman <[email protected]>

Reviewers: Navina Ramesh <[email protected]>

Closes apache#252 from shanthoosh/fix_NPE_after_master_merge
steveloughran pushed a commit to steveloughran/hadoop that referenced this pull request Aug 5, 2025
…Builder reports an issue with the container-executor (apache#7290) (apache#232) (apache#243)

Change-Id: Iaaa94c8f46faa4feaede27de36e0d94483ae0229
(cherry picked from commit e6ad8c4)