Initialize normalizer with mean from first trajectory #4299
Conversation
Looks good. A unit test would be great, but not a requirement if it's too hard to do (or too overfit to TensorFlow).
LGTM for a TF fix - let's separately figure out how to do it for PyTorch.
Note for posterity: if the first update has none of the large observations, it's still possible to get a NaN. But this should fix the issue for most cases.
To clarify, it's possible to get a NaN if the values in successive trajectories are very different and don't follow anything close to a normal distribution (an assumption of Welford's algorithm).
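For reference, this is the textbook per-sample form of the Welford update the reviewers are referring to (a sketch for illustration; the trainer applies a batched variant, but the recurrence has the same shape):

```python
def welford_step(count, mean, m2, x):
    """One step of Welford's online mean/variance algorithm.

    m2 accumulates the sum of squared deviations from the running mean;
    sample variance is m2 / (count - 1).
    """
    count += 1
    delta = x - mean    # deviation from the old mean
    mean += delta / count
    delta2 = x - mean   # deviation from the new mean
    m2 += delta * delta2
    return count, mean, m2

count, mean, m2 = 0, 0.0, 0.0
for x in [1799.0, 1800.0, 1801.0]:
    count, mean, m2 = welford_step(count, mean, m2, x)
print(mean)              # 1800.0
print(m2 / (count - 1))  # sample variance: 1.0
```

Mathematically each `delta * delta2` increment is non-negative, but when the running mean starts far from the data (0 vs. ~1800 here) both factors become huge, which is what makes the float32 batched version fragile.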
@@ -123,6 +123,9 @@ def make_fake_trajectory(
             memory=memory,
         )
         steps_list.append(experience)
+    obs = []
+    for _shape in observation_shapes:
+        obs.append(np.ones(_shape, dtype=np.float32))
Done so that changing the last obs doesn't overwrite the second-to-last obs.
Proposed change(s)
To address MLA-1213. The issue is essentially that the difference between the initial average (0) and the first trajectory's average (~1800) is (1) too large and (2) skewed negative, i.e. the resulting change in variance is negative and too large in magnitude. With observations of ~1800 on Walker, we are essentially updating the variance with 1800 * -x < -1, so the change drives the variance below 0, leading to NaNs. The correct way to initialize this algorithm is to use the mean of the first samples as the initial mean.
The assumption is that samples are coming from a normal distribution, so 'seeing' a 0 and then all 1800s is, from the algo's perspective, an incredibly low probability event.
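As a rough illustration of the fix (a NumPy sketch, not the actual ML-Agents TensorFlow graph; the class and method names here are made up), a batched Welford-style normalizer that seeds its running mean from the first batch it sees:

```python
import numpy as np

class RunningNormalizer:
    """Batched Welford-style running mean/variance (illustrative sketch)."""

    def __init__(self):
        self.steps = 1        # ML-Agents-style initial step count
        self.mean = None      # seeded lazily from the first batch (the fix)
        self.var_sum = 0.0    # running sum of squared deviations

    def update(self, batch: np.ndarray) -> None:
        if self.mean is None:
            # The fix: initialize the running mean from the first batch,
            # so the first deltas are ~O(batch spread) instead of ~1800.
            self.mean = batch.mean()
        new_steps = self.steps + len(batch)
        delta_old = batch - self.mean            # x - old_mean
        new_mean = self.mean + delta_old.sum() / new_steps
        delta_new = batch - new_mean             # x - new_mean
        # With a zero initial mean and observations near 1800, both factors
        # below are huge; in float32 the tiny true variance can be lost to
        # rounding and the sum can come out negative (per the description
        # above). Seeding the mean keeps both factors small.
        self.var_sum += np.sum(delta_new * delta_old)
        self.steps = new_steps

    def normalize(self, x: np.ndarray) -> np.ndarray:
        var = max(self.var_sum / (self.steps - 1), 1e-8)
        return (x - self.mean) / np.sqrt(var)

# First trajectory: all observations near 1800, as on Walker.
rng = np.random.default_rng(0)
norm = RunningNormalizer()
norm.update(1800.0 + rng.normal(0.0, 1.0, size=64))
print(norm.var_sum >= 0)  # True: variance stays non-negative, no NaN
```

Because the mean is seeded from the first batch, the first update reduces to summing squared deviations within that batch, which is non-negative by construction.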
Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
Types of change(s)
Checklist
Other comments