
Investigate Exponential Moving Average result in classification script #4391

Closed
prabhat00155 opened this issue Sep 10, 2021 · 2 comments · Fixed by #4406

Comments

@prabhat00155
Contributor

prabhat00155 commented Sep 10, 2021

🐛 Describe the bug

Verify that the EMA added to the classification script in #4381 works as expected.
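For reference, a minimal sketch of what an exponential moving average of weights computes. It is not taken from #4381: the function name `ema_update`, the toy weights, and the decay of 0.5 are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
def ema_update(ema_params, model_params, decay):
    """Return new EMA parameters: decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, model_params)]


# Hypothetical toy "weights": the model stays at [1.0, 2.0] for three steps,
# while the EMA copy starts at zero and moves toward it.
ema = [0.0, 0.0]
for step_params in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    ema = ema_update(ema, step_params, decay=0.5)

print(ema)  # [0.875, 1.75]
```

Since the averaged weights move toward the trained weights over time, one would expect the accuracy of a working EMA model to follow the regular model rather than stay flat.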

Versions

N/A

cc @datumbox

@datumbox
Contributor

The script might not be working as expected in all use cases. Here are the results I get by running the following:

$ PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --model mobilenet_v3_large --epochs 600 --opt rmsprop --batch-size 128 --lr 0.064 --wd 0.00001 --lr-step-size 2 --lr-gamma 0.973 --auto-augment ta_wide --random-erase 0.2 --model-ema
Submitted job_id: 46058186


$ grep "Acc@1" ./46058186/46058186_0_log.out
 * Acc@1 5.300 Acc@5 15.888
 * Acc@1 0.126 Acc@5 0.490
 * Acc@1 16.598 Acc@5 37.364
 * Acc@1 0.120 Acc@5 0.496
 * Acc@1 24.016 Acc@5 47.174
 * Acc@1 0.122 Acc@5 0.502
 * Acc@1 27.202 Acc@5 51.712
 * Acc@1 0.114 Acc@5 0.494
 * Acc@1 33.306 Acc@5 59.140
 * Acc@1 0.112 Acc@5 0.498
 * Acc@1 34.138 Acc@5 60.178
 * Acc@1 0.108 Acc@5 0.512
 * Acc@1 36.414 Acc@5 62.486
 * Acc@1 0.104 Acc@5 0.528
 * Acc@1 37.536 Acc@5 64.172
 * Acc@1 0.102 Acc@5 0.516
 * Acc@1 40.744 Acc@5 66.738
 * Acc@1 0.100 Acc@5 0.528
 * Acc@1 40.974 Acc@5 67.410
 * Acc@1 0.100 Acc@5 0.516
 * Acc@1 40.950 Acc@5 67.206
 * Acc@1 0.102 Acc@5 0.494
 * Acc@1 41.506 Acc@5 67.552
 * Acc@1 0.104 Acc@5 0.500
 * Acc@1 39.652 Acc@5 65.506
 * Acc@1 0.106 Acc@5 0.494
 * Acc@1 44.440 Acc@5 70.518
 * Acc@1 0.104 Acc@5 0.500
 * Acc@1 44.886 Acc@5 71.206
 * Acc@1 0.106 Acc@5 0.496
 * Acc@1 45.132 Acc@5 71.030
 * Acc@1 0.108 Acc@5 0.502
 * Acc@1 43.892 Acc@5 69.802
 * Acc@1 0.100 Acc@5 0.516
 * Acc@1 45.218 Acc@5 71.394
 * Acc@1 0.092 Acc@5 0.494
 * Acc@1 41.976 Acc@5 68.116
 * Acc@1 0.112 Acc@5 0.486
 * Acc@1 42.848 Acc@5 68.496
 * Acc@1 0.106 Acc@5 0.508
 * Acc@1 45.450 Acc@5 71.240
 * Acc@1 0.108 Acc@5 0.532
 * Acc@1 24.512 Acc@5 46.460
 * Acc@1 0.114 Acc@5 0.524
 * Acc@1 48.718 Acc@5 73.980
 * Acc@1 0.116 Acc@5 0.534
 * Acc@1 44.496 Acc@5 70.116
 * Acc@1 0.126 Acc@5 0.556
 * Acc@1 45.574 Acc@5 70.988
 * Acc@1 0.112 Acc@5 0.560
 * Acc@1 46.466 Acc@5 72.250
 * Acc@1 0.114 Acc@5 0.548
 * Acc@1 48.764 Acc@5 74.648
 * Acc@1 0.114 Acc@5 0.564
 * Acc@1 50.228 Acc@5 75.884
 * Acc@1 0.120 Acc@5 0.608
 * Acc@1 49.590 Acc@5 75.208
 * Acc@1 0.130 Acc@5 0.586
 * Acc@1 41.764 Acc@5 67.562
 * Acc@1 0.130 Acc@5 0.598
 * Acc@1 50.360 Acc@5 76.004
 * Acc@1 0.128 Acc@5 0.600
 * Acc@1 51.812 Acc@5 77.184
 * Acc@1 0.132 Acc@5 0.574
 * Acc@1 44.828 Acc@5 70.184
 * Acc@1 0.126 Acc@5 0.568
 * Acc@1 52.424 Acc@5 77.644
 * Acc@1 0.096 Acc@5 0.532
 * Acc@1 50.702 Acc@5 76.002
 * Acc@1 0.086 Acc@5 0.514
 * Acc@1 50.656 Acc@5 75.650
 * Acc@1 0.092 Acc@5 0.496
 * Acc@1 50.154 Acc@5 75.644
 * Acc@1 0.114 Acc@5 0.478
 * Acc@1 49.356 Acc@5 75.020
 * Acc@1 0.104 Acc@5 0.488
 * Acc@1 50.056 Acc@5 74.896
 * Acc@1 0.104 Acc@5 0.506
 * Acc@1 50.690 Acc@5 76.010
 * Acc@1 0.102 Acc@5 0.504
 * Acc@1 52.908 Acc@5 77.936
 * Acc@1 0.100 Acc@5 0.488
 * Acc@1 52.276 Acc@5 77.336
 * Acc@1 0.094 Acc@5 0.476
 * Acc@1 53.676 Acc@5 78.296
 * Acc@1 0.096 Acc@5 0.468
 * Acc@1 51.862 Acc@5 76.756
 * Acc@1 0.100 Acc@5 0.468
 * Acc@1 55.424 Acc@5 79.630
 * Acc@1 0.096 Acc@5 0.458
 * Acc@1 54.006 Acc@5 78.668
 * Acc@1 0.094 Acc@5 0.482
 * Acc@1 54.586 Acc@5 78.948
 * Acc@1 0.094 Acc@5 0.514
 * Acc@1 55.012 Acc@5 79.680
 * Acc@1 0.108 Acc@5 0.558
 * Acc@1 54.222 Acc@5 79.086
 * Acc@1 0.106 Acc@5 0.580
 * Acc@1 51.640 Acc@5 76.638
 * Acc@1 0.098 Acc@5 0.514
 * Acc@1 54.524 Acc@5 78.906
 * Acc@1 0.104 Acc@5 0.530
 * Acc@1 51.532 Acc@5 76.740
 * Acc@1 0.100 Acc@5 0.524
 * Acc@1 55.182 Acc@5 79.360
 * Acc@1 0.100 Acc@5 0.524
 * Acc@1 54.732 Acc@5 79.396
 * Acc@1 0.100 Acc@5 0.528
 * Acc@1 56.076 Acc@5 80.070
 * Acc@1 0.104 Acc@5 0.538
 * Acc@1 56.018 Acc@5 80.564
 * Acc@1 0.104 Acc@5 0.510
 * Acc@1 55.942 Acc@5 80.160
 * Acc@1 0.104 Acc@5 0.504
 * Acc@1 54.952 Acc@5 79.432
 * Acc@1 0.104 Acc@5 0.534
 * Acc@1 57.650 Acc@5 81.418
 * Acc@1 0.104 Acc@5 0.484
 * Acc@1 58.104 Acc@5 82.006
 * Acc@1 0.102 Acc@5 0.516
 * Acc@1 56.736 Acc@5 81.012
 * Acc@1 0.104 Acc@5 0.506
 * Acc@1 57.782 Acc@5 81.420
 * Acc@1 0.104 Acc@5 0.496
 * Acc@1 57.052 Acc@5 81.002
 * Acc@1 0.104 Acc@5 0.500
 * Acc@1 58.086 Acc@5 81.590
 * Acc@1 0.104 Acc@5 0.526
 * Acc@1 58.820 Acc@5 82.362
 * Acc@1 0.102 Acc@5 0.538
 * Acc@1 57.266 Acc@5 80.918
 * Acc@1 0.102 Acc@5 0.540
 * Acc@1 58.036 Acc@5 81.586
 * Acc@1 0.100 Acc@5 0.546
 * Acc@1 57.584 Acc@5 81.694
 * Acc@1 0.102 Acc@5 0.504
 * Acc@1 58.902 Acc@5 82.552
 * Acc@1 0.098 Acc@5 0.486
 * Acc@1 59.196 Acc@5 82.556
 * Acc@1 0.106 Acc@5 0.496
 * Acc@1 57.190 Acc@5 81.062
 * Acc@1 0.100 Acc@5 0.526
 * Acc@1 60.208 Acc@5 83.392
 * Acc@1 0.098 Acc@5 0.524
 * Acc@1 57.600 Acc@5 81.580
 * Acc@1 0.100 Acc@5 0.516
 * Acc@1 58.390 Acc@5 81.842
 * Acc@1 0.110 Acc@5 0.516
 * Acc@1 59.280 Acc@5 82.512
 * Acc@1 0.112 Acc@5 0.492
 * Acc@1 59.002 Acc@5 82.406
 * Acc@1 0.102 Acc@5 0.470
 * Acc@1 60.408 Acc@5 83.112
 * Acc@1 0.106 Acc@5 0.478
 * Acc@1 58.576 Acc@5 82.130
 * Acc@1 0.112 Acc@5 0.476
 * Acc@1 61.234 Acc@5 84.058
 * Acc@1 0.108 Acc@5 0.486
 * Acc@1 59.014 Acc@5 82.578
 * Acc@1 0.120 Acc@5 0.498
 * Acc@1 60.532 Acc@5 83.126
 * Acc@1 0.122 Acc@5 0.506
 * Acc@1 59.526 Acc@5 82.596
 * Acc@1 0.112 Acc@5 0.514
 * Acc@1 59.126 Acc@5 82.430
 * Acc@1 0.110 Acc@5 0.524
 * Acc@1 61.422 Acc@5 84.312
 * Acc@1 0.114 Acc@5 0.496
 * Acc@1 62.744 Acc@5 85.194
 * Acc@1 0.126 Acc@5 0.508
 * Acc@1 61.024 Acc@5 83.788
 * Acc@1 0.106 Acc@5 0.512
 * Acc@1 62.234 Acc@5 84.768
 * Acc@1 0.102 Acc@5 0.480
 * Acc@1 59.128 Acc@5 82.006
 * Acc@1 0.110 Acc@5 0.520

Every other line is the EMA model, whose accuracy stays around 0.1 Acc@1 and 0.5 Acc@5 for the whole run, so it looks essentially random. This was run on top of the current main branch. It is worth investigating whether this is related to GPUs, parallelization, etc.
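To compare the two series more easily, the interleaved output can be split into regular-model and EMA-model rows. This is a rough sketch, assuming the rows strictly alternate with the regular model first; the helper name `split_interleaved` and the short `sample` string (the first four rows of the log above) are only for illustration.

```python
import re

# Matches lines such as " * Acc@1 5.300 Acc@5 15.888".
ACC_LINE = re.compile(r"\*\s*Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)")


def split_interleaved(log_text):
    """Return (model_rows, ema_rows), each a list of (acc1, acc5) tuples.

    Assumes rows alternate: regular model first, EMA model second.
    """
    rows = [(float(m.group(1)), float(m.group(2))) for m in ACC_LINE.finditer(log_text)]
    return rows[0::2], rows[1::2]


# First four rows of the grep output above.
sample = """\
 * Acc@1 5.300 Acc@5 15.888
 * Acc@1 0.126 Acc@5 0.490
 * Acc@1 16.598 Acc@5 37.364
 * Acc@1 0.120 Acc@5 0.496
"""

model_rows, ema_rows = split_interleaved(sample)
print("model best Acc@1:", max(acc1 for acc1, _ in model_rows))
print("EMA best Acc@1:", max(acc1 for acc1, _ in ema_rows))
```

Running the same function on the full log file would show whether the EMA rows ever move away from the flat values seen here.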

@NicolasHug
Member

> Verify that the EMA added to the classification script in #4381 works as expected.

Sorry if I'm missing some context here. This might be a strange question, but I'm wondering: was #4381 merged before checking whether it worked properly? If so, was there a specific reason for that?
