[BC-breaking] Fix initialisation bug on FeaturePyramidNetwork #2954
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #2954     +/-  ##
=========================================
+ Coverage   73.41%   73.44%   +0.03%
=========================================
  Files          99       99
  Lines        8812     8812
  Branches     1389     1389
=========================================
+ Hits         6469     6472       +3
+ Misses       1917     1915       -2
+ Partials      426      425       -1
=========================================
Thanks a lot for the PR and the investigation, Vasilis!
I agree with all your points; I don't think it's necessary for now to run more tests for keypoint rcnn.
We should at some point consider uploading newer weights, though, but we will need to handle versioning first.
* Change children() to modules() to ensure init happens in all blocks.
* Update expected values of all detection models.
* Revert "Update expected values of all detection models." (reverts commit 050b64a)
* Update expected values.
Fixes #2326
The FeaturePyramidNetwork is intended to initialize the weights of all its Conv2d layers using a specific distribution (see vision/torchvision/ops/feature_pyramid_network.py, lines 89 to 92 at 971c3e4).
Unfortunately, due to the use of children() instead of modules(), we only initialize the direct blocks of the FPN and not those of its nested modules. As a result, the nested modules fall back to PyTorch's default initialization.
This PR changes children() to modules() to ensure that the weight initialization is consistent across all nested blocks of FeaturePyramidNetwork. Though trivial in terms of code, the change has side-effects:
The first issue is addressed in two separate PRs; see #2965 and #2966 for more information. To address the second, I retrained the detection models on master and on this branch and compared their stats. Overall, fixing the bug and applying the initialization correctly in all blocks leads to small improvements for all models except keypointrcnn.
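The difference between children() and modules() can be illustrated with a minimal, torch-free sketch. Module, Conv2d, and Block below are simplified stand-ins for the real torch.nn classes, not the actual FPN code:

```python
# Simplified stand-ins for torch.nn classes (no torch dependency):
# children() yields direct submodules only, modules() recurses.
class Module:
    def __init__(self):
        self._children = []

    def add(self, child):
        self._children.append(child)
        return child

    def children(self):
        # Direct children only, mirroring nn.Module.children()
        return iter(self._children)

    def modules(self):
        # Self plus all descendants, mirroring nn.Module.modules()
        yield self
        for child in self._children:
            yield from child.modules()

class Conv2d(Module):
    def __init__(self):
        super().__init__()
        self.initialized = False

class Block(Module):
    """A nested block holding an inner conv, like the FPN's nested blocks."""
    def __init__(self):
        super().__init__()
        self.conv = self.add(Conv2d())

fpn = Module()
direct_conv = fpn.add(Conv2d())  # direct child: reached either way
block = fpn.add(Block())         # its conv is nested one level down

# Buggy loop: children() never descends into Block.
for m in fpn.children():
    if isinstance(m, Conv2d):
        m.initialized = True
missed = not block.conv.initialized  # True: the nested conv was skipped

# Fixed loop: modules() visits every nested Conv2d.
for m in fpn.modules():
    if isinstance(m, Conv2d):
        m.initialized = True
print(direct_conv.initialized, missed, block.conv.initialized)  # → True True True
```

In the real code, the nested convs that children() misses keep PyTorch's default init instead of the intended one, which is exactly the bug this PR fixes.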
Master
fasterrcnn_resnet50_fpn: boxAP=37.6
retinanet_resnet50_fpn: boxAP=36.3
maskrcnn_resnet50_fpn: boxAP=38.5, maskAP=34.5
keypointrcnn_resnet50_fpn: boxAP=55.7, kpAP=65.1
Branch
fasterrcnn_resnet50_fpn: boxAP=38.0
retinanet_resnet50_fpn: boxAP=36.4
maskrcnn_resnet50_fpn: boxAP=38.7, maskAP=34.7
keypointrcnn_resnet50_fpn: boxAP=55.5, kpAP=65.1
The slight deterioration in keypointrcnn box AP (55.7 to 55.5) might be because of:
To fully investigate the situation, we would need to do multiple runs using different seeds for each initialization scheme and then conduct a statistical test to assess whether our change has a statistically significant effect on accuracy. If the change turns out to have statistically significant negative effects, we might want to support different init schemes per model.
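As a sketch of what such a test could look like, here is Welch's two-sample t statistic in plain Python; the AP lists below are illustrative placeholders, not real measurements:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom
    (Welch-Satterthwaite), for samples with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical keypointrcnn boxAP values from repeated seeded runs
# (illustrative placeholders, not real benchmark data).
master_runs = [55.7, 55.6, 55.8, 55.5, 55.7]
branch_runs = [55.5, 55.6, 55.4, 55.5, 55.6]

t, df = welch_t(master_runs, branch_runs)
# Compare t against the critical value for df (or compute a p-value,
# e.g. with scipy.stats) to decide whether the observed boxAP gap is
# statistically significant rather than seed noise.
```

In practice one would feed the statistic and degrees of freedom into a t-distribution (e.g. via scipy.stats) to obtain a p-value before drawing conclusions.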
I believe that the above is not worth the effort because:
Based on the above, I recommend merging the change into master.