I am performing accuracy validations for several pre-trained models using all 50k images of the ImageNet LSVRC-2012 validation dataset, and I find that for some models the validated accuracy differs significantly from the accuracy claimed in the documentation. Here are the data I collected.
| Model | PyTorch TOP-1 | PyTorch TOP-1 Validation | PyTorch TOP-1 Difference | PyTorch TOP-5 | PyTorch TOP-5 Validation | PyTorch TOP-5 Difference |
|---|---|---|---|---|---|---|
| ResNet 50 | 76.13% | 76.06% | 0.07% | 92.86% | 92.67% | 0.19% |
| ResNet 101 | 77.37% | 77.30% | 0.07% | 93.55% | 93.36% | 0.18% |
| ResNet 152 | 78.31% | 78.25% | 0.06% | 94.05% | 93.85% | 0.19% |
| EfficientNet B0 | 77.69% | 77.63% | 0.06% | 93.53% | 93.40% | 0.13% |
| EfficientNet B1 | 78.64% | 77.55% | 1.10% | 94.19% | 93.40% | 0.78% |
| EfficientNet B2 | 80.61% | 77.72% | 2.89% | 95.31% | 93.55% | 1.76% |
| EfficientNet B3 | 82.01% | 78.48% | 3.53% | 96.05% | 94.17% | 1.88% |
| EfficientNet B4 | 83.38% | 79.21% | 4.18% | 96.59% | 94.33% | 2.27% |
| EfficientNet B5 | 83.44% | 73.07% | 10.37% | 96.63% | 90.76% | 5.87% |
| EfficientNet B6 | 84.01% | 74.34% | 9.66% | 96.92% | 91.67% | 5.25% |
| EfficientNet B7 | 84.12% | 73.86% | 10.26% | 96.91% | 91.38% | 5.53% |
| Inception V3 | 77.29% | 69.47% | 7.82% | 93.45% | 88.48% | 4.97% |
| MobileNetV2 | 71.88% | 71.80% | 0.07% | 90.29% | 90.10% | 0.18% |
The results show that the validated accuracy for ResNet 50, 101, 152, and MobileNetV2 almost matches the accuracy claimed in the documentation, but the remaining models show varying degrees of discrepancy. The accuracy validation scripts are available at https://github.com/yqihao/PTMValidations. All validations are performed on Google Colab with the same ImageNet dataset (https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar).
The minor differences in the second decimal place are the result of batch padding done by the scripts on a multi-GPU setup. When the total batch size doesn't divide the dataset size exactly, the last batch is padded with additional repeated samples, and this can lead to minor deviations (you can read more in #4559). This is why, when we estimate the statistics of the models, we use batch-size=1 and gpus=1.
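As a back-of-the-envelope illustration (the batch configuration below is an assumed example, not necessarily the scripts' actual setting), padding to a full multiple of the effective batch means a handful of images get scored twice:

```python
# Hypothetical numbers: 50,000 validation images with an assumed effective
# batch of 8 GPUs x 32 images; the sampler pads to the next full multiple.
total_images = 50_000
effective_batch = 8 * 32  # 256, an assumed example
padded = -(-total_images // effective_batch) * effective_batch  # ceil -> 50,176
print(padded - total_images)  # 176 repeated images slightly skew the averages
```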
The big differences are the result of using incorrect preprocessing when verifying the accuracy of the models. EfficientNet B1-B7 and Inception V3 use larger input resolutions and different interpolation modes. Using the wrong preprocessing can cost up to 10 accuracy points, as shown in the sketch below.
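To make the failure mode concrete, here is a minimal sketch of the mismatch; the exact resize/crop sizes and interpolation mode per model are illustrative assumptions and should be taken from each model's documentation:

```python
import torchvision.transforms as T

# A standard 224px / bilinear eval pipeline: appropriate for the ResNets,
# which is why their numbers in the table match the documentation.
resnet_eval = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# The larger EfficientNets expect a much bigger input and bicubic
# interpolation (600px used here as an assumed example for B7); feeding
# them the 224px/bilinear pipeline silently costs several accuracy points.
efficientnet_b7_eval = T.Compose([
    T.Resize(600, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(600),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```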
To make it easier for users to apply the correct preprocessing, we've bundled the transforms with the pre-trained weights. See this blog post for more details. All the changes have now graduated from prototype and are in TorchVision on the main branch; we expect to release them in the next version.
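A minimal sketch of what using the bundled transforms looks like, assuming the multi-weight API currently on main (enum and method names may differ slightly in the final release):

```python
import torch
from torchvision.io import read_image
from torchvision.models import efficientnet_b7, EfficientNet_B7_Weights

weights = EfficientNet_B7_Weights.IMAGENET1K_V1
model = efficientnet_b7(weights=weights).eval()

# The weights carry the exact eval-time preprocessing used during training,
# so the correct resize/crop/interpolation is applied automatically.
preprocess = weights.transforms()

img = read_image("some_image.jpg")  # placeholder path
with torch.inference_mode():
    logits = model(preprocess(img).unsqueeze(0))
print(weights.meta["categories"][logits.argmax().item()])
```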
The recommended way to verify the accuracy of our models using TorchVision is to run the following command:
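A sketch of that command, following the pattern of the evaluation instructions in TorchVision's references/classification scripts (the model and weights names below are placeholders):

```
# Placeholder model/weights; substitute the model under test.
torchrun --nproc_per_node=1 train.py --model efficientnet_b7 --batch-size 1 --test-only --weights EfficientNet_B7_Weights.IMAGENET1K_V1
```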


Versions
```
PyTorch version: 1.11.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.22.4
Libc version: glibc-2.26
Python version: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Is CUDA available: True
CUDA runtime version: 11.1.105
GPU models and configuration: GPU 0: Tesla K80
Nvidia driver version: 460.32.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.11.0+cu113
[pip3] torchaudio==0.11.0+cu113
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.12.0
[pip3] torchvision==0.12.0+cu113
[conda] Could not collect
```