
EfficientNet B1 to B7 and InceptionV3 Pre-Trained Models Accuracy Discrepancies #5958


Closed
yqihao opened this issue May 6, 2022 · 2 comments
yqihao commented May 6, 2022

🐛 Describe the bug

I am validating the accuracy of several pre-trained models on all 50k images of the ImageNet ILSVRC-2012 validation set, and I find that for some models the validated accuracy differs significantly from the accuracy claimed in the documentation. Here are the data I collected, along with the plots.

| Model | PyTorch Top-1 (docs) | Validated Top-1 | Top-1 Difference | PyTorch Top-5 (docs) | Validated Top-5 | Top-5 Difference |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet 50 | 76.13% | 76.06% | 0.07% | 92.86% | 92.67% | 0.19% |
| ResNet 101 | 77.37% | 77.30% | 0.07% | 93.55% | 93.36% | 0.18% |
| ResNet 152 | 78.31% | 78.25% | 0.06% | 94.05% | 93.85% | 0.19% |
| EfficientNet B0 | 77.69% | 77.63% | 0.06% | 93.53% | 93.40% | 0.13% |
| EfficientNet B1 | 78.64% | 77.55% | 1.10% | 94.19% | 93.40% | 0.78% |
| EfficientNet B2 | 80.61% | 77.72% | 2.89% | 95.31% | 93.55% | 1.76% |
| EfficientNet B3 | 82.01% | 78.48% | 3.53% | 96.05% | 94.17% | 1.88% |
| EfficientNet B4 | 83.38% | 79.21% | 4.18% | 96.59% | 94.33% | 2.27% |
| EfficientNet B5 | 83.44% | 73.07% | 10.37% | 96.63% | 90.76% | 5.87% |
| EfficientNet B6 | 84.01% | 74.34% | 9.66% | 96.92% | 91.67% | 5.25% |
| EfficientNet B7 | 84.12% | 73.86% | 10.26% | 96.91% | 91.38% | 5.53% |
| Inception V3 | 77.29% | 69.47% | 7.82% | 93.45% | 88.48% | 4.97% |
| MobileNetV2 | 71.88% | 71.80% | 0.07% | 90.29% | 90.10% | 0.18% |

The results show that the validated accuracies for ResNet 50/101/152 and MobileNetV2 are almost identical to the documented values, while the rest show varying degrees of discrepancy. The validation scripts are available at https://github.com/yqihao/PTMValidations. All validations were performed on Google Colab with the same ImageNet dataset (https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar).
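For reference, the Top-1/Top-5 metric such validation scripts compute can be sketched in plain Python (the function name and data shapes here are illustrative, not taken from the linked repo):

```python
def topk_accuracy(scores, labels, ks=(1, 5)):
    """Fraction of samples whose true label is among the k highest-scoring
    classes. `scores` is a list of per-class score lists, `labels` the
    ground-truth class indices."""
    hits = {k: 0 for k in ks}
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        for k in ks:
            if label in ranked[:k]:
                hits[k] += 1
    return {k: hits[k] / len(labels) for k in ks}
```

For example, `topk_accuracy([[0.1, 0.7, 0.2]], [1], ks=(1,))` counts the single sample as a top-1 hit and returns `{1: 1.0}`.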
(Charts: documented vs. validated Top-1 and Top-5 accuracy for each model.)

Versions

PyTorch version: 1.11.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.22.4
Libc version: glibc-2.26

Python version: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Is CUDA available: True
CUDA runtime version: 11.1.105
GPU models and configuration: GPU 0: Tesla K80
Nvidia driver version: 460.32.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.11.0+cu113
[pip3] torchaudio==0.11.0+cu113
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.12.0
[pip3] torchvision==0.12.0+cu113
[conda] Could not collect

datumbox commented May 9, 2022

@yqihao Thanks for reporting.

The minor differences in the second decimal are the result of batch padding done by the scripts in multi-GPU setups. When the total batch size doesn't divide the dataset size exactly, the dataset is padded with repeated samples, which can lead to small deviations (you can read more at #4559). This is why, when we estimate the statistics of the models, we use batch-size=1 and a single GPU.

The big differences are the result of using incorrect preprocessing to verify the accuracy of the models. EfficientNet B1–B7 and InceptionV3 use larger input resolutions and different interpolation modes than the default pipeline, so using the wrong preprocessing can cost up to 10 accuracy points.
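To illustrate how much the eval-time preprocessing varies per model, here is a rough sketch of the (resize, center-crop, interpolation) settings. The InceptionV3 values match the preset quoted later in this thread; the others are quoted from memory of torchvision's weight metadata and should be double-checked against your installed version:

```python
# (resize shorter side, center-crop size, interpolation) used at eval time.
# Illustrative values -- consult the weight metadata in your torchvision
# version for the authoritative numbers.
EVAL_PREPROCESS = {
    "resnet50":        (256, 224, "bilinear"),
    "efficientnet_b1": (256, 240, "bicubic"),
    "efficientnet_b7": (600, 600, "bicubic"),
    "inception_v3":    (342, 299, "bilinear"),
}

def describe(model):
    resize, crop, interp = EVAL_PREPROCESS[model]
    return f"{model}: resize {resize}, crop {crop}, {interp}"
```

Feeding a 224-pixel crop to a model trained and evaluated at 600 pixels is exactly the kind of mismatch that produces the ~10-point drops in the table above.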

To make it easier for users to apply the correct preprocessing, we've bundled the transforms with the pre-trained weights. See this blog post for more details. All the changes have now graduated from prototype and are on TorchVision's main branch; we expect to release them in the next version.

The recommended way to verify the accuracy of our models with TorchVision is to run the following command:

```shell
torchrun --nproc_per_node=1 train.py --test-only --weights EfficientNet_B1_Weights.IMAGENET1K_V1 --model efficientnet_b1 -b 1
```

I'm going to close this issue, as I believe the above addresses your concern, but if you keep facing problems feel free to reopen.

datumbox closed this as completed May 9, 2022
Pong-97 commented Mar 17, 2023

Due to environment problems, I can't use torchvision.prototype. But you can refer to this preprocessing:

```python
transforms=partial(ImageClassification, crop_size=299, resize_size=342),
```
