Commit e6d82f7

Adding EfficientNetV2 architecture (#5450)
* Extend the EfficientNet class to support v1 and v2.
* Refactor config/builder methods and add prototype builders.
* Refactor weight info.
* Update dropouts based on TF config ref.
* Update BN eps on TF base_config.
* Use Conv2dNormActivation.
* Add pre-trained weights for EfficientNetV2-s.
* Add Medium and Large weights.
* Update stats with single batch run.
* Add accuracies in the docs.
1 parent a2b7075 commit e6d82f7

File tree

8 files changed: +507 additions, −115 deletions

docs/source/models.rst

Lines changed: 13 additions & 1 deletion

@@ -38,7 +38,7 @@ architectures for image classification:
 - `ResNeXt`_
 - `Wide ResNet`_
 - `MNASNet`_
-- `EfficientNet`_
+- `EfficientNet`_ v1 & v2
 - `RegNet`_
 - `VisionTransformer`_
 - `ConvNeXt`_

@@ -70,6 +70,9 @@ You can construct a model with random weights by calling its constructor:
     efficientnet_b5 = models.efficientnet_b5()
     efficientnet_b6 = models.efficientnet_b6()
     efficientnet_b7 = models.efficientnet_b7()
+    efficientnet_v2_s = models.efficientnet_v2_s()
+    efficientnet_v2_m = models.efficientnet_v2_m()
+    efficientnet_v2_l = models.efficientnet_v2_l()
     regnet_y_400mf = models.regnet_y_400mf()
     regnet_y_800mf = models.regnet_y_800mf()
     regnet_y_1_6gf = models.regnet_y_1_6gf()

@@ -122,6 +125,9 @@ These can be constructed by passing ``pretrained=True``:
     efficientnet_b5 = models.efficientnet_b5(pretrained=True)
     efficientnet_b6 = models.efficientnet_b6(pretrained=True)
     efficientnet_b7 = models.efficientnet_b7(pretrained=True)
+    efficientnet_v2_s = models.efficientnet_v2_s(pretrained=True)
+    efficientnet_v2_m = models.efficientnet_v2_m(pretrained=True)
+    efficientnet_v2_l = models.efficientnet_v2_l(pretrained=True)
     regnet_y_400mf = models.regnet_y_400mf(pretrained=True)
     regnet_y_800mf = models.regnet_y_800mf(pretrained=True)
     regnet_y_1_6gf = models.regnet_y_1_6gf(pretrained=True)

@@ -238,6 +244,9 @@ EfficientNet-B4 83.384 96.594
 EfficientNet-B5 83.444 96.628
 EfficientNet-B6 84.008 96.916
 EfficientNet-B7 84.122 96.908
+EfficientNetV2-s 84.228 96.878
+EfficientNetV2-m 85.112 97.156
+EfficientNetV2-l 85.810 97.792
 regnet_x_400mf 72.834 90.950
 regnet_x_800mf 75.212 92.348
 regnet_x_1_6gf 77.040 93.440

@@ -439,6 +448,9 @@ EfficientNet
     efficientnet_b5
     efficientnet_b6
     efficientnet_b7
+    efficientnet_v2_s
+    efficientnet_v2_m
+    efficientnet_v2_l
 
 RegNet
 ------------

hubconf.py

Lines changed: 3 additions & 0 deletions

@@ -13,6 +13,9 @@
     efficientnet_b5,
     efficientnet_b6,
     efficientnet_b7,
+    efficientnet_v2_s,
+    efficientnet_v2_m,
+    efficientnet_v2_l,
 )
 from torchvision.models.googlenet import googlenet
 from torchvision.models.inception import inception_v3

references/classification/README.md

Lines changed: 21 additions & 1 deletion

@@ -88,7 +88,7 @@ Then we averaged the parameters of the last 3 checkpoints that improved the Acc@
 and [#3354](https://github.com/pytorch/vision/pull/3354) for details.
 
 
-### EfficientNet
+### EfficientNet-V1
 
 The weights of the B0-B4 variants are ported from Ross Wightman's [timm repo](https://github.com/rwightman/pytorch-image-models/blob/01cb46a9a50e3ba4be167965b5764e9702f09b30/timm/models/efficientnet.py#L95-L108).
 
@@ -114,6 +114,26 @@ torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --interpolation bic
 --val-resize-size 600 --val-crop-size 600 --train-crop-size 600 --test-only --pretrained
 ```
 
+
+### EfficientNet-V2
+```
+torchrun --nproc_per_node=8 train.py \
+--model $MODEL --batch-size 128 --lr 0.5 --lr-scheduler cosineannealinglr \
+--lr-warmup-epochs 5 --lr-warmup-method linear --auto-augment ta_wide --epochs 600 --random-erase 0.1 \
+--label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0 --weight-decay 0.00002 --norm-weight-decay 0.0 \
+--train-crop-size $TRAIN_SIZE --model-ema --val-crop-size $EVAL_SIZE --val-resize-size $EVAL_SIZE \
+--ra-sampler --ra-reps 4
+```
+Here `$MODEL` is one of `efficientnet_v2_s` and `efficientnet_v2_m`.
+Note that the Small variant used a `$TRAIN_SIZE` of `300` and an `$EVAL_SIZE` of `384`, while the Medium variant used `384` and `480` respectively.
+
+Note that the above command corresponds to training on a single node with 8 GPUs.
+For generating the pre-trained weights, we trained with 4 nodes, each with 8 GPUs (for a total of 32 GPUs),
+and `--batch-size 32`.
+
+The weights of the Large variant are ported from the original paper rather than trained from scratch. See the `EfficientNet_V2_L_Weights` entry for their exact preprocessing transforms.
+
+
 ### RegNet
 
 #### Small models
3 binary files changed (contents not shown).

0 commit comments
