🌄 View-Robust Backbone and Discriminative Reconstruction for Few-Shot Fine-Grained Image Classification
⭐ If you find our code useful, please consider starring this repository!
We study few-shot fine-grained image classification, a task that faces two key challenges: (1) the scarcity of labeled samples amplifies the model's sensitivity to viewpoint variations, resulting in inconsistent features, and (2) reconstruction-based methods, while improving inter-class separability, inadvertently introduce intra-class variations that further complicate discrimination. To address these challenges, we propose the View-Robust Attention Selector (VRAS), a feature-enhancement backbone designed to mitigate viewpoint-induced misclassification. By integrating cross-scale feature interaction with an adaptive selection mechanism, VRAS reduces the spatial sensitivity caused by the limited viewpoint diversity of few-shot support sets, preserving intra-class consistency while enhancing inter-class discriminability. We further introduce the Enhancement and Reconstruction (ER) module to strengthen discriminative learning: ER maximizes inter-class divergence and intra-class compactness through a regularized ridge-regression optimization strategy, and by dynamically suppressing low-saliency dimensions it maintains geometric coherence and filters out semantic noise. Extensive experiments on three fine-grained datasets show that our method significantly outperforms state-of-the-art few-shot classification methods.
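The reconstruction step at the heart of the ER module relies on regularized ridge regression, which admits a closed-form solution. Below is a minimal, FRN-style sketch of this kind of reconstruction-based classification; the tensor shapes, the `lam` regularizer value, and the function name `ridge_reconstruct` are illustrative assumptions rather than this repository's actual API.

```python
import torch

def ridge_reconstruct(support, query, lam=0.1):
    """Reconstruct query features from one class's support features via ridge regression.

    support: (r, d) support feature vectors for one class
             (r = spatial positions x shots, d = channel dim).
    query:   (m, d) query feature vectors.
    Solves   min_W ||query - W @ support||^2 + lam * ||W||^2,
    closed form:  W = query @ support.T @ (support @ support.T + lam * I)^{-1}.
    """
    r = support.size(0)
    gram = support @ support.t() + lam * torch.eye(r, device=support.device)
    weights = query @ support.t() @ torch.linalg.inv(gram)  # (m, r)
    return weights @ support  # (m, d) reconstructed query features

# Toy usage: classify a query by its reconstruction error under each class.
support_a = torch.randn(25, 64)   # hypothetical class-A support pool
support_b = torch.randn(25, 64)   # hypothetical class-B support pool
query = torch.randn(36, 64)       # hypothetical query feature map (6x6, 64 ch)

err_a = (query - ridge_reconstruct(support_a, query)).pow(2).sum()
err_b = (query - ridge_reconstruct(support_b, query)).pow(2).sum()
pred = "A" if err_a < err_b else "B"
```

The key property is that a class with well-matched support features reconstructs the query with lower residual error, so the error itself serves as a (negative) classification score.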
🔧 This repository provides implementations of two VRAS backbone variants:
- 🧩 VRAS-Conv-4 – a lightweight convolutional baseline.
- 🧱 VRAS-ResNet-12 – a deeper architecture with stronger representational capacity.
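For intuition, here is a hypothetical sketch of the cross-scale feature interaction and adaptive selection idea described in the abstract: two feature maps at different resolutions are fused with learned per-scale gates. The module name, gating design, and shapes are our assumptions and do not reflect this repository's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleSelect(nn.Module):
    """Toy cross-scale fusion: upsample the coarse map, then adaptively
    weight the two scales with a softmax gate computed from pooled features."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Linear(2 * channels, 2)  # one logit per scale

    def forward(self, fine, coarse):
        # fine: (B, C, H, W); coarse: (B, C, H/2, W/2)
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:],
                                  mode="bilinear", align_corners=False)
        pooled = torch.cat([fine.mean(dim=(2, 3)),
                            coarse_up.mean(dim=(2, 3))], dim=1)  # (B, 2C)
        w = self.gate(pooled).softmax(dim=1)  # (B, 2) adaptive scale weights
        return (w[:, 0, None, None, None] * fine
                + w[:, 1, None, None, None] * coarse_up)

# Toy usage:
fuse = CrossScaleSelect(channels=64)
out = fuse(torch.randn(2, 64, 20, 20), torch.randn(2, 64, 10, 10))  # (2, 64, 20, 20)
```

Letting the gate re-weight scales per image is one simple way to damp viewpoint-dependent spatial responses while keeping the more stable scale dominant.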
📦 To get started, create and activate the Conda environment using the provided YAML file:
```bash
conda env create -f environment.yml
conda activate VRAS
```
✅ Environment ready!
🎯 Run the training and evaluation scripts on different datasets:
```bash
# CUB
python experiments/CUB/VRAS-Conv-4/train.py       # VRAS-Conv-4
python experiments/CUB/VRAS-ResNet-12/train.py    # VRAS-ResNet-12

# cars
python experiments/cars/VRAS-Conv-4/train.py      # VRAS-Conv-4
python experiments/cars/VRAS-ResNet-12/train.py   # VRAS-ResNet-12

# dogs
python experiments/dogs/VRAS-Conv-4/train.py      # VRAS-Conv-4
python experiments/dogs/VRAS-ResNet-12/train.py   # VRAS-ResNet-12
```
📊 Table 1: Comparison with state-of-the-art methods.
📊 Table 3: Viewpoint-robustness ablation study.
Special thanks to the open-source community, especially the authors of FRN, whose work inspired parts of this project. 💡