Add SWAG model weight that only the linear head is finetuned to ImageNet1K by YosuaMichael · Pull Request #5793 · pytorch/vision
subtask of #5708
Model Description
The trunk weights of this model come from the weakly supervised pre-training described in https://arxiv.org/pdf/2201.08371.pdf. Only the linear head is fine-tuned on the ImageNet1K dataset; the pre-trained trunk weights are kept frozen.
This model is suitable for users who want to fine-tune the pre-trained trunk on other downstream datasets.
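As a rough illustration of the frozen-trunk setup described above, the sketch below freezes a trunk and lets the optimizer update only a linear head. `TinyModel`, its `trunk` and `head` attributes, and the input shape are hypothetical stand-ins, not the torchvision models from this PR (in torchvision the RegNet head is, as far as I know, `fc` and the ViT head is `heads.head`).

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in: a feature trunk followed by a linear head."""

    def __init__(self, feature_dim: int = 16, num_classes: int = 10):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 8 * 8, feature_dim), nn.ReLU()
        )
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.trunk(x))


def freeze_trunk(model: TinyModel) -> list:
    """Disable gradients for the trunk and return the trainable head parameters."""
    for param in model.trunk.parameters():
        param.requires_grad = False
    return list(model.head.parameters())


torch.manual_seed(0)
model = TinyModel()
head_params = freeze_trunk(model)

# Snapshot the weights so we can check what a single training step changes.
trunk_before = [p.detach().clone() for p in model.trunk.parameters()]
head_before = [p.detach().clone() for p in model.head.parameters()]

# Only the head parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head_params, lr=0.1)
images = torch.randn(4, 3, 8, 8)      # fake image batch
labels = torch.randint(0, 10, (4,))   # fake labels
loss = nn.functional.cross_entropy(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

trunk_unchanged = all(
    torch.equal(a, b) for a, b in zip(trunk_before, model.trunk.parameters())
)
head_changed = any(
    not torch.equal(a, b) for a, b in zip(head_before, model.head.parameters())
)
print(trunk_unchanged, head_changed)
```

To fine-tune the trunk on a downstream dataset instead, one would leave `requires_grad` enabled on the trunk and typically replace the head with one sized for the new label set.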
Linear head fine-tuning parameters on ImageNet1K:
RegNet models (all sizes: 16gf, 32gf, 128gf):
- Num epochs: 28
- Trained on 1 node with 8 Volta GPUs (32 GB each)
- Batch size per GPU: 32
- Image size: 224
- SGD optimizer with params:
  - Weight decay: 0.001
  - Momentum: 0.9
  - Use Nesterov: True
- Learning rate params:
  - Scheduler: CosineAnnealingLR
  - Start value: 0.001
- Image augmentation transforms:
  - RandomResizedCrop of size 224 with interpolation 3
  - RandomHorizontalFlip
  - Normalize
- Note: trained with PyTorch mixed precision
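To make the learning-rate setting above concrete, here is a minimal sketch of the closed-form cosine annealing curve that PyTorch's `CosineAnnealingLR` follows. The start value (0.001) and epoch count (28) come from the list above; stepping the scheduler once per epoch and annealing down to a minimum of 0 (the PyTorch default for `eta_min`) are assumptions, since the PR does not state them.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr, min_lr=0.0):
    """Closed-form cosine annealing: base_lr at step 0, min_lr at total_steps."""
    cosine = (1 + math.cos(math.pi * step / total_steps)) / 2
    return min_lr + (base_lr - min_lr) * cosine


# Values from the RegNet list; one scheduler step per epoch is an assumption.
NUM_EPOCHS = 28
BASE_LR = 0.001

schedule = [
    cosine_annealing_lr(epoch, NUM_EPOCHS, BASE_LR)
    for epoch in range(NUM_EPOCHS + 1)
]
for epoch in (0, 7, 14, 21, 28):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the rate starts at 0.001, passes through half that value at the midpoint, and decays to 0 at the final epoch.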
Vision Transformer (all sizes: b/16, l/16, h/14):
- Num epochs: 28
- Trained on 4 nodes with 8 Volta GPUs (32 GB each) per node
- Batch size per GPU: 32
- Image size: 224
- SGD optimizer with params:
  - Weight decay: 1e-9
  - Momentum: 0.9
  - Use Nesterov: True
- Learning rate params:
  - Scheduler: CosineAnnealingLR
  - Start value: 0.04
- Image augmentation transforms:
  - RandomResizedCrop of size 224 with interpolation 3
  - RandomHorizontalFlip
  - Normalize
- Note: trained with PyTorch mixed precision
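The two configurations differ in node count, which changes the effective batch size per optimizer step. The short sketch below works this out, assuming plain data-parallel training in which every GPU processes its own per-GPU batch and there is no gradient accumulation (neither is stated in the PR). `RunConfig` is a hypothetical helper introduced only for this calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the hardware settings listed above."""

    nodes: int
    gpus_per_node: int
    batch_size_per_gpu: int

    @property
    def global_batch_size(self) -> int:
        # Assumes data parallelism with no gradient accumulation.
        return self.nodes * self.gpus_per_node * self.batch_size_per_gpu


regnet = RunConfig(nodes=1, gpus_per_node=8, batch_size_per_gpu=32)
vit = RunConfig(nodes=4, gpus_per_node=8, batch_size_per_gpu=32)

print("RegNet global batch size:", regnet.global_batch_size)  # 256
print("ViT global batch size:", vit.global_batch_size)        # 1024
```

Under that assumption the ViT runs use a global batch four times larger than the RegNet runs, which may be part of why their starting learning rate is higher.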
Validation script and results
## RegNet_Y_16GF
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model regnet_y_16gf --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="RegNet_Y_16GF_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 83.976 Acc@5 97.244
## RegNet_Y_32GF
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model regnet_y_32gf --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="RegNet_Y_32GF_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 84.622 Acc@5 97.480
## RegNet_Y_128GF
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model regnet_y_128gf --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="RegNet_Y_128GF_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 86.068 Acc@5 97.844
## ViT_B_16
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model vit_b_16 --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="ViT_B_16_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 81.886 Acc@5 96.180
## ViT_L_16
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model vit_l_16 --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="ViT_L_16_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 85.146 Acc@5 97.422
## ViT_H_14
python -u ~/script/run_with_submitit.py --timeout 3000 --ngpus 1 --nodes 1 --partition train --model vit_h_14 --data-path="/datasets01_ontap/imagenet_full_size/061417" --test-only --batch-size=1 --weights="ViT_H_14_Weights.IMAGENET1K_SWAG_LINEAR_V1"
# Acc@1 85.708 Acc@5 97.730
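For readers unfamiliar with the `Acc@1` and `Acc@5` figures reported above, the following is a minimal pure-Python sketch of how top-k accuracy is usually computed. It is not the evaluation code used by the reference script, and the scores and targets are made-up toy values, not model output.

```python
def topk_accuracy(scores, targets, k):
    """Percentage of samples whose target is among the k highest-scoring classes."""
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


# Toy scores for 4 samples over 6 classes (illustrative only).
scores = [
    [0.10, 0.70, 0.05, 0.05, 0.05, 0.05],  # target 1 is ranked first
    [0.30, 0.20, 0.25, 0.10, 0.10, 0.05],  # target 2 is ranked second
    [0.05, 0.10, 0.15, 0.20, 0.20, 0.30],  # target 0 is ranked last
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],  # target 0 is ranked first
]
targets = [1, 2, 0, 0]

acc1 = topk_accuracy(scores, targets, k=1)
acc5 = topk_accuracy(scores, targets, k=5)
print(f"Acc@1 {acc1:.3f} Acc@5 {acc5:.3f}")
```

`Acc@5` is always at least as high as `Acc@1`, because a prediction that is correct at rank 1 is also within the top 5.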
Sample script to load model
from torchvision.models.vision_transformer import vit_b_16, ViT_B_16_Weights
m = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_SWAG_LINEAR_V1)