Add shufflenetv2 1.5 and 2.0 weights by YosuaMichael · Pull Request #5906 · pytorch/vision
Resolve #3257
We train the models with a recipe similar to the improved training recipe. Here are the commands:
# Choose the model
MODEL=shufflenet_v2_x1_5
# MODEL=shufflenet_v2_x2_0
# Training command
python \
-u ~/script/run_with_submitit.py \
--timeout 3000 --ngpus 8 --nodes 1 --batch-size=128 \
--partition train --model $MODEL \
--data-path="/datasets01_ontap/imagenet_full_size/061417" \
--lr=0.5 --lr-scheduler=cosineannealinglr --lr-warmup-epochs=5 --lr-warmup-method=linear \
--auto-augment=ta_wide --epochs=600 --random-erase=0.1 --weight-decay=0.00002 \
--norm-weight-decay=0.0 --label-smoothing=0.1 --mixup-alpha=0.2 --cutmix-alpha=1.0 \
--train-crop-size=176 --model-ema --val-resize-size=232 --ra-sampler --ra-reps=4
Once the training finished, we take the checkpoint of the epoch with the highest Acc@1 accuracy: epoch 595 for shufflenet_v2_x2_0 and epoch 594 for shufflenet_v2_x1_5, using the non-EMA weights for both. Then we re-test the checkpoints with 1 GPU and batch_size=1; the commands and results follow the checkpoint-loading sketch below.
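For reference, this is a minimal sketch of pulling the non-EMA weights out of a saved checkpoint; it assumes the checkpoint layout written by the reference classification training script (plain weights under "model", EMA weights under "model_ema"), and the path is hypothetical:
import torch
# Hypothetical path; the reference script writes one checkpoint per epoch.
ckpt = torch.load("model_594.pth", map_location="cpu")
state_dict = ckpt["model"]  # non-EMA weights, the ones we ship here
# ckpt["model_ema"] holds the EMA copy, which we do not use for these models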
# For shufflenetv2_x1_5
python -u ~/script/run_with_submitit.py \
--timeout 3000 --nodes 1 --ngpus 1 --batch-size=1 \
--partition train --model shufflenet_v2_x1_5 \
--data-path="/datasets01_ontap/imagenet_full_size/061417" \
--weights="ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1" \
--test-only
# Test: Acc@1 72.996 Acc@5 91.086
# For shufflenetv2_x2_0
python -u ~/script/run_with_submitit.py \
--timeout 3000 --nodes 1 --ngpus 1 --batch-size=1 \
--partition train --model shufflenet_v2_x2_0 \
--data-path="/datasets01_ontap/imagenet_full_size/061417" \
--weights="ShuffleNet_V2_X2_0_Weights.IMAGENET1K_V1" \
--test-only
# Test: Acc@1 76.230 Acc@5 93.006
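With this PR the new weights plug into the usual multi-weight API. A minimal inference sketch (the random tensor is a stand-in for a real decoded image):
import torch
from torchvision.models import shufflenet_v2_x2_0, ShuffleNet_V2_X2_0_Weights

weights = ShuffleNet_V2_X2_0_Weights.IMAGENET1K_V1
model = shufflenet_v2_x2_0(weights=weights).eval()
preprocess = weights.transforms()  # resize to 232, center-crop to 224, normalize
batch = preprocess(torch.rand(3, 300, 300)).unsqueeze(0)  # stand-in image tensor
with torch.inference_mode():
    probs = model(batch).softmax(dim=1)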
We also provide quantized models produced with post-training quantization. Here are the commands to generate them:
# For shufflenet_v2_x1_5
python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' \
--model=shufflenet_v2_x1_5 --weights="ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1" \
--train-crop-size 176 --val-resize-size 232 --data-path /datasets01_ontap/imagenet_full_size/061417/
# For shufflenet_v2_x2_0
python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' \
--model=shufflenet_v2_x2_0 --weights="ShuffleNet_V2_X2_0_Weights.IMAGENET1K_V1" \
--train-crop-size 176 --val-resize-size 232 --data-path /datasets01_ontap/imagenet_full_size/061417/
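Under the hood, train_quantization.py follows the standard eager-mode post-training quantization flow. A minimal sketch of that flow (not the script verbatim; the calibration batches here are random stand-ins for real ImageNet batches):
import torch
from torchvision.models import ShuffleNet_V2_X1_5_Weights
from torchvision.models.quantization import shufflenet_v2_x1_5

model = shufflenet_v2_x1_5(weights=ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1)  # float weights
model.eval()
model.fuse_model()  # fuse Conv+BN(+ReLU) modules before quantization
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
torch.ao.quantization.prepare(model, inplace=True)  # insert observers
calibration_batches = [torch.rand(8, 3, 176, 176) for _ in range(4)]  # stand-in data
with torch.inference_mode():
    for images in calibration_batches:  # calibrate the observers
        model(images)
torch.ao.quantization.convert(model, inplace=True)  # produce the int8 model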
Once we have the quantized models, we evaluate them with batch_size=1 (the quantized models run on CPU with the fbgemm backend). Here are the evaluation commands and the corresponding results:
# For shufflenet_v2_x1_5
python train_quantization.py --device='cpu' --eval-batch-size=1 --test-only --backend='fbgemm' \
--model='shufflenet_v2_x1_5' \
--data-path="/datasets01_ontap/imagenet_full_size/061417" \
--weights="ShuffleNet_V2_X1_5_QuantizedWeights.IMAGENET1K_FBGEMM_V1"
# Acc@1 72.052 Acc@5 90.700
# For shufflenet_v2_x2_0
python train_quantization.py --device='cpu' --eval-batch-size=1 --test-only --backend='fbgemm' \
--model='shufflenet_v2_x2_0' \
--data-path="/datasets01_ontap/imagenet_full_size/061417" \
--weights="ShuffleNet_V2_X2_0_QuantizedWeights.IMAGENET1K_FBGEMM_V1"
# Acc@1 75.354 Acc@5 92.488
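The quantized weights can then be loaded directly by passing quantize=True. A minimal sketch (fbgemm is the default quantized engine on x86, so this runs on CPU):
import torch
from torchvision.models.quantization import shufflenet_v2_x2_0, ShuffleNet_V2_X2_0_QuantizedWeights

weights = ShuffleNet_V2_X2_0_QuantizedWeights.IMAGENET1K_FBGEMM_V1
model = shufflenet_v2_x2_0(weights=weights, quantize=True).eval()
preprocess = weights.transforms()
batch = preprocess(torch.rand(3, 256, 256)).unsqueeze(0)  # stand-in image tensor
with torch.inference_mode():
    print(model(batch).argmax(dim=1))  # predicted ImageNet class index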