Add shufflenetv2 1.5 and 2.0 weights by YosuaMichael · Pull Request #5906 · pytorch/vision

Resolve #3257

We trained the models using a recipe similar to the improved training recipe. Here are the commands:

# Choose the model
MODEL=shufflenet_v2_x1_5
# MODEL=shufflenet_v2_x2_0

# Training command
python \
    -u ~/script/run_with_submitit.py \
    --timeout 3000 --ngpus 8 --nodes 1 --batch-size=128 \
    --partition train --model $MODEL \
    --data-path="/datasets01_ontap/imagenet_full_size/061417" \
    --lr=0.5 --lr-scheduler=cosineannealinglr --lr-warmup-epochs=5 --lr-warmup-method=linear \
    --auto-augment=ta_wide --epochs=600 --random-erase=0.1 --weight-decay=0.00002 \
    --norm-weight-decay=0.0 --label-smoothing=0.1 --mixup-alpha=0.2 --cutmix-alpha=1.0 \
    --train-crop-size=176 --model-ema --val-resize-size=232 --ra-sampler --ra-reps=4
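
As a side note, the warmup + cosine schedule implied by the flags above can be composed with `SequentialLR`, similar to what the reference classification script does. A minimal sketch (the `start_factor=0.01` mirrors the script's default `--lr-warmup-decay`, which is an assumption here, and the `Linear` module is just a placeholder for the real network):

import torch

model = torch.nn.Linear(10, 10)  # placeholder for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.5, weight_decay=2e-5)

warmup_epochs, total_epochs = 5, 600
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.01, total_iters=warmup_epochs  # linear warmup
)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=total_epochs - warmup_epochs  # cosine decay afterwards
)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[warmup_epochs]
)

for epoch in range(total_epochs):
    # train_one_epoch(...)  # training loop elided
    scheduler.step()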

Once training finished, we took the checkpoint from the epoch with the highest Acc@1: epoch 595 for shufflenet_v2_x2_0 and epoch 594 for shufflenet_v2_x1_5, using the non-EMA models in both cases. We then re-tested the checkpoints with 1 GPU and batch_size=1. Here are the commands and results:

# For shufflenet_v2_x1_5
python -u ~/script/run_with_submitit.py \
    --timeout 3000 --nodes 1 --ngpus 1 --batch-size=1 \
    --partition train --model shufflenet_v2_x1_5 \
    --data-path="/datasets01_ontap/imagenet_full_size/061417" \
    --weights="ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1" \
    --test-only
# Test:  Acc@1 72.996 Acc@5 91.086

# For shufflenet_v2_x2_0
python -u ~/script/run_with_submitit.py \
    --timeout 3000 --nodes 1 --ngpus 1 --batch-size=1 \
    --partition train --model shufflenet_v2_x2_0 \
    --data-path="/datasets01_ontap/imagenet_full_size/061417" \
    --weights="ShuffleNet_V2_X2_0_Weights.IMAGENET1K_V1" \
    --test-only
# Test:  Acc@1 76.230 Acc@5 93.006
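
For anyone who wants to sanity-check the new weights once this lands, here is a minimal inference sketch using the multi-weight API (the input image path is hypothetical):

import torch
from torchvision.io import read_image
from torchvision.models import shufflenet_v2_x1_5, ShuffleNet_V2_X1_5_Weights

weights = ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1
model = shufflenet_v2_x1_5(weights=weights).eval()

# weights.transforms() bundles the eval preprocessing used above
# (resize to 232, central crop, ImageNet normalization).
preprocess = weights.transforms()

img = read_image("dog.jpg")  # hypothetical input image
with torch.inference_mode():
    logits = model(preprocess(img).unsqueeze(0))
print(weights.meta["categories"][logits.argmax().item()])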

We also provide quantized models, produced with post-training quantization. Here are the commands:

# For shufflenet_v2_x1_5
python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' \
    --model=shufflenet_v2_x1_5 --weights="ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1" \
    --train-crop-size 176 --val-resize-size 232 --data-path /datasets01_ontap/imagenet_full_size/061417/

# For shufflenet_v2_x2_0
python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' \
    --model=shufflenet_v2_x2_0 --weights="ShuffleNet_V2_X2_0_Weights.IMAGENET1K_V1" \
    --train-crop-size 176 --val-resize-size 232 --data-path /datasets01_ontap/imagenet_full_size/061417/
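
For context, --post-training-quantize boils down to the standard eager-mode PTQ flow. A rough sketch, not the exact script logic (the random calibration batches are placeholders for real training images):

import torch
from torchvision.models import ShuffleNet_V2_X1_5_Weights
from torchvision.models.quantization import shufflenet_v2_x1_5

# Load the float checkpoint into the quantizable model definition.
model = shufflenet_v2_x1_5(
    weights=ShuffleNet_V2_X1_5_Weights.IMAGENET1K_V1, quantize=False
)
model.eval()
model.fuse_model(is_qat=False)  # fuse conv/bn/relu modules
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
torch.ao.quantization.prepare(model, inplace=True)  # insert observers

# Calibrate the observers; random batches here stand in for real data.
with torch.inference_mode():
    for _ in range(4):
        model(torch.randn(8, 3, 224, 224))

torch.ao.quantization.convert(model, inplace=True)  # produce the int8 model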

Once we have the quantized models, we evaluate them on CPU with batch_size=1. Here are the commands for evaluation and the corresponding results:

# For shufflenet_v2_x1_5
python train_quantization.py --device='cpu' --eval-batch-size=1 --test-only --backend='fbgemm' \
    --model='shufflenet_v2_x1_5' \
    --data-path="/datasets01_ontap/imagenet_full_size/061417" \
    --weights="ShuffleNet_V2_X2_0_QuantizedWeights.IMAGENET1K_FBGEMM_V1"
# Acc@1 72.052 Acc@5 90.700

# For shufflenet_v2_x2_0
python train_quantization.py --device='cpu' --eval-batch-size=1 --test-only --backend='fbgemm' \
    --model='shufflenet_v2_x2_0' \
    --data-path="/datasets01_ontap/imagenet_full_size/061417" \
    --weights="ShuffleNet_V2_X1_5_QuantizedWeights.IMAGENET1K_FBGEMM_V1"
# Acc@1 75.354 Acc@5 92.488
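
Once merged, the quantized weights can be consumed directly through the quantization builders. A minimal sketch (the dummy tensor stands in for a real image):

import torch
from torchvision.models.quantization import (
    shufflenet_v2_x1_5,
    ShuffleNet_V2_X1_5_QuantizedWeights,
)

torch.backends.quantized.engine = "fbgemm"  # match the backend used above
weights = ShuffleNet_V2_X1_5_QuantizedWeights.IMAGENET1K_FBGEMM_V1
model = shufflenet_v2_x1_5(weights=weights, quantize=True).eval()

preprocess = weights.transforms()  # same eval preprocessing as above
dummy = torch.rand(3, 256, 256)    # stand-in for a real image tensor
with torch.inference_mode():
    logits = model(preprocess(dummy).unsqueeze(0))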