Add pretrained Wide ResNet by szagoruyko · Pull Request #912 · pytorch/vision

I trained WRN-50-2 and WRN-101-2 with torchvision master, which now allows building WRN models with a simple width_per_group argument. I did not use the standard ResNet training procedure, though. Here are the differences:
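For readers unfamiliar with width_per_group, here is a minimal sketch of how it widens a bottleneck block. It assumes the width formula used in torchvision's Bottleneck block (inner width = planes * width_per_group / 64, times groups) and assumes the "-2" models correspond to width_per_group = 128. The helper names bottleneck_width and stage_channels are hypothetical and exist only for illustration.

```python
# Sketch only: mirrors the bottleneck width arithmetic, not the actual model code.
STAGE_PLANES = (64, 128, 256, 512)  # base planes of the four ResNet stages
EXPANSION = 4                       # bottleneck output channels = planes * 4


def bottleneck_width(planes, width_per_group=64, groups=1):
    """Channels of the inner 3x3 convolution in a bottleneck block."""
    return int(planes * (width_per_group / 64.0)) * groups


def stage_channels(width_per_group):
    """Return (inner 3x3 width, block output channels) for each stage."""
    return [
        (bottleneck_width(planes, width_per_group), planes * EXPANSION)
        for planes in STAGE_PLANES
    ]


resnet50_like = stage_channels(64)    # standard ResNet-50 widths
wrn50_2_like = stage_channels(128)    # assumed setting for the "-2" models
```

Under these assumptions only the inner 3x3 convolution doubles in width, while the output channels of each block stay the same as in the standard ResNet.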

SGD with cosine learning rate and warm restarts for 256 epochs (worth ~0.2% top-1 accuracy)
FP16 training with batchnorm in FP32 with apex O2

As a result, the checkpoints are stored in torch.float16 to save space.
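The cosine schedule with warm restarts mentioned above can be sketched as follows. This is an illustration, not the training script used for these checkpoints: the base learning rate, minimum learning rate, and cycle lengths are not stated here, so the defaults below (four 64-epoch cycles covering 256 epochs) are assumptions, and cosine_warm_restart_lr is a hypothetical helper name.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=0.1, min_lr=0.0,
                           first_cycle=64, cycle_mult=1):
    """Learning rate at a given epoch under cosine annealing with warm restarts.

    All default values are illustrative assumptions, not the PR's settings.
    """
    cycle_len = first_cycle
    t = epoch
    # Find the position inside the current cycle; each restart resets t to 0.
    while t >= cycle_len:
        t -= cycle_len
        cycle_len *= cycle_mult
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / cycle_len))
    return min_lr + (base_lr - min_lr) * cosine


# Learning rate for each of the 256 epochs under the assumed settings.
schedule = [cosine_warm_restart_lr(e) for e in range(256)]
```

Within each cycle the rate decays from base_lr toward min_lr along a half cosine, then jumps back to base_lr at the restart.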

model	top-1 error (%)	top-5 error (%)
WRN-50-2	21.49	5.91
WRN-101-2	21.16	5.72

I'm not sure whether we want these in torchvision. I could put them in wide-residual-networks instead.