Add pretrained Wide ResNet by szagoruyko · Pull Request #912 · pytorch/vision
I trained WRN-50-2 and WRN-101-2 with torchvision master, which now allows building WRN models with a simple width_per_group argument. I did not use the standard ResNet training procedure, though. Here are the differences:
- SGD with cosine learning rate and warm restarts for 256 epochs (adds ~0.2% to top-1 accuracy)
- FP16 training with batchnorm kept in FP32, using apex O2
The checkpoints are therefore stored in torch.float16 to save space.
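
For readers unfamiliar with the schedule, here is a minimal sketch of cosine annealing with warm restarts. The base learning rate (0.1), the minimum learning rate (0.0), and the fixed restart period (64 epochs, i.e. four cycles over 256 epochs) are assumptions for illustration only; the values actually used for these checkpoints are not stated here.

```python
import math


def cosine_warm_restarts_lr(epoch, base_lr=0.1, min_lr=0.0, period=64):
    """Learning rate at `epoch` under cosine annealing with fixed-period restarts.

    base_lr, min_lr and period are hypothetical defaults, not the values
    used to train the checkpoints in this PR.
    """
    # Position inside the current cycle; resets to 0 at every restart.
    t_cur = epoch % period
    # Half-cosine decay from base_lr (t_cur = 0) towards min_lr (t_cur -> period).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t_cur / period))


# One learning rate per epoch for a 256-epoch run.
schedule = [cosine_warm_restarts_lr(epoch) for epoch in range(256)]
```

In practice, PyTorch's torch.optim.lr_scheduler.CosineAnnealingWarmRestarts provides this kind of schedule (including growing periods via T_mult), so a hand-written function like the one above is only useful for inspecting the curve.
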
model | top-1 error (%) | top-5 error (%)
---|---|---
WRN-50-2 | 21.49 | 5.91
WRN-101-2 | 21.16 | 5.72
I'm not sure whether we want these in torchvision. I could put them in wide-residual-networks instead.
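
For reference, a small sketch of what the width_per_group argument mentioned above does to the bottleneck blocks. The formula mirrors, as far as I understand it, the width computation in torchvision's Bottleneck block; the helper name bottleneck_width is hypothetical and is not part of torchvision.

```python
def bottleneck_width(planes, width_per_group=64, groups=1):
    """Channel count of the inner 3x3 convolution in a bottleneck block.

    Hypothetical helper; assumed to match the expression used inside
    torchvision's Bottleneck: int(planes * (base_width / 64.)) * groups.
    """
    return int(planes * (width_per_group / 64.0)) * groups


# Base channel counts of the four residual stages.
stage_planes = [64, 128, 256, 512]

# Standard ResNet-50/101 keep the default width_per_group=64.
resnet_widths = [bottleneck_width(p) for p in stage_planes]

# The "-2" wide variants double it: width_per_group = 64 * 2.
wrn_2_widths = [bottleneck_width(p, width_per_group=128) for p in stage_planes]
```

Under that assumption, a WRN-50-2 corresponds to something like ResNet(Bottleneck, [3, 4, 6, 3], width_per_group=128): only the inner 3x3 convolutions get wider, while the block output channels stay the same as in ResNet-50.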