Model Zoo: MMSelfSup 1.0.0 documentation
All models and a subset of the benchmark results are listed below.
Benchmarks
ImageNet
ImageNet has multiple versions; the most commonly used one is ILSVRC 2012. The classification results below are Top-1 accuracies obtained by linear evaluation or fine-tuning, starting from the pre-trained weights of each algorithm. A slash (/) marks a result or link that is not provided.
Algorithm | Backbone | Epoch | Batch Size | Linear Eval Top-1 (%) | Fine-tuning Top-1 (%) | Pretrain Links | Linear Eval Links | Fine-tuning Links
---|---|---|---|---|---|---|---|---
Relative-Loc | ResNet50 | 70 | 512 | 40.4 | / | config, model, log | config, model, log | /
Rotation-Pred | ResNet50 | 70 | 128 | 47.0 | / | config, model, log | config, model, log | /
NPID | ResNet50 | 200 | 256 | 58.3 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 200 | 256 | 62.7 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 200 | 4096 | 66.9 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 800 | 4096 | 69.2 | / | config, model, log | config, model, log | /
MoCo v2 | ResNet50 | 200 | 256 | 67.5 | / | config, model, log | config, model, log | /
BYOL | ResNet50 | 200 | 4096 | 71.8 | / | config, model, log | config, model, log | /
SwAV | ResNet50 | 200 | 256 | 70.5 | / | config, model, log | config, model, log | /
DenseCL | ResNet50 | 200 | 256 | 63.5 | / | config, model, log | config, model, log | /
SimSiam | ResNet50 | 100 | 256 | 68.3 | / | config, model, log | config, model, log | /
SimSiam | ResNet50 | 200 | 256 | 69.8 | / | config, model, log | config, model, log | /
BarlowTwins | ResNet50 | 300 | 2048 | 71.8 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 100 | 4096 | 69.6 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 300 | 4096 | 72.8 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 800 | 4096 | 74.4 | / | config, model, log | config, model, log | /
MoCo v3 | ViT-small | 300 | 4096 | 73.6 | / | config, model, log | config, model, log | /
MoCo v3 | ViT-base | 300 | 4096 | 76.9 | 83.0 | config, model, log | config, model, log | config, model, log
MoCo v3 | ViT-large | 300 | 4096 | / | 83.7 | config, model, log | / | config, model, log
MAE | ViT-base | 300 | 4096 | 60.8 | 82.8 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 400 | 4096 | 62.5 | 83.3 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 800 | 4096 | 65.1 | 83.3 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 1600 | 4096 | 67.1 | 83.5 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 400 | 4096 | 70.7 | 85.2 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 800 | 4096 | 73.7 | 85.4 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 1600 | 4096 | 75.5 | 85.7 | config, model, log | config, model, log | config, model, log
MAE | ViT-huge-FT-224 | 1600 | 4096 | / | 86.9 | config, model, log | / | config, model, log
MAE | ViT-huge-FT-448 | 1600 | 4096 | / | 87.3 | config, model, log | / | config, model, log
CAE | ViT-base | 300 | 2048 | / | 83.3 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT192 | 100 | 2048 | / | 82.7 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT224 | 100 | 2048 | / | 83.5 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT224 | 800 | 2048 | / | 83.7 | config, model, log | / | config, model, log
SimMIM | Swin-large-FT224 | 800 | 2048 | / | 84.8 | config, model, log | / | config, model, log
MaskFeat | ViT-base | 300 | 2048 | / | 83.4 | config, model, log | / | config, model, log
BEiT | ViT-base | 300 | 2048 | / | 83.1 | config, model, log | / | config, model, log
MILAN | ViT-base | 400 | 4096 | 78.9 | 85.3 | config, model, log | config, model, log | config, model, log
BEiT v2 | ViT-base | 300 | 2048 | / | 85.0 | config, model, log | / | config, model, log
EVA | ViT-base | 400 | 4096 | 69.0 | 83.7 | config, model, log | config, model, log | config, model, log
MixMIM | MixMIM-Base | 400 | 2048 | / | 84.6 | config, model, log | / | config, model, log
PixMIM | ViT-base | 300 | 4096 | 63.3 | 83.1 | config, model, log | config, model, log | config, model, log
PixMIM | ViT-base | 800 | 4096 | 67.5 | 83.5 | config, model, log | config, model, log | config, model, log
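
The results in the table above are Top-1 accuracies: the percentage of evaluation images whose highest-scoring predicted class equals the ground-truth label. As a minimal sketch of that metric only, the pure-Python function below computes it from a list of per-class scores. The function name `top1_accuracy` and the toy scores are assumptions made for illustration; this is not the MMSelfSup evaluation code, and the numbers are not benchmark data.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return Top-1 accuracy in percent.

    scores[i][c] is the (hypothetical) classifier score for sample i and class c;
    labels[i] is the ground-truth class index for sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score in the row.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy example with 4 samples and 3 classes (made-up numbers, not benchmark data).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]
print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```

Linear evaluation and fine-tuning differ in how the classifier producing these scores is trained (frozen backbone with a linear head versus updating the whole network), but the metric reported in both result columns is the same.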