Model Zoo
COCO Object Detection Baselines
Here we provide our pretrained baselines for detrex. More pretrained weights will be released in future versions. We also provide converted pretrained weights, which are marked as (converted).
DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
DETR-R50 (converted) | R-50 | IN1k | 500 | 42.0 | model |
DETR-R50-DC5 (converted) | R-50 | IN1k | 500 | 43.4 | model |
DETR-R101 (converted) | R-101 | IN1k | 500 | 43.5 | model |
DETR-R101-DC5 (converted) | R-101 | IN1k | 500 | 44.9 | model |
Deformable-DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
Deformable-DETR + Box Refinement | R-50 | IN1k | 50 | 47.0 | model |
Deformable-DETR + Box Refinement + Two Stage | R-50 | IN1k | 50 | 48.2 | model |
Anchor-DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
Anchor-DETR-R50 | R-50 | IN1k | 50 | 41.9 | model |
Anchor-DETR-R50 (converted) | R-50 | IN1k | 50 | 42.2 | model |
Anchor-DETR-R50-DC5 (converted) | R-50 | IN1k | 50 | 44.2 | model |
Anchor-DETR-R101 (converted) | R-101 | IN1k | 50 | 43.5 | model |
Anchor-DETR-R101-DC5 (converted) | R-101 | IN1k | 50 | 45.1 | model |
Conditional-DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
Conditional-DETR-R50 | R-50 | IN1k | 50 | 41.6 | model |
Conditional-DETR-R50-DC5 (converted) | R-50-DC5 | IN1k | 50 | 43.8 | model |
Conditional-DETR-R101 (converted) | R-101 | IN1k | 50 | 43.0 | model |
Conditional-DETR-R101-DC5 (converted) | R-101-DC5 | IN1k | 50 | 45.1 | model |
DAB-DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
DAB-DETR-R50 | R-50 | IN1k | 50 | 43.3 | model |
DAB-DETR-R50-3patterns (converted) | R-50 | IN1k | 50 | 42.8 | model |
DAB-DETR-R50-DC5 (converted) | R-50 | IN1k | 50 | 44.6 | model |
DAB-DETR-R50-DC5-3patterns (converted) | R-50 | IN1k | 50 | 45.7 | model |
DAB-DETR-R101 | R-101 | IN1k | 50 | 44.0 | model |
DAB-DETR-R101-DC5 (converted) | R-101 | IN1k | 50 | 45.7 | model |
DAB-DETR-Swin-T | Swin-Tiny-224 | IN1k | 50 | 45.2 | model |
DAB-Deformable-DETR-R50 | R-50 | IN1k | 50 | 49.0 | model |
DAB-Deformable-DETR-R50-Two-Stage | R-50 | IN1k | 50 | 49.7 | model |
DN-DETR
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
DN-DETR-R50 | R-50 | IN1k | 50 | 44.7 | model |
DN-DETR-R50-DC5 (converted) | R-50 | IN1k | 50 | 46.3 | model |
DINO
Pretrained DINO with ResNet Backbone
Name | Backbone | Pretrained | Epochs | Denoising Queries | boxAP | Download |
---|---|---|---|---|---|---|
DINO-R50-4scale | R-50 | IN1k | 12 | 100 | 49.2 | model |
DINO-R50-4scale (hacked trainer) | R-50 | IN1k | 12 | 100 | 49.4 | model |
DINO-R50-4scale with EMA | R-50 | IN1k | 12 | 100 | 49.4 | model |
DINO-R50-5scale | R-50 | IN1k | 12 | 100 | 49.6 | model |
DINO-R50-4scale | R-50 | IN1k | 12 | 300 | 49.5 | model |
DINO-R50-4scale | R-50 | IN1k | 24 | 100 | 50.6 | model |
DINO-R101-4scale | R-101 | IN1k | 12 | 100 | 50.0 | model |
Pretrained DINO with Swin-Transformer Backbone
Name | Backbone | Pretrained | Epochs | Denoising Queries | boxAP | Download |
---|---|---|---|---|---|---|
DINO-Swin-T-224-4scale | Swin-Tiny-224 | IN1k | 12 | 100 | 51.3 | model |
DINO-Swin-T-224-4scale | Swin-Tiny-224 | IN22k to IN1k | 12 | 100 | 52.5 | model |
DINO-Swin-S-224-4scale | Swin-Small-224 | IN1k | 12 | 100 | 53.0 | model |
DINO-Swin-B-384-4scale | Swin-Base-384 | IN22k to IN1k | 12 | 100 | 55.8 | model |
DINO-Swin-L-224-4scale | Swin-Large-224 | IN22k to IN1k | 12 | 100 | 56.9 | model |
DINO-Swin-L-384-4scale | Swin-Large-384 | IN22k to IN1k | 12 | 100 | 56.9 | model |
DINO-Swin-L-384-5scale | Swin-Large-384 | IN22k to IN1k | 12 | 100 | 57.5 | model |
DINO-Swin-L-384-4scale | Swin-Large-384 | IN22k to IN1k | 36 | 100 | 58.1 | model |
DINO-Swin-L-384-5scale | Swin-Large-384 | IN22k to IN1k | 36 | 100 | 58.5 | model |
Pretrained DINO with FocalNet Backbone
Name | Backbone | Pretrained | Epochs | Denoising Queries | boxAP | Download |
---|---|---|---|---|---|---|
DINO-FocalNet-Large-4scale | FocalNet-384-LRF-3Level | IN22k | 12 | 100 | 57.5 | model |
DINO-FocalNet-Large-4scale | FocalNet-384-LRF-4Level | IN22k | 12 | 100 | 58.0 | model |
DINO-FocalNet-Large-5scale | FocalNet-384-LRF-4Level | IN22k | 12 | 100 | 58.5 | model |
Pretrained DINO with ViTDet Backbone
Name | Backbone | Pretrained | Epochs | Denoising Queries | boxAP | Download |
---|---|---|---|---|---|---|
DINO-ViTDet-Base-4scale | ViT | IN1k, MAE | 12 | 100 | 50.2 | model |
DINO-ViTDet-Base-4scale | ViT | IN1k, MAE | 50 | 100 | 55.0 | model |
DINO-ViTDet-Large-4scale | ViT | IN1k, MAE | 12 | 100 | 52.9 | model |
DINO-ViTDet-Large-4scale | ViT | IN1k, MAE | 50 | 100 | 57.5 | model |
H-Deformable-DETR
Name | Backbone | Pretrained | Query | Epochs | boxAP | Download |
---|---|---|---|---|---|---|
H-Deformable-DETR-R50 + tricks (detrex) | R-50 | IN1k | 300 | 12 | 49.1 | model |
H-Deformable-DETR-R50 + tricks (converted) | R-50 | IN1k | 300 | 12 | 48.9 | model |
H-Deformable-DETR-R50 + tricks (converted) | R-50 | IN1k | 300 | 36 | 50.3 | model |
H-Deformable-DETR-Swin-T + tricks (converted) | Swin-Tiny | IN1k | 300 | 12 | 50.6 | model |
H-Deformable-DETR-Swin-T + tricks (converted) | Swin-Tiny | IN1k | 300 | 36 | 53.5 | model |
H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 12 | 56.2 | model |
H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 36 | 57.5 | model |
H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 900 | 12 | 56.4 | model |
H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 36 | 57.5 | model |
DETA
Name | Backbone | Pretrained | Epochs | boxAP | Download |
---|---|---|---|---|---|
Improved-Deformable-DETR-R50 (converted) | R-50 | IN1k | 50 | 49.8 | model |
DETA-R50-5scale (bs=8, 180000 iterations) | R-50 | IN1k | 12 | 50.0 | model |
DETA-R50-5scale (with hacked train engine) | R-50 | IN1k | 12 | 49.9 | model |
DETA-R50-5scale-12ep (no frozen backbone) | R-50 | IN1k | 12 | 50.2 | model |
DETA-R50-5scale (converted) | R-50 | IN1k | 12 | 50.1 | model |
DETA-Swin-Large-finetune (converted) | Swin-Large-384 | Objects365 | 24 | 62.9 | model |