Pruning/Sparsity Tutorial (original) (raw)

Model Pruning and Sparsity in YOLOv5

📚 This guide explains how to apply pruning to YOLOv5 🚀 models to create more efficient networks while maintaining performance.

What is Model Pruning?

Model pruning is a technique used to reduce the size and complexity of neural networks by removing less important parameters (weights and connections). This process creates a more efficient model with several benefits:

Pruning works by identifying and removing parameters that contribute minimally to the model's performance, resulting in a more lightweight model with similar accuracy.

Before You Start

Clone repo and install requirements.txt in a Python>=3.8.0 environment, including PyTorch>=1.8. Models and datasets download automatically from the latest YOLOv5 release.

git clone https://github.com/ultralytics/yolov5 # clone cd yolov5 pip install -r requirements.txt # install

Test Baseline Performance

Before pruning, establish a baseline performance to compare against. This command tests YOLOv5x on COCO val2017 at image size 640 pixels. yolov5x.pt is the largest and most accurate model available. Other options are yolov5s.pt, yolov5m.pt and yolov5l.pt, or your own checkpoint from training a custom dataset ./weights/best.pt. For details on all available models, see the README table.

python val.py --weights yolov5x.pt --data coco.yaml --img 640 --half

Output:

`val: data=/content/yolov5/data/coco.yaml, weights=['yolov5x.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True, dnn=False YOLOv5 🚀 v6.0-224-g4c40933 torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)

Fusing layers... Model Summary: 444 layers, 86705005 parameters, 0 gradients val: Scanning '/content/datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupt: 100% 5000/5000 [00:00<?, ?it/s] Class Images Labels P R mAP@.5 mAP@.5:.95: 100% 157/157 [01:12<00:00, 2.16it/s] all 5000 36335 0.732 0.628 0.683 0.496 Speed: 0.1ms pre-process, 5.2ms inference, 1.7ms NMS per image at shape (32, 3, 640, 640) # <--- base speed

Evaluating pycocotools mAP... saving runs/val/exp2/yolov5x_predictions.json... ... Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.507 # <--- base mAP Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.689 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.552 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.345 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.559 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.652 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.381 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.630 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.682 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.526 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.731 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.829 Results saved to runs/val/exp `

Apply Pruning to YOLOv5x (30% Sparsity)

We can apply pruning to the model using the torch_utils.prune() command. To test a pruned model, we update val.py to prune YOLOv5x to 0.3 sparsity (30% of weights set to zero):

Screenshot showing code for pruning YOLOv5x to 30% sparsity

30% pruned output:

`val: data=/content/yolov5/data/coco.yaml, weights=['yolov5x.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True, dnn=False YOLOv5 🚀 v6.0-224-g4c40933 torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)

Fusing layers... Model Summary: 444 layers, 86705005 parameters, 0 gradients Pruning model... 0.3 global sparsity val: Scanning '/content/datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupt: 100% 5000/5000 [00:00<?, ?it/s] Class Images Labels P R mAP@.5 mAP@.5:.95: 100% 157/157 [01:11<00:00, 2.19it/s] all 5000 36335 0.724 0.614 0.671 0.478 Speed: 0.1ms pre-process, 5.2ms inference, 1.7ms NMS per image at shape (32, 3, 640, 640) # <--- prune speed

Evaluating pycocotools mAP... saving runs/val/exp3/yolov5x_predictions.json... ... Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.489 # <--- prune mAP Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.677 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.537 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.334 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.542 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.635 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.370 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.612 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.664 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.496 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.722 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.803 Results saved to runs/val/exp3 `

Results Analysis

From the results, we can observe:

This demonstrates that pruning can significantly reduce model complexity with only a minor impact on performance, making it an effective optimization technique for deployment in resource-constrained environments.

Fine-tuning Pruned Models

For best results, pruned models should be fine-tuned after pruning to recover accuracy. This can be done by:

  1. Applying pruning with a desired sparsity level
  2. Training the pruned model for a few epochs with a lower learning rate
  3. Evaluating the fine-tuned pruned model against the baseline

This process helps the remaining parameters adapt to compensate for the removed connections, often recovering most or all of the original accuracy.

Supported Environments

Ultralytics provides a range of ready-to-use environments, each pre-installed with essential dependencies such as CUDA, CUDNN, Python, and PyTorch, to kickstart your projects.

Project Status

YOLOv5 CI

This badge indicates that all YOLOv5 GitHub Actions Continuous Integration (CI) tests are successfully passing. These CI tests rigorously check the functionality and performance of YOLOv5 across various key aspects: training, validation, inference, export, and benchmarks. They ensure consistent and reliable operation on macOS, Windows, and Ubuntu, with tests conducted every 24 hours and upon each new commit.

📅 Created 1 year ago ✏️ Updated 25 days ago

Laughing-q glenn-jocher UltralyticsAssistant RizwanMunawar