Quantization, Projection, and Pruning - MATLAB & Simulink


Compress a deep neural network by performing quantization, projection, or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

- Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types
- Projecting the learnable parameters of layers onto a lower-dimensional subspace
- Pruning filters or other structures that contribute least to the accuracy of the network

For a detailed overview of the compression techniques available in Deep Learning Toolbox Model Quantization Library, see Reduce Memory Footprint of Deep Neural Networks.

Functions


Pruning

taylorPrunableNetwork Neural network suitable for compression using Taylor pruning (Since R2022a)
forward Compute deep learning network output for training
predict Compute deep learning network output for inference
updatePrunables Remove filters from prunable layers based on importance scores (Since R2022a)
updateScore Compute and accumulate Taylor-based importance scores for pruning (Since R2022a)
dlnetwork Deep learning neural network
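The pruning functions above are typically combined in a custom training loop: convert the network, accumulate importance scores over minibatches, then remove the least important filters. The sketch below assumes a trained `dlnetwork` named `net`, a `minibatchqueue` named `mbq`, and a user-written `modelLoss` helper (hypothetical here) that returns the loss together with the pruning activations and their gradients; exact signatures may differ by release.

```matlab
% Sketch of a Taylor pruning loop (assumptions: net, mbq, modelLoss).
prunableNet = taylorPrunableNetwork(net);
numPruningIterations = 10;

for i = 1:numPruningIterations
    shuffle(mbq);
    while hasdata(mbq)
        [X,T] = next(mbq);
        % modelLoss (user-defined) calls forward on prunableNet and
        % differentiates the loss with respect to the pruning activations.
        [loss,pruningActivations,pruningGradients] = ...
            dlfeval(@modelLoss,prunableNet,X,T);
        % Accumulate Taylor-based importance scores.
        prunableNet = updateScore(prunableNet, ...
            pruningActivations,pruningGradients);
    end
    % Remove the filters with the lowest accumulated importance scores.
    prunableNet = updatePrunables(prunableNet,MaxToPrune=8);
end

% Convert back to a dlnetwork for fine-tuning or inference.
prunedNet = dlnetwork(prunableNet);
```

In practice the pruned network is fine-tuned for a few epochs after each pruning iteration to recover accuracy before pruning further.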

Projection

Quantization

dlquantizer Quantize a deep neural network to 8-bit scaled integer data types (Since R2020a)
dlquantizationOptions Options for quantizing a trained deep neural network (Since R2020a)
prepareNetwork Prepare deep neural network for quantization (Since R2024b)
calibrate Simulate and collect ranges of a deep neural network (Since R2020a)
quantize Quantize deep neural network (Since R2022a)
validate Quantize and validate a deep neural network (Since R2020a)
quantizationDetails Display quantization details for a neural network (Since R2022a)
estimateNetworkMetrics Estimate network metrics for specific layers of a neural network (Since R2022a)
equalizeLayers Equalize layer parameters of deep neural network (Since R2022b)
exportNetworkToSimulink Generate Simulink model that contains deep learning layer blocks that correspond to deep learning layer objects (Since R2024b)
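A typical INT8 quantization workflow chains several of the functions above: create a `dlquantizer` object, calibrate it on representative data, validate the quantized network, and optionally inspect the result. The sketch below assumes a trained `dlnetwork` named `net` and datastores `calData` and `valData` (placeholder names) for calibration and validation.

```matlab
% Sketch of the quantization workflow (assumptions: net, calData, valData).
quantObj = dlquantizer(net,ExecutionEnvironment="MATLAB");

% Simulate the network and collect the dynamic ranges of weights,
% biases, and activations.
calResults = calibrate(quantObj,calData);

% Quantize the network and compare its behavior against the
% floating-point original.
valResults = validate(quantObj,valData);

% Obtain the quantized network and inspect which layers were quantized.
qNet = quantize(quantObj);
qDetails = quantizationDetails(qNet);
```

Setting `ExecutionEnvironment` to `"GPU"`, `"FPGA"`, or `"CPU"` instead targets code generation for the corresponding hardware, as described in the target-specific topics below.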

Apps

Topics

Overview

Pruning

Projection and Knowledge Distillation

Quantization

Quantization for GPU Target

Quantization for FPGA Target

Quantization for CPU Target