Sparsity — Model Optimizer 0.27.1
Quick Start: Sparsity
ModelOpt’s sparsity feature is an effective technique to reduce the memory footprint of deep learning models and accelerate inference. ModelOpt provides the easy-to-use API mts.sparsify() to apply weight sparsity to a given model. mts.sparsify() supports the NVIDIA 2:4 sparsity pattern and various sparsification methods, such as NVIDIA ASP and SparseGPT.
This guide provides a quick start for applying weight sparsity to a PyTorch model using ModelOpt.
Post-Training Sparsification (PTS) for PyTorch models
mts.sparsify() requires the model, the appropriate sparsity configuration, and a forward loop as inputs. Here is a quick example of sparsifying a model to the 2:4 sparsity pattern with the SparseGPT method using mts.sparsify().
```python
import modelopt.torch.sparsity as mts

# Set up the model
model = get_model()

# Set up the data loaders. An example usage:
data_loader = get_train_dataloader(num_samples=calib_size)

# Define the sparsity configuration
sparsity_config = {"data_loader": data_loader, "collect_func": lambda x: x}

# Sparsify the model and perform calibration (PTS)
model = mts.sparsify(model, mode="sparsegpt", config=sparsity_config)
```
Note
data_loader is only required for data-driven sparsity, e.g., SparseGPT calibration. The sparse_magnitude mode does not require a data_loader, as it is purely based on the weights of the model.
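For example, a magnitude-based run can be invoked without any calibration data. Below is a minimal sketch; get_model() is the same placeholder as in the example above, and omitting config for sparse_magnitude is an assumption based on this note.

```python
import modelopt.torch.sparsity as mts

model = get_model()  # placeholder, as in the example above

# sparse_magnitude prunes by weight magnitude alone, so no data_loader
# or calibration step is needed (assumption based on the note above).
model = mts.sparsify(model, mode="sparse_magnitude")
```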
Note
data_loader and collect_func can be replaced with a forward_loop that runs the model over the calibration dataset.
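A minimal sketch of that alternative, assuming the config dict accepts a forward_loop key as the note implies, and reusing the placeholders from the example above:

```python
import modelopt.torch.sparsity as mts

model = get_model()  # placeholder, as above
data_loader = get_train_dataloader(num_samples=calib_size)  # placeholder, as above

# forward_loop runs the model over the calibration data; ModelOpt uses
# these forward passes to calibrate the sparsity masks.
def forward_loop(model):
    for batch in data_loader:
        model(batch)

model = mts.sparsify(model, mode="sparsegpt", config={"forward_loop": forward_loop})
```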
Sparsity-aware Training (SAT) for PyTorch models
After sparsifying the model, you can save a checkpoint of the sparsified model and use it for fine-tuning. Check out the GitHub end-to-end example to learn more about SAT. A sketch of the save-and-restore flow follows below.
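A minimal sketch of saving the sparsified checkpoint and restoring it for fine-tuning, assuming ModelOpt's modelopt.torch.opt save/restore utilities; train() is a placeholder for the user's fine-tuning loop:

```python
import modelopt.torch.opt as mto

# Save the sparsified model (weights plus ModelOpt sparsity state).
mto.save(model, "sparsified_model.pth")

# Later: restore onto a freshly constructed model, then fine-tune.
model = get_model()  # placeholder, as above
model = mto.restore(model, "sparsified_model.pth")
train(model)  # placeholder for the user's fine-tuning loop
```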
Next Steps
- Learn more about sparsity and advanced usage of ModelOpt sparsity in the Sparsity guide.
- Check out the end-to-end example on GitHub for PTS and SAT.