Sparsity — Model Optimizer 0.27.1
Quick Start: Sparsity
ModelOpt’s sparsity feature is an effective technique to reduce the memory footprint of deep learning models and accelerate inference. ModelOpt provides the easy-to-use API mts.sparsify() to apply weight sparsity to a given model. mts.sparsify() supports the NVIDIA 2:4 sparsity pattern and various sparsification methods, such as NVIDIA ASP and SparseGPT.
This guide provides a quick start for applying weight sparsity to a PyTorch model using ModelOpt.
Post-Training Sparsification (PTS) for PyTorch models
mts.sparsify() requires the model, the appropriate sparsity configuration, and a forward loop as inputs. Here is a quick example of sparsifying a model to the 2:4 sparsity pattern with the SparseGPT method using mts.sparsify().
```python
import modelopt.torch.sparsity as mts

# Set up the model
model = get_model()

# Set up the data loaders. An example usage:
data_loader = get_train_dataloader(num_samples=calib_size)

# Define the sparsity configuration
sparsity_config = {"data_loader": data_loader, "collect_func": lambda x: x}

# Sparsify the model and perform calibration (PTS)
model = mts.sparsify(model, mode="sparsegpt", config=sparsity_config)
```
Note
data_loader is only required for data-driven sparsity, e.g., SparseGPT calibration. The sparse_magnitude mode does not require a data_loader, as it is purely based on the weights of the model.
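For example, a magnitude-based run can be invoked without any calibration data. Below is a minimal sketch; get_model() is the same placeholder as in the example above, and omitting config for sparse_magnitude is an assumption based on this note.

```python
import modelopt.torch.sparsity as mts

model = get_model()  # placeholder, as in the example above

# sparse_magnitude prunes by weight magnitude alone, so no data_loader
# or calibration step is needed (assumption based on the note above).
model = mts.sparsify(model, mode="sparse_magnitude")
```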
Note
data_loader and collect_func can be replaced with a forward_loop that runs the model over the calibration dataset.
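A minimal sketch of that alternative, assuming the config dict accepts a forward_loop key as the note implies, and reusing the placeholders from the example above:

```python
import modelopt.torch.sparsity as mts

model = get_model()  # placeholder, as above
data_loader = get_train_dataloader(num_samples=calib_size)  # placeholder, as above

# forward_loop runs the model over the calibration data; ModelOpt uses
# these forward passes to calibrate the sparsity masks.
def forward_loop(model):
    for batch in data_loader:
        model(batch)

model = mts.sparsify(model, mode="sparsegpt", config={"forward_loop": forward_loop})
```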
Sparsity-aware Training (SAT) for PyTorch models
After sparsifying the model, you can save a checkpoint of the sparsified model and use it for fine-tuning. Check out the GitHub end-to-end example to learn more about SAT. A sketch of the save-and-restore flow follows below.
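A minimal sketch of saving the sparsified checkpoint and restoring it for fine-tuning, assuming ModelOpt's modelopt.torch.opt save/restore utilities; train() is a placeholder for the user's fine-tuning loop:

```python
import modelopt.torch.opt as mto

# Save the sparsified model (weights plus ModelOpt sparsity state).
mto.save(model, "sparsified_model.pth")

# Later: restore onto a freshly constructed model, then fine-tune.
model = get_model()  # placeholder, as above
model = mto.restore(model, "sparsified_model.pth")
train(model)  # placeholder for the user's fine-tuning loop
```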
Next Steps
- Learn more about sparsity and advanced usage of ModelOpt sparsity in the Sparsity guide.
- Check out the end-to-end example on GitHub for PTS and SAT.