quantization — Model Optimizer 0.27.1
modelopt.torch.quantization.calib
Calibrator classes.
modelopt.torch.quantization.compress(quant_model)
Compress model weights of quantized model.
modelopt.torch.quantization.config
This document lists the quantization formats supported by Model Optimizer and example quantization configs.
modelopt.torch.quantization.conversion
Quantization conversion/restore utilities.
modelopt.torch.quantization.export_onnx
Utility to export a quantized torch model to quantized ONNX.
modelopt.torch.quantization.extensions
Module to load C++ / CUDA extensions.
modelopt.torch.quantization.mode
This module contains the mode descriptor for the quantization mode.
modelopt.torch.quantization.model_calib
Calibration utilities.
modelopt.torch.quantization.model_quant
User-facing quantization API.
modelopt.torch.quantization.nn
Modules with quantization support.
modelopt.torch.quantization.optim
Deprecated.
modelopt.torch.quantization.plugins
Handles quantization plugins to correctly quantize third-party modules.
modelopt.torch.quantization.qtensor
Tensor class for real quantization.
modelopt.torch.quantization.quant_modules
Deprecated.
modelopt.torch.quantization.tensor_quant
Basic tensor quantization functions.
modelopt.torch.quantization.triton
Triton quantization kernels.
modelopt.torch.quantization.utils
Quantization utilities.
Quantization package.
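The calibration and tensor quantization entries above refer to two steps that are easier to picture with a small example. The following is a minimal sketch in plain Python, assuming max calibration and symmetric int8 fake quantization. The function names `calibrate_amax` and `fake_quantize` are hypothetical and are not part of the modelopt API; the library's own calibrators and kernels may work differently.

```python
# Illustrative only: plain-Python max calibration and symmetric fake
# quantization. None of these names come from modelopt.

def calibrate_amax(batches):
    """Return the largest absolute value seen across all calibration batches."""
    return max(abs(x) for batch in batches for x in batch)


def fake_quantize(values, amax, num_bits=8):
    """Quantize to signed integers, then dequantize back to floats."""
    if amax <= 0:
        raise ValueError("amax must be positive")
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scale = amax / qmax             # float value represented by one integer step
    out = []
    for x in values:
        q = round(x / scale)
        q = max(-qmax, min(qmax, q))  # clamp to the symmetric integer range
        out.append(q * scale)
    return out


# Made-up calibration data standing in for activations collected from a model.
calib_batches = [[0.5, -1.0, 0.25], [2.0, -0.75, 1.5]]
amax = calibrate_amax(calib_batches)

# Values beyond +/- amax saturate; values inside are rounded to the nearest step.
dequantized = fake_quantize([0.1, -2.0, 3.0], amax)
print(amax)
print(dequantized)
```

In this sketch the calibration step only records a range, and the quantization step uses that range to pick a scale. Inputs outside the calibrated range are clipped, which is why the choice of calibration data matters.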