Quantization API Reference — PyTorch 2.7 documentation

torch.ao.quantization

This module contains Eager mode quantization APIs.

Top level APIs

quantize Quantize the input float model with post training static quantization.
quantize_dynamic Converts a float model to a dynamic (i.e. weights-only) quantized model.
quantize_qat Perform quantization aware training and output a quantized model.
prepare Prepares a copy of the model for quantization calibration or quantization-aware training.
prepare_qat Prepares a copy of the model for quantization calibration or quantization-aware training and converts it to a quantized version.
convert Converts submodules in input module to a different module according to mapping by calling from_float method on the target module class.
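
As a minimal sketch, an eager-mode post-training static quantization run with these APIs might look like the following (the tiny model and backend choice are illustrative assumptions, not part of the API reference):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qconfig, prepare, convert,
)

class TinyModel(nn.Module):
    """Hypothetical float model with explicit quant/dequant boundaries."""
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)              # fp32 -> quantized boundary after convert
        x = self.relu(self.conv(x))
        return self.dequant(x)         # quantized -> fp32 boundary

model = TinyModel().eval()
model.qconfig = get_default_qconfig("fbgemm")   # backend-specific default qconfig
prepared = prepare(model)                       # insert observers
prepared(torch.randn(1, 3, 32, 32))             # calibrate on representative data
quantized = convert(prepared)                   # swap modules for quantized versions
```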

Preparing model for quantization

fuse_modules.fuse_modules Fuse a list of modules into a single module.
QuantStub Quantize stub module. Before calibration this behaves like an observer; it is swapped for nnq.Quantize during convert.
DeQuantStub Dequantize stub module. Before calibration this is the identity; it is swapped for nnq.DeQuantize during convert.
QuantWrapper A wrapper class that wraps the input module, adds QuantStub and DeQuantStub, and surrounds the call to the module with calls to the quant and dequant modules.
add_quant_dequant Wrap the leaf child modules in QuantWrapper if they have a valid qconfig. Note that this function modifies the children of the module in place and may also return a new module that wraps the input module.
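
For example, a hedged sketch of module fusion on a hypothetical Sequential model (submodule names "0", "1", "2" refer to its children):

```python
import torch.nn as nn
from torch.ao.quantization import fuse_modules

# Fuse conv + bn + relu into a single module before quantization.
m = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU()).eval()
fused = fuse_modules(m, [["0", "1", "2"]])  # list of submodule-name groups to fuse
```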

Utility functions

swap_module Swaps the module if it has a quantized counterpart and it has an observer attached.
propagate_qconfig_ Propagate qconfig through the module hierarchy and assign the qconfig attribute on each leaf module.
default_eval_fn Define the default evaluation function.

torch.ao.quantization.quantize_fx

This module contains FX graph mode quantization APIs (prototype).

prepare_fx Prepare a model for post training quantization.
prepare_qat_fx Prepare a model for quantization aware training.
convert_fx Convert a calibrated or trained model to a quantized model.
fuse_fx Fuse modules such as conv+bn and conv+bn+relu; the model must be in eval mode.
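
A hedged sketch of the FX graph mode post-training flow on a toy model (the model and backend string are assumptions):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 32, 32),)
qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)  # insert observers
prepared(*example_inputs)                                      # calibrate
quantized = convert_fx(prepared)                               # quantized GraphModule
```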

torch.ao.quantization.qconfig_mapping

This module contains QConfigMapping for configuring FX graph mode quantization.
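
A minimal sketch of building a QConfigMapping (the choice to leave Linear layers in float is just an example):

```python
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig

qconfig_mapping = (
    QConfigMapping()
    .set_global(get_default_qconfig("fbgemm"))   # default qconfig for everything
    .set_object_type(torch.nn.Linear, None)      # override: leave Linear in float
)
```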

torch.ao.quantization.backend_config

This module contains BackendConfig, a config object that defines how quantization is supported in a backend. Currently only used by FX Graph Mode Quantization, but we may extend Eager Mode Quantization to work with this as well.

BackendConfig Config that defines the set of patterns that can be quantized on a given backend, and how reference quantized models can be produced from these patterns.
BackendPatternConfig Config object that specifies quantization behavior for a given operator pattern.
DTypeConfig Config object that specifies the supported data types passed as arguments to quantize ops in the reference model spec, for input and output activations, weights, and biases.
DTypeWithConstraints Config for specifying additional constraints for a given dtype, such as quantization value ranges, scale value ranges, and fixed quantization params, to be used in DTypeConfig.
ObservationType An enum that represents the different ways an operator/operator pattern can be observed.
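
As a hedged sketch, declaring int8 static quantization support for nn.Linear on a hypothetical backend named "my_backend" could look like this (the dtype choices are illustrative):

```python
import torch
from torch.ao.quantization.backend_config import (
    BackendConfig, BackendPatternConfig, DTypeConfig, ObservationType,
)

int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)
linear_config = (
    BackendPatternConfig(torch.nn.Linear)
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .add_dtype_config(int8_dtype_config)
)
backend_config = BackendConfig("my_backend").set_backend_pattern_config(linear_config)
```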

torch.ao.quantization.fx.custom_config

This module contains a few CustomConfig classes that are used in both eager mode and FX graph mode quantization.

torch.ao.quantization.quantizer

torch.ao.quantization.pt2e (quantization in pytorch 2.0 export implementation)
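
A hedged sketch of the pt2e flow, assuming the XNNPACKQuantizer shipped under torch.ao.quantization.quantizer and torch.export.export_for_training are available in this release (the toy model is an assumption):

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer, get_symmetric_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(16, 16)).eval()
example_inputs = (torch.randn(1, 16),)
exported = torch.export.export_for_training(model, example_inputs).module()
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared = prepare_pt2e(exported, quantizer)   # insert observers into the graph
prepared(*example_inputs)                      # calibrate
quantized = convert_pt2e(prepared)             # reference quantized graph
```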

torch.ao.quantization.pt2e.export_utils

model_is_exported Return True if the torch.nn.Module was exported, False otherwise (e.g. if it is a plain eager-mode module).

PT2 Export (pt2e) Numeric Debugger

generate_numeric_debug_handle Attach numeric_debug_handle_id for all nodes in the graph module of the given ExportedProgram, like conv2d, squeeze, conv1d, etc, except for placeholder.
CUSTOM_KEY String constant used as a key for custom metadata attached to nodes by the numeric debugger.
NUMERIC_DEBUG_HANDLE_KEY String constant used as the key under which a node's numeric debug handle is stored.
prepare_for_propagation_comparison Add output loggers to nodes that have a numeric_debug_handle.
extract_results_from_loggers For a given model, extract the tensors stats and related information for each debug handle.
compare_results Given two dicts mapping from debug_handle_id (int) to lists of tensors, return a map from debug_handle_id to NodeAccuracySummary that contains comparison information such as SQNR, MSE, etc.

torch.ao.nn.intrinsic

This module implements the combined (fused) modules, such as conv + relu, which can then be quantized.

ConvReLU1d This is a sequential container which calls the Conv1d and ReLU modules.
ConvReLU2d This is a sequential container which calls the Conv2d and ReLU modules.
ConvReLU3d This is a sequential container which calls the Conv3d and ReLU modules.
LinearReLU This is a sequential container which calls the Linear and ReLU modules.
ConvBn1d This is a sequential container which calls the Conv1d and BatchNorm1d modules.
ConvBn2d This is a sequential container which calls the Conv2d and BatchNorm2d modules.
ConvBn3d This is a sequential container which calls the Conv3d and BatchNorm3d modules.
ConvBnReLU1d This is a sequential container which calls the Conv1d, BatchNorm1d, and ReLU modules.
ConvBnReLU2d This is a sequential container which calls the Conv2d, BatchNorm2d, and ReLU modules.
ConvBnReLU3d This is a sequential container which calls the Conv3d, BatchNorm3d, and ReLU modules.
BNReLU2d This is a sequential container which calls the BatchNorm2d and ReLU modules.
BNReLU3d This is a sequential container which calls the BatchNorm3d and ReLU modules.

torch.ao.nn.intrinsic.qat

This module implements the versions of those fused operations needed for quantization aware training.

LinearReLU A LinearReLU module fused from Linear and ReLU modules, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvBn1d A ConvBn1d module is a module fused from Conv1d and BatchNorm1d, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvBnReLU1d A ConvBnReLU1d module is a module fused from Conv1d, BatchNorm1d and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvBn2d A ConvBn2d module is a module fused from Conv2d and BatchNorm2d, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvBnReLU2d A ConvBnReLU2d module is a module fused from Conv2d, BatchNorm2d and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvReLU2d A ConvReLU2d module is a fused module of Conv2d and ReLU, attached with FakeQuantize modules for weight for quantization aware training.
ConvBn3d A ConvBn3d module is a module fused from Conv3d and BatchNorm3d, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvBnReLU3d A ConvBnReLU3d module is a module fused from Conv3d, BatchNorm3d and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
ConvReLU3d A ConvReLU3d module is a fused module of Conv3d and ReLU, attached with FakeQuantize modules for weight for quantization aware training.
update_bn_stats
freeze_bn_stats
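
A hedged sketch of an eager-mode QAT flow that produces these fused QAT modules (the tiny model, backend string, and the elided training loop are illustrative assumptions):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig,
    fuse_modules_qat, prepare_qat, convert,
)

class TinyQATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.bn(self.conv(self.quant(x)))))

model = TinyQATModel().train()
model.qconfig = get_default_qat_qconfig("fbgemm")
fuse_modules_qat(model, [["conv", "bn", "relu"]], inplace=True)  # -> ConvBnReLU2d
prepared = prepare_qat(model)         # attach FakeQuantize modules
# ... run the usual training loop on `prepared` here ...
quantized = convert(prepared.eval())  # produce the quantized model
```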

torch.ao.nn.intrinsic.quantized

This module implements the quantized implementations of fused operations like conv + relu. No BatchNorm variants are provided, as BatchNorm is usually folded into the convolution for inference.

BNReLU2d A BNReLU2d module is a fused module of BatchNorm2d and ReLU
BNReLU3d A BNReLU3d module is a fused module of BatchNorm3d and ReLU
ConvReLU1d A ConvReLU1d module is a fused module of Conv1d and ReLU
ConvReLU2d A ConvReLU2d module is a fused module of Conv2d and ReLU
ConvReLU3d A ConvReLU3d module is a fused module of Conv3d and ReLU
LinearReLU A LinearReLU module fused from Linear and ReLU modules

torch.ao.nn.intrinsic.quantized.dynamic

This module implements the quantized dynamic implementations of fused operations like linear + relu.

LinearReLU A LinearReLU module fused from Linear and ReLU modules that can be used for dynamic quantization.

torch.ao.nn.qat

This module implements versions of the key nn modules Conv2d() andLinear() which run in FP32 but with rounding applied to simulate the effect of INT8 quantization.

Conv2d A Conv2d module attached with FakeQuantize modules for weight, used for quantization aware training.
Conv3d A Conv3d module attached with FakeQuantize modules for weight, used for quantization aware training.
Linear A linear module attached with FakeQuantize modules for weight, used for quantization aware training.

torch.ao.nn.qat.dynamic

This module implements versions of the key nn modules such as Linear() which run in FP32 but with rounding applied to simulate the effect of INT8 quantization, and which will be dynamically quantized during inference.

Linear A linear module attached with FakeQuantize modules for weight, used for dynamic quantization aware training.

torch.ao.nn.quantized

This module implements the quantized versions of the nn layers such as torch.nn.Conv2d and torch.nn.ReLU.

ReLU6 Applies the element-wise function ReLU6(x) = min(max(x0, x), q(6)), where x0 is the zero_point and q(6) is the quantized representation of the number 6.
Hardswish This is the quantized version of Hardswish.
ELU This is the quantized equivalent of ELU.
LeakyReLU This is the quantized equivalent of LeakyReLU.
Sigmoid This is the quantized equivalent of Sigmoid.
BatchNorm2d This is the quantized version of BatchNorm2d.
BatchNorm3d This is the quantized version of BatchNorm3d.
Conv1d Applies a 1D convolution over a quantized input signal composed of several quantized input planes.
Conv2d Applies a 2D convolution over a quantized input signal composed of several quantized input planes.
Conv3d Applies a 3D convolution over a quantized input signal composed of several quantized input planes.
ConvTranspose1d Applies a 1D transposed convolution operator over an input image composed of several input planes.
ConvTranspose2d Applies a 2D transposed convolution operator over an input image composed of several input planes.
ConvTranspose3d Applies a 3D transposed convolution operator over an input image composed of several input planes.
Embedding A quantized Embedding module with quantized packed weights as inputs.
EmbeddingBag A quantized EmbeddingBag module with quantized packed weights as inputs.
FloatFunctional State collector class for float operations (see the sketch after this list).
FXFloatFunctional Module to replace FloatFunctional before FX graph mode quantization, since activation_post_process will be inserted into the top level module directly.
QFunctional Wrapper class for quantized operations.
Linear A quantized linear module with quantized tensor as inputs and outputs.
LayerNorm This is the quantized version of LayerNorm.
GroupNorm This is the quantized version of GroupNorm.
InstanceNorm1d This is the quantized version of InstanceNorm1d.
InstanceNorm2d This is the quantized version of InstanceNorm2d.
InstanceNorm3d This is the quantized version of InstanceNorm3d.
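
As noted for FloatFunctional above, arithmetic ops such as addition must go through a module so they can be observed and later converted to QFunctional; a minimal sketch (the SkipAdd module name is hypothetical):

```python
import torch
from torch.ao.nn.quantized import FloatFunctional

class SkipAdd(torch.nn.Module):
    """Hypothetical module: route `x + y` through FloatFunctional so the
    addition can be observed and quantized during convert."""
    def __init__(self):
        super().__init__()
        self.skip_add = FloatFunctional()

    def forward(self, x, y):
        return self.skip_add.add(x, y)  # instead of `x + y`
```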

torch.ao.nn.quantized.functional

Functional interface (quantized).

This module implements the quantized versions of the functional layers such as torch.nn.functional.conv2d and torch.nn.functional.relu. Note: relu() supports quantized inputs.

avg_pool2d Applies 2D average-pooling operation in kH × kW regions by step size sH × sW steps.
avg_pool3d Applies 3D average-pooling operation in kD × kH × kW regions by step size sD × sH × sW steps.
adaptive_avg_pool2d Applies a 2D adaptive average pooling over a quantized input signal composed of several quantized input planes.
adaptive_avg_pool3d Applies a 3D adaptive average pooling over a quantized input signal composed of several quantized input planes.
conv1d Applies a 1D convolution over a quantized 1D input composed of several input planes.
conv2d Applies a 2D convolution over a quantized 2D input composed of several input planes.
conv3d Applies a 3D convolution over a quantized 3D input composed of several input planes.
interpolate Down/up samples the input to either the given size or the given scale_factor.
linear Applies a linear transformation to the incoming quantized data: y = xA^T + b.
max_pool1d Applies a 1D max pooling over a quantized input signal composed of several quantized input planes.
max_pool2d Applies a 2D max pooling over a quantized input signal composed of several quantized input planes.
celu Applies the quantized CELU function element-wise.
leaky_relu This is the quantized version of leaky_relu().
hardtanh This is the quantized version of hardtanh().
hardswish This is the quantized version of hardswish().
threshold Applies the quantized version of the threshold function element-wise: the output is x where x > threshold, and value otherwise.
elu This is the quantized version of elu().
hardsigmoid This is the quantized version of hardsigmoid().
clamp Applies the clamp function element-wise on a quantized input: clamp(input, min_, max_) -> Tensor.
upsample Upsamples the input to either the given size or the given scale_factor
upsample_bilinear Upsamples the input, using bilinear upsampling.
upsample_nearest Upsamples the input, using nearest neighbours' pixel values.
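
A minimal sketch of calling a couple of these functionals on an already-quantized tensor (the scale and zero point below are arbitrary):

```python
import torch
from torch.ao.nn.quantized import functional as qF

x = torch.randn(1, 3, 8, 8)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=128, dtype=torch.quint8)
out = qF.max_pool2d(qx, kernel_size=2)            # pooling of a quantized tensor
out2 = qF.adaptive_avg_pool2d(qx, output_size=1)  # adaptive average pooling
```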

torch.ao.nn.quantizable

This module implements the quantizable versions of some of the nn layers. These modules can be used in conjunction with the custom module mechanism, by providing the custom_module_config argument to both prepare and convert.

torch.ao.nn.quantized.dynamic

Dynamically quantized Linear, LSTM, LSTMCell, GRUCell, and RNNCell.

Linear A dynamic quantized linear module with floating point tensor as inputs and outputs.
LSTM A dynamic quantized LSTM module with floating point tensor as inputs and outputs.
GRU Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.
RNNCell An Elman RNN cell with tanh or ReLU non-linearity.
LSTMCell A long short-term memory (LSTM) cell.
GRUCell A gated recurrent unit (GRU) cell.
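
A minimal sketch of producing these dynamically quantized modules via the top-level quantize_dynamic API (the toy model is an assumption):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 8)).eval()
# Replace Linear with its dynamically quantized counterpart; weights are
# quantized ahead of time, activations are quantized on the fly at inference.
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(qmodel)
```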

Quantized dtypes and quantization schemes

Note that operator implementations currently only support per-channel quantization for weights of the conv and linear operators. Furthermore, the input data is mapped linearly to the quantized data and vice versa as follows:

\begin{aligned}
\text{Quantization:}\quad & Q_\text{out} = \text{clamp}(x_\text{input}/s + z,\ Q_\text{min},\ Q_\text{max}) \\
\text{Dequantization:}\quad & x_\text{out} = (Q_\text{input} - z) * s
\end{aligned}

where clamp(·) is the same as clamp(), while the scale s and zero point z are computed as described in MinMaxObserver, specifically:

\begin{aligned}
\text{if Symmetric:}\quad & s = 2 \max(|x_\text{min}|, x_\text{max}) / \left(Q_\text{max} - Q_\text{min}\right) \\
& z = \begin{cases} 0 & \text{if dtype is qint8} \\ 128 & \text{otherwise} \end{cases} \\
\text{Otherwise:}\quad & s = \left(x_\text{max} - x_\text{min}\right) / \left(Q_\text{max} - Q_\text{min}\right) \\
& z = Q_\text{min} - \text{round}(x_\text{min} / s)
\end{aligned}

where [x_min, x_max] denotes the range of the input data, while Q_min and Q_max are respectively the minimum and maximum values of the quantized dtype.

Note that the choice of s and z implies that zero is represented with no quantization error whenever zero is within the range of the input data or when symmetric quantization is being used.
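
A short worked example of the affine (asymmetric) scheme above, using torch.quantize_per_tensor with a scale and zero point computed exactly as in the formulas (the input tensor is arbitrary):

```python
import torch

x = torch.tensor([-1.0, 0.0, 0.5, 2.0])
qmin, qmax = 0, 255                         # quint8 range
s = (x.max() - x.min()) / (qmax - qmin)     # s = (x_max - x_min) / (Q_max - Q_min)
z = int(qmin - torch.round(x.min() / s))    # z = Q_min - round(x_min / s)
qx = torch.quantize_per_tensor(x, scale=float(s), zero_point=z, dtype=torch.quint8)
print(qx.int_repr())    # Q = clamp(x/s + z, Q_min, Q_max)
print(qx.dequantize())  # x_out = (Q - z) * s; note that 0.0 round-trips exactly
```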

Additional data types and quantization schemes can be implemented through the custom operator mechanism.

QAT Modules.

This package is in the process of being deprecated. Please use torch.ao.nn.qat.modules instead.

QAT Dynamic Modules.

This package is in the process of being deprecated. Please use torch.ao.nn.qat.dynamic instead.

This file is in the process of migration to torch/ao/quantization, and is kept here for compatibility while the migration process is ongoing. If you are adding a new entry/functionality, please add it to the appropriate files under torch/ao/quantization/fx/, while adding an import statement here.

Quantized Modules.

Note:

The torch.nn.quantized namespace is in the process of being deprecated. Please use torch.ao.nn.quantized instead.

Quantized Dynamic Modules.

This file is in the process of migration to torch/ao/nn/quantized/dynamic, and is kept here for compatibility while the migration process is ongoing. If you are adding a new entry/functionality, please add it to the appropriate file under torch/ao/nn/quantized/dynamic, while adding an import statement here.