conversion — Model Optimizer 0.27.1
Quantization conversion/restore utilities.
Functions
| Function | Description |
| --- | --- |
| `replace_quant_module` | Recursively replace the module with quantized module. |
| `set_quantizer_by_cfg` | Update the quantizer attributes based on the specified quant_cfg. |
| `set_quantizer_attribute` | Fine-grained adjustment of quantizer attributes by wildcard or filter function. |
| `register` | Register a quantized class for the given un-quantized original class. |
| `unregister` | Unregister the quantized class for the given un-quantized original class. |
| `set_quantizer_by_cfg_context` | Context manager for setting quantizer attributes using quant_cfg. |
register(original_cls, quantized_cls)
Register a quantized class for the given un-quantized original class.
Parameters:
- original_cls (Module) – The original un-quantized class.
- quantized_cls (Module) – The quantized class. This class should have a _setup method which initializes various quantizers called in the forward. The forward function of the quantized class should call the quantizers at the correct location.
Here is an example of defining a quantized class and registering it:

```python
import torch.nn as nn
import torch.nn.functional as F

import modelopt.torch.quantization as mtq
from modelopt.torch.quantization.nn import TensorQuantizer


class QuantLayerNorm(nn.LayerNorm):
    def __init__(self, normalized_shape):
        super().__init__(normalized_shape)
        self._setup()

    def _setup(self):
        # Method to setup the quantizers
        self.input_quantizer = TensorQuantizer()
        self.weight_quantizer = TensorQuantizer()

    def forward(self, input):
        input = self.input_quantizer(input)
        weight = self.weight_quantizer(self.weight)
        return F.layer_norm(input, self.normalized_shape, weight, self.bias, self.eps)


# Register the custom quantized module
mtq.register(original_cls=nn.LayerNorm, quantized_cls=QuantLayerNorm)
```
replace_quant_module(model, version=None)
Recursively replace modules in the model with their quantized counterparts.
Parameters:
model (Module) –
set_quantizer_attribute(quant_model, wildcard_or_filter_func, attribute, parent_class=None)
Fine-grained adjustment of quantizer attributes by wildcard or filter function.
Parameters:
- quant_model (Module) – A pytorch model
- wildcard_or_filter_func (str | Callable) – A wildcard string or a filter function. The wildcard string is matched against the quantizer module names. The quantizer modules are instances of `TensorQuantizer`. The filter function takes a quantizer module name as input and returns `True` if the quantizer should be adjusted and `False` otherwise.
- attribute (QuantizerAttributeConfig | list[QuantizerAttributeConfig] | dict[str | Callable, QuantizerAttributeConfig | list[QuantizerAttributeConfig]] | dict | list[dict]) – An instance of `QuantizerAttributeConfig`, an equivalent dictionary, or a list of these two types. If `attribute` is a list, the matched `TensorQuantizer` modules will be replaced with `SequentialQuantizer` modules having one quantizer for each attribute instance from the list. See `set_from_attribute_config()` for more details on the supported attributes and their types.
- parent_class (None | type) – (Optional) The parent class of the quantizer modules matching `wildcard_or_filter_func` which should be adjusted. If `None`, all the matching quantizer modules will be adjusted.
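To make the matching behavior concrete, the sketch below shows the two forms a `wildcard_or_filter_func` argument can take. The module names (`transformer.layers.0...`) and the `{"enable": False}` attribute dictionary are illustrative assumptions, not taken from the library's documentation; the actual calls to `mtq.set_quantizer_attribute` are shown commented out since they require a quantized model.

```python
from fnmatch import fnmatch

# Form 1: a wildcard string, matched against quantizer module names
# (fnmatch-style semantics are assumed here for illustration).
pattern = "*input_quantizer"

# Form 2: a filter function that receives a quantizer module name and
# returns True if that quantizer should be adjusted.
def is_first_layer_quantizer(name: str) -> bool:
    # Hypothetical naming scheme for a transformer model.
    return name.startswith("transformer.layers.0.")

# With a quantized model in hand, either form would be passed together with
# an attribute dictionary, e.g. to disable the matched quantizers:
# mtq.set_quantizer_attribute(model, pattern, {"enable": False})
# mtq.set_quantizer_attribute(model, is_first_layer_quantizer, {"enable": False})
```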
set_quantizer_by_cfg(quant_model, quant_cfg)
Update the quantizer attributes based on the specified quant_cfg.
quant_cfg is a dictionary mapping wildcards or filter functions to the quantizer attributes defined in `QuantizerAttributeConfig`. The wildcards or filter functions are matched against the quantizer module names. The specified quantizer attributes of the matched quantizer modules are set accordingly. The key `"default"` is a special key that sets the quantizer attributes of all the quantizers for which no other wildcard or filter function matches the quantizer module name.
In addition, the dictionary entries can also map PyTorch module class names to class-specific quantization configurations. The PyTorch modules should have a quantized equivalent.
See `set_quantizer_attribute` for more details.
Parameters:
- quant_model (Module) –
- quant_cfg (dict[str | Callable, QuantizerAttributeConfig | list[QuantizerAttributeConfig] | dict[str | Callable, QuantizerAttributeConfig | list[QuantizerAttributeConfig]]] | dict) –
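The dictionary shape described above can be sketched as follows. The specific wildcard keys and attribute values (bit widths, axes, the `"*lm_head*"` exclusion) are illustrative assumptions rather than a recommended configuration; the call applying it is commented out since it requires a quantized model.

```python
# A quant_cfg maps wildcards (or filter functions) to quantizer attributes.
# Values sketched here assume per-channel INT8 weights and per-tensor INT8
# activations; "default" covers every quantizer no other key matches.
quant_cfg = {
    "*weight_quantizer": {"num_bits": 8, "axis": 0},
    "*input_quantizer": {"num_bits": 8, "axis": None},
    "*lm_head*": {"enable": False},   # hypothetical exclusion of a module
    "default": {"enable": False},
}

# Applied to a quantized model:
# mtq.set_quantizer_by_cfg(model, quant_cfg)
```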
set_quantizer_by_cfg_context(quant_model, quant_cfg)
Context manager for setting quantizer attributes using quant_cfg.
The set attributes will be reset to the original attributes after exiting the context manager. See set_quantizer_by_cfg() for more details.
Use this context manager with caution. Changing certain attributes of the quantizer, such as the `calibrator`, can lead to unexpected behavior.
Parameters:
- quant_model (Module) –
- quant_cfg (dict[str | Callable, QuantizerAttributeConfig | list[QuantizerAttributeConfig] | dict[str | Callable, QuantizerAttributeConfig | list[QuantizerAttributeConfig]]] | dict) –
unregister(original_cls)
Unregister the quantized class for the given un-quantized original class.
Parameters:
original_cls (Module) – The original un-quantized class.