Linear — PyTorch 2.7 documentation
class torch.ao.nn.quantized.Linear(in_features, out_features, bias_=True, dtype=torch.qint8)
A quantized linear module with quantized tensors as inputs and outputs. It adopts the same interface as torch.nn.Linear; see https://pytorch.org/docs/stable/nn.html#torch.nn.Linear for documentation.
Similar to Linear, attributes are randomly initialized at module creation time and are overwritten later.
Variables
- weight (Tensor) – the non-learnable quantized weights of the module of shape (out_features, in_features).
- bias (Tensor) – the non-learnable bias of the module of shape (out_features). If bias is True, the values are initialized to zero.
- scale – scale parameter of output Quantized Tensor, type: double
- zero_point – zero_point parameter for output Quantized Tensor, type: long
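The attributes listed above can be inspected on a freshly constructed module. The following is a minimal sketch, assuming a PyTorch build with a quantized backend (fbgemm or qnnpack) available; the alias nnq and the values 0.05 and 64 are illustrative choices, not part of the original page. Note that weight and bias are read through accessor methods, while scale and zero_point are plain attributes that determine the quantization parameters of the output tensor.

```python
import torch
import torch.ao.nn.quantized as nnq  # alias chosen for this sketch

m = nnq.Linear(20, 30)

# weight and bias are held in packed form and read through accessor methods.
w = m.weight()  # quantized tensor of shape (out_features, in_features)
b = m.bias()    # float tensor of shape (out_features,)
print(w.shape, w.dtype, b.shape, b.dtype)

# scale and zero_point describe how the OUTPUT tensor is quantized.
m.scale = 0.05      # illustrative value
m.zero_point = 64   # illustrative value

x = torch.quantize_per_tensor(torch.randn(128, 20), 1.0, 0, torch.quint8)
out = m(x)
print(out.q_scale(), out.q_zero_point())
```

Since the weights are still uninitialized at this point, the output values themselves are not meaningful; only the shapes, dtypes, and output quantization parameters are.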
Examples:
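>>> import torch
>>> import torch.ao.nn.quantized as nnq
>>> m = nnq.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> # inputs must be quantized tensors (quint8 activations)
>>> input = torch.quantize_per_tensor(input, 1.0, 0, torch.quint8)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])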
classmethod from_float(mod, use_precomputed_fake_quant=False)
Create a quantized module from an observed float module.
Parameters
- mod (Module) – a float module, either produced by torch.ao.quantization utilities or provided by the user
- use_precomputed_fake_quant (bool) – if True, the module will reuse min/max values from the precomputed fake quant module.
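One way an observed float module might be produced and then passed to from_float is sketched below. It assumes the eager-mode utilities prepare and default_qconfig from torch.ao.quantization and a build with a quantized backend; the calibration data, the input scale 0.05, and the zero point 128 are illustrative values only. In typical workflows torch.ao.quantization.convert calls from_float internally, so calling it directly is mainly useful for inspection.

```python
import torch
from torch import nn
import torch.ao.nn.quantized as nnq
from torch.ao.quantization import default_qconfig, prepare

torch.manual_seed(0)

# Float module with a qconfig attached (required by from_float).
float_linear = nn.Linear(20, 30)
float_linear.qconfig = default_qconfig

# prepare() returns a copy with an output observer attached.
observed = prepare(float_linear)

# Calibrate: run representative data so the observer records min/max values.
calibration_batch = torch.randn(128, 20)
observed(calibration_batch)

# Build the quantized module from the observed float module.
qlinear = nnq.Linear.from_float(observed)

# The quantized module expects a quantized input tensor.
x = torch.quantize_per_tensor(calibration_batch, 0.05, 128, torch.quint8)
y = qlinear(x)
print(type(qlinear).__name__, y.shape, qlinear.scale, qlinear.zero_point)
```

The output scale and zero point of qlinear come from the activation observer's statistics, while the weight is quantized using the weight observer defined in the qconfig.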
classmethod from_reference(ref_qlinear, output_scale, output_zero_point)
Create a (fbgemm/qnnpack) quantized module from a reference quantized module.
Parameters