nf4_tensor — Model Optimizer 0.27.1

Implements NF4 quantization for efficient tensor storage and computation.

Classes

NF4QTensor Implements the NF4 quantization on tensors for more efficient storage or computation.

class NF4QTensor

Bases: BaseQuantizedTensor

Implements the NF4 quantization on tensors for more efficient storage or computation.

quantized_data

The quantized data stored as a packed uint8 tensor.

Type:

torch.Tensor
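Because each NF4 code is 4 bits, two codes fit in one uint8 byte. The sketch below illustrates that packed layout; the helper names `pack_nf4`/`unpack_nf4` and the high-nibble-first ordering are assumptions for illustration, not the library's actual packing scheme:

```python
def pack_nf4(codes):
    """Pack pairs of 4-bit codes (each 0..15) into single bytes.

    High nibble first; this ordering is an assumption for illustration.
    """
    assert len(codes) % 2 == 0, "NF4 packing pairs codes two per byte"
    return [(hi << 4) | lo for hi, lo in zip(codes[::2], codes[1::2])]


def unpack_nf4(packed):
    """Inverse of pack_nf4: recover the two 4-bit codes from each byte."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)    # high nibble
        codes.append(byte & 0x0F)  # low nibble
    return codes
```

This halves storage relative to one code per byte, which is why `quantized_data` holds half as many uint8 elements as the original tensor has values.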

dequantize(dtype=None, **kwarg)

Dequantize an NF4 packed tensor to a target dtype.

Parameters:

dtype (torch.dtype) – target dtype for the dequantized output.
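Conceptually, dequantization unpacks each 4-bit code, looks it up in the 16-entry NF4 codebook, and rescales it by its block's scale. The sketch below is a hypothetical illustration, not the library's implementation; the codebook values are the QLoRA NF4 levels rounded to four decimals, and the nibble order and helper name are assumptions:

```python
# Approximate NF4 codebook (QLoRA normal-float levels, rounded).
NF4_CODEBOOK = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def dequantize_nf4(packed, scales, block_size):
    """Unpack uint8 bytes into floats via codebook lookup and per-block scaling."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)    # high nibble first (ordering is an assumption)
        codes.append(byte & 0x0F)  # low nibble
    # Each run of block_size codes shares one absmax scale.
    return [NF4_CODEBOOK[code] * scales[i // block_size]
            for i, code in enumerate(codes)]
```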

classmethod double_quantization(scales, scale_block_size, num_scale_bits)

Perform double quantization on the scales.

Unlike the quantize method quantizing input data, this function quantizes float scales into int8 to further reduce memory usage of scales.

Parameters:

scales (torch.Tensor) – the per-block quantization scales to compress.

scale_block_size (int) – number of scales that share one second-level scale.

num_scale_bits (int) – bit width used when quantizing the scales (e.g. 8 for int8).
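The idea can be sketched as a second absmax quantization pass over the scales themselves. The helper below is a hypothetical illustration under simple symmetric rounding; the library's actual method (e.g. whether it subtracts a mean offset first) may differ:

```python
def double_quantize_scales(scales, scale_block_size, num_scale_bits=8):
    """Quantize float scales to signed integers, one second-level absmax
    scale per group of scale_block_size first-level scales.

    Illustrative sketch only; names and rounding policy are assumptions.
    """
    qmax = 2 ** (num_scale_bits - 1) - 1  # e.g. 127 for 8-bit signed
    q_scales, meta_scales = [], []
    for start in range(0, len(scales), scale_block_size):
        group = scales[start:start + scale_block_size]
        absmax = max(abs(s) for s in group) or 1.0  # avoid divide-by-zero
        meta_scales.append(absmax / qmax)           # second-level float scale
        q_scales.extend(round(s * qmax / absmax) for s in group)
    return q_scales, meta_scales
```

Storing int8 scales plus one float per `scale_block_size` group cuts the scale overhead roughly fourfold versus keeping every scale in float32.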

classmethod quantize(input, block_size)

Convert a tensor to a quantized format based on NF4 double quantization.

Parameters:

input (torch.Tensor) – the tensor to quantize.

block_size (int) – number of elements that share one quantization scale.

Returns:

Contains quantized data, input quantization config, and scale quantization config.

Return type:

tuple
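A minimal sketch of the forward pass: compute a per-block absmax scale, normalize the block, and snap each value to the nearest NF4 codebook level. The function name, the rounded 4-decimal codebook, and the nearest-level search are illustrative assumptions; packing the codes into uint8 and double-quantizing the scales (as the real method does) are omitted here:

```python
# Approximate NF4 codebook (QLoRA normal-float levels, rounded).
NF4_CODEBOOK = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def quantize_nf4(values, block_size):
    """Per-block absmax scaling followed by nearest-codebook-level lookup.

    Returns (codes, scales). Illustrative sketch; byte packing and scale
    double quantization are intentionally left out.
    """
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # avoid divide-by-zero
        scales.append(absmax)
        for v in block:
            normalized = v / absmax  # now in [-1, 1], the codebook's range
            codes.append(min(range(16),
                             key=lambda i: abs(NF4_CODEBOOK[i] - normalized)))
    return codes, scales
```

In the returned tuple described above, these codes would be packed two per uint8, and the float scales would themselves be compressed via `double_quantization`.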