utils — Model Optimizer 0.27.1

TensorRT Model Optimizer

Quantization utilities.

Functions

convert_quantization_axis_to_reduce_axis Convert the quantization axis to the reduce axis.
reduce_amax Compute the absolute maximum value of a tensor.
is_quantized Check if a module is quantized.
is_quantized_layer_with_weight Check if a module is quantized with weights.
is_quantized_column_parallel_linear Check if a module is a quantized column parallel linear module.
is_quantized_row_parallel_linear Check if a module is a quantized row parallel linear module.
replace_function Replace a function with a new one within a context.
export_torch_mode Context manager enabling the export mode.
get_parallel_state Get the parallel state.

convert_quantization_axis_to_reduce_axis(input, axis)

Convert the quantization axis to the reduce axis.

Parameters:

Returns:

The axis to reduce. None indicates that all dimensions should be reduced.

Return type:

list
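The mapping can be sketched in plain Python. This is a hypothetical re-implementation for illustration, not the library's actual code: for a tensor of rank ndim quantized per-channel along axis, the reduction runs over every other dimension, and axis=None (per-tensor quantization) maps to None, i.e. reduce everything.

```python
def quant_axis_to_reduce_axis(ndim, axis):
    """Sketch: map a per-channel quantization axis to the axes to reduce.

    axis=None means per-tensor quantization, so all dimensions are
    reduced (signalled by returning None, as described above).
    """
    if axis is None:
        return None
    if axis < 0:  # normalize a negative axis index
        axis += ndim
    return [d for d in range(ndim) if d != axis]

# A 4-D conv weight quantized per output channel (axis 0):
print(quant_axis_to_reduce_axis(4, 0))   # [1, 2, 3]
print(quant_axis_to_reduce_axis(2, -1))  # [0]
```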

export_torch_mode()

Context manager enabling the export mode.
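A context manager like this typically toggles a module-level flag that other code consults while the with-block is active. The following is a minimal sketch of that pattern only; the flag name and helper are assumptions, not the library's actual internals.

```python
from contextlib import contextmanager

_EXPORT_MODE = False  # hypothetical module-level flag


def in_export_mode():
    """Hypothetical helper: report whether export mode is active."""
    return _EXPORT_MODE


@contextmanager
def export_torch_mode():
    # Enable export mode for the duration of the with-block,
    # restoring the previous state even if the body raises.
    global _EXPORT_MODE
    prev = _EXPORT_MODE
    _EXPORT_MODE = True
    try:
        yield
    finally:
        _EXPORT_MODE = prev
```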

get_parallel_state(model, name=None)

Get the parallel state.

Parameters:

Return type:

ParallelState

is_quantized(module)

Check if a module is quantized.

is_quantized_column_parallel_linear(module)

Check if a module is a quantized column parallel linear module.

is_quantized_layer_with_weight(module)

Check if a module is quantized with weights.

is_quantized_row_parallel_linear(module)

Check if a module is a quantized row parallel linear module.
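Predicates like these commonly follow a duck-typing pattern: a module counts as quantized if it carries quantizer state. The sketch below only illustrates that pattern; the class and attribute names are invented for the example and are not the library's real implementation.

```python
class FakeQuantLinear:
    """Stand-in for a quantized layer; attribute names are illustrative."""

    def __init__(self):
        self.input_quantizer = object()  # pretend quantizer object
        self.weight = [1.0]


def is_quantized(module):
    # Sketch: treat any module holding a quantizer attribute as quantized.
    return hasattr(module, "input_quantizer")


def is_quantized_layer_with_weight(module):
    # Quantized AND carries a weight parameter.
    return is_quantized(module) and getattr(module, "weight", None) is not None


print(is_quantized(FakeQuantLinear()))  # True
print(is_quantized(object()))           # False
```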

reduce_amax(input, axis=None, keepdims=True, squeeze_scalar=True)

Compute the absolute maximum value of a tensor.

Reduces input along the dimensions given in axis. Unless keepdims is true, the rank of the tensor is reduced by 1 for each entry in axis. If keepdims is true, the reduced dimensions are retained with length 1.

Note

Gradient computation is disabled, as this function is never meant to be used for learning; it only computes the reduced amax.

Parameters:

Returns:

The reduced tensor.

Raises:

replace_function(package, name, new_func)

Replace a function with a new one within a context.
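The usual way to implement such a helper is to swap the attribute on entry and restore it on exit. A minimal sketch, assuming that is how this helper behaves (which the name and signature suggest, but the real implementation may differ):

```python
from contextlib import contextmanager


@contextmanager
def replace_function(package, name, new_func):
    # Swap package.name for new_func, restoring the original on exit
    # even if the with-block raises.
    original = getattr(package, name)
    setattr(package, name, new_func)
    try:
        yield
    finally:
        setattr(package, name, original)


import math

with replace_function(math, "sqrt", lambda x: 0.0):
    print(math.sqrt(9))  # patched stub returns 0.0
print(math.sqrt(9))      # restored: 3.0
```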