Layer Base Classes - NVIDIA TensorRT Standard Python API Documentation 10.12.0

ITensor

tensorrt.TensorLocation

The physical location of the data.

Members:

DEVICE : Data is stored on the device.

HOST : Data is stored on the host.

tensorrt.TensorFormat

Format of the input/output tensors.

Members:

LINEAR :

Row major linear format.

For a tensor with dimensions {N, C, H, W}, the W axis always has unit stride, and the stride of every other axis is at least the product of the next dimension and the next stride. The strides are the same as for a C array with dimensions [N][C][H][W].
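As an illustration of the stride rule above, the row-major strides can be computed like this (a plain-Python sketch, not part of the TensorRT API; the function name is hypothetical):

```python
def linear_strides(dims):
    """Row-major (LINEAR) strides: the last axis has unit stride and
    each earlier stride is the next dimension times the next stride."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = dims[i + 1] * strides[i + 1]
    return strides

# For {N, C, H, W} = {2, 3, 4, 5} the strides match a C array [2][3][4][5]:
print(linear_strides([2, 3, 4, 5]))  # [60, 20, 5, 1]
```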

CHW2 :

Two wide channel vectorized row major format.

This format is bound to FP16 and BF16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+1)/2][H][W][2], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/2][h][w][c%2].

HWC8 :

Eight channel format where C is padded to a multiple of 8.

This format is bound to FP16 and BF16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+7)/8*8], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].
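The padded channel-last mapping above can be sketched as plain-Python index arithmetic (hwc8_offset is a hypothetical helper, not a TensorRT API):

```python
def hwc8_offset(n, c, h, w, C, H, W):
    """Element offset in the HWC8 layout [N][H][W][roundUp(C, 8)],
    with (n, c, h, w) mapping to [n][h][w][c]."""
    Cp = (C + 7) // 8 * 8  # C padded up to a multiple of 8
    return ((n * H + h) * W + w) * Cp + c

# Channel 2 at (h=1, w=1) of a {1, 3, 2, 2} tensor:
print(hwc8_offset(0, 2, 1, 1, C=3, H=2, W=2))  # 26
```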

CHW4 :

Four wide channel vectorized row major format. This format is bound to INT8. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+3)/4][H][W][4], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/4][h][w][c%4].

CHW16 :

Sixteen wide channel vectorized row major format.

This format is only supported by DLA and requires FP16. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+15)/16][H][W][16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/16][h][w][c%16].

CHW32 :

Thirty-two wide channel vectorized row major format.

This format is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][H][W][32], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c/32][h][w][c%32].
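The CHW2, CHW4, CHW16, and CHW32 mappings all follow the same pattern, differing only in the vector width K. A sketch of the shared index arithmetic (a hypothetical helper, not part of TensorRT):

```python
def chw_vec_offset(n, c, h, w, C, H, W, K):
    """Element offset in the K-wide channel-vectorized layouts
    (CHW2, CHW4, CHW16, CHW32): C array [N][(C+K-1)/K][H][W][K],
    with (n, c, h, w) mapping to [n][c // K][h][w][c % K]."""
    Cb = (C + K - 1) // K  # number of K-channel blocks
    return (((n * Cb + c // K) * H + h) * W + w) * K + c % K

# Channel 2 of a {1, 3, 2, 2} tensor in CHW4 lands in block 0, lane 2:
print(chw_vec_offset(0, 2, 0, 0, C=3, H=2, W=2, K=4))  # 2
```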

DHWC8 :

Eight channel format where C is padded to a multiple of 8.

This format is bound to FP16 and BF16, and it is only available for dimensions >= 4.

For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to an array with dimensions [N][D][H][W][(C+7)/8*8], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][d][h][w][c].

CDHW32 :

Thirty-two wide channel vectorized row major format with 3 spatial dimensions.

This format is bound to FP16 and INT8. It is only available for dimensions >= 4.

For a tensor with dimensions {N, C, D, H, W}, the memory layout is equivalent to a C array with dimensions [N][(C+31)/32][D][H][W][32], with the tensor coordinates (n, c, d, h, w) mapping to array subscript [n][c/32][d][h][w][c%32].

HWC :

Non-vectorized channel-last format. This format is bound to FP32, FP16, INT8, INT64 and BF16, and is only available for dimensions >= 3.

DLA_LINEAR :

DLA planar format. Row major format. The stride for stepping along the H axis is rounded up to 64 bytes.

This format is bound to FP16/Int8 and is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to a C array with dimensions [N][C][H][roundUp(W, 64/elementSize)] where elementSize is 2 for FP16 and 1 for Int8, with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c][h][w].

DLA_HWC4 :

DLA image format; channel-last format. C can only be 1, 3, or 4. If C == 3, it is rounded up to 4. The stride for stepping along the H axis is rounded up to 32 bytes.

This format is bound to FP16/Int8 and is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, where C' is 1, 4, and 4 when C is 1, 3, and 4 respectively, the memory layout is equivalent to a C array with dimensions [N][H][roundUp(W, 32/C'/elementSize)][C'], where elementSize is 2 for FP16 and 1 for Int8. The tensor coordinates (n, c, h, w) map to array subscript [n][h][w][c].
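The roundUp arithmetic used by both DLA formats can be sketched in plain Python (the values below are illustrative assumptions, not API calls):

```python
def round_up(x, n):
    """Round x up to the next multiple of n."""
    return (x + n - 1) // n * n

# DLA_LINEAR: W is padded so the H stride is a multiple of 64 bytes.
# Illustrative values: W = 50, FP16 => elementSize = 2.
print(round_up(50, 64 // 2))       # 64 elements per row

# DLA_HWC4: the H stride is padded to 32 bytes across C' interleaved
# channels. Illustrative values: C' = 4, FP16 => elementSize = 2.
print(round_up(50, 32 // 4 // 2))  # 52 elements per row
```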

HWC16 :

Sixteen channel format where C is padded to a multiple of 16. This format is bound to FP16/INT8/FP8. It is only available for dimensions >= 3.

For a tensor with dimensions {N, C, H, W}, the memory layout is equivalent to the array with dimensions [N][H][W][(C+15)/16*16], with the tensor coordinates (n, c, h, w) mapping to array subscript [n][h][w][c].

DHWC :

Non-vectorized channel-last format. This format is bound to FP32. It is only available for dimensions >= 4.

class tensorrt.ITensor

A tensor in an INetworkDefinition.

Variables:

get_dimension_name(self: tensorrt.tensorrt.ITensor, index: int) → str

Get the name of an input dimension.

Parameters:

index – index of the dimension.

Returns:

The name of the dimension, or null if the dimension is unnamed.

reset_dynamic_range(self: tensorrt.tensorrt.ITensor) → None

[DEPRECATED] Deprecated in TensorRT 10.1. Superseded by explicit quantization. Undo the effect of setting the dynamic range.

set_dimension_name(self: tensorrt.tensorrt.ITensor, index: int, name: str) → None

Name a dimension of an input tensor.

Associate a runtime dimension of an input tensor with a symbolic name. Dimensions with the same non-empty name must be equal at runtime. Knowing this equality for runtime dimensions may help the TensorRT optimizer. Both runtime and build-time dimensions can be named. If the function is called again, with the same index, it will overwrite the previous name. If None is passed as name, it will clear the name of the dimension.

For example, set_dimension_name(0, "n") associates the symbolic name "n" with the leading dimension.

Parameters:

index – The index of the dimension.

name – The name to associate with the dimension, or None to clear it.

set_dynamic_range(self: tensorrt.tensorrt.ITensor, min: float, max: float) → bool

[DEPRECATED] Deprecated in TensorRT 10.1. Superseded by explicit quantization. Set the dynamic range of the tensor. NOTE: it is suggested to use tensor.dynamic_range = (min, max) instead.

Parameters:

min – The minimum of the dynamic range.

max – The maximum of the dynamic range.

Returns:

True if the range was set successfully; False otherwise.
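For context, implicit quantization derives a symmetric INT8 scale from the dynamic range, so the larger absolute bound maps to 127. A minimal sketch of that arithmetic in plain Python (hypothetical helpers, not TensorRT's implementation):

```python
def int8_scale(dyn_min, dyn_max):
    """Symmetric INT8 scale: the larger absolute bound maps to 127."""
    return max(abs(dyn_min), abs(dyn_max)) / 127.0

def quantize(x, scale):
    """Round-to-nearest INT8 quantization with clamping."""
    return max(-128, min(127, round(x / scale)))

scale = int8_scale(-6.0, 6.0)
print(quantize(6.0, scale))    # 127
print(quantize(100.0, scale))  # clamped to 127
```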

ILayer

tensorrt.LayerType

Type of Layer

Members:

CONVOLUTION : Convolution layer

GRID_SAMPLE : Grid sample layer

NMS : NMS layer

ACTIVATION : Activation layer

POOLING : Pooling layer

LRN : LRN layer

SCALE : Scale layer

SOFTMAX : Softmax layer

DECONVOLUTION : Deconvolution layer

CONCATENATION : Concatenation layer

ELEMENTWISE : Elementwise layer

PLUGIN : Plugin layer

UNARY : Unary layer

PADDING : Padding layer

SHUFFLE : Shuffle layer

REDUCE : Reduce layer

TOPK : TopK layer

GATHER : Gather layer

MATRIX_MULTIPLY : Matrix multiply layer

RAGGED_SOFTMAX : Ragged softmax layer

CONSTANT : Constant layer

IDENTITY : Identity layer

CAST : Cast layer

PLUGIN_V2 : PluginV2 layer

SLICE : Slice layer

SHAPE : Shape layer

PARAMETRIC_RELU : Parametric ReLU layer

RESIZE : Resize layer

TRIP_LIMIT : Loop Trip limit layer

RECURRENCE : Loop Recurrence layer

ITERATOR : Loop Iterator layer

LOOP_OUTPUT : Loop output layer

SELECT : Select layer

ASSERTION : Assertion layer

FILL : Fill layer

QUANTIZE : Quantize layer

DEQUANTIZE : Dequantize layer

CONDITION : If-conditional Condition layer

CONDITIONAL_INPUT : If-conditional input layer

CONDITIONAL_OUTPUT : If-conditional output layer

SCATTER : Scatter layer

EINSUM : Einsum layer

ONE_HOT : OneHot layer

NON_ZERO : NonZero layer

REVERSE_SEQUENCE : ReverseSequence layer

NORMALIZATION : Normalization layer

PLUGIN_V3 : PluginV3 layer

SQUEEZE : Squeeze layer

UNSQUEEZE : Unsqueeze layer

CUMULATIVE : Cumulative layer

DYNAMIC_QUANTIZE : DynamicQuantize layer

class tensorrt.ILayer

Base class for all layer classes in an INetworkDefinition.

Variables:

metadata – str The per-layer metadata.

get_input(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.ITensor

Get the layer input corresponding to the given index.

Parameters:

index – The index of the input tensor.

Returns:

The input tensor, or None if the index is out of range.

get_output(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.ITensor

Get the layer output corresponding to the given index.

Parameters:

index – The index of the output tensor.

Returns:

The output tensor, or None if the index is out of range.

get_output_type(self: tensorrt.tensorrt.ILayer, index: int) → tensorrt.tensorrt.DataType

Get the output type of the layer.

Parameters:

index – The index of the output tensor.

Returns:

The output precision. Default: DataType.FLOAT.

output_type_is_set(self: tensorrt.tensorrt.ILayer, index: int) → bool

[DEPRECATED] Deprecated in TensorRT 10.12. Superseded by strong typing. Whether the output type has been set for this layer.

Parameters:

index – The index of the output.

Returns:

Whether the output type has been explicitly set.

reset_output_type(self: tensorrt.tensorrt.ILayer, index: int) → None

[DEPRECATED] Deprecated in TensorRT 10.12. Superseded by strong typing. Reset output type of this layer.

Parameters:

index – The index of the output.

reset_precision(self: tensorrt.tensorrt.ILayer) → None

[DEPRECATED] Deprecated in TensorRT 10.12. Superseded by strong typing. Reset the computation precision of the layer.

set_input(self: tensorrt.tensorrt.ILayer, index: int, tensor: tensorrt.tensorrt.ITensor) → None

Set the layer input corresponding to the given index.

Parameters:

index – The index of the input tensor.

tensor – The input tensor.

set_output_type(self: tensorrt.tensorrt.ILayer, index: int, dtype: tensorrt.tensorrt.DataType) → None

[DEPRECATED] Deprecated in TensorRT 10.12. Superseded by strong typing. Constrain the layer to generate output data with the given type. Note that this method cannot be used to set the data type of the second output tensor of the TopK layer, which is always int32.

Parameters:

index – The index of the output tensor.

dtype – The data type of the output.