fix: Properly cast intermediate Int8 tensors to TensorRT Engines in Fallback by gs-olive · Pull Request #1549 · pytorch/TensorRT

gs-olive

Description

Error displayed when passing Int8 inputs to non-quantized TRT Engine:

ERROR: [Torch-TensorRT TorchScript Conversion Context] - 4: input_0: input/output with DataType Int8 in network without Q/DQ layers must have dynamic range set when no calibrator is used.

ERROR: [Torch-TensorRT TorchScript Conversion Context] - 4: [network.cpp::validate::2772] Error Code 4: Internal Error (DataType does not match TensorFormats.)

ERROR: [Torch-TensorRT TorchScript Conversion Context] - 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )

With this PR, GPT-2 now compiles and runs inference successfully.
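
The first error above shows TensorRT rejecting an Int8 network input because the network has no Q/DQ layers, no calibrator, and no dynamic range. One way to picture the fix is as a per-input decision made before an intermediate tensor is handed to a TensorRT segment. The sketch below is a simplified, library-free model of that decision, not the code from this PR. `DType`, `CastOptions`, `cast_int8_inputs`, and `engine_input_cast` are hypothetical names, and widening Int8 to Int32 is an assumption, since the description does not state the target type.

```cpp
#include <optional>

// Simplified stand-in for at::ScalarType; only the cases discussed here.
enum class DType { Float, Half, Int32, Int64, Int8 };

// Hypothetical options. `truncate_long_and_double` mirrors the flag seen in
// the reviewed snippet; `cast_int8_inputs` is an assumed name.
struct CastOptions {
  bool truncate_long_and_double = true;
  bool cast_int8_inputs = true;
};

// Returns the dtype an intermediate tensor should be cast to before entering
// a non-quantized TensorRT segment, or std::nullopt when no cast is needed.
constexpr std::optional<DType> engine_input_cast(DType t, const CastOptions& opts) {
  if (t == DType::Int64 && opts.truncate_long_and_double) {
    return DType::Int32;  // Int64 is truncated to Int32 when the flag is set
  }
  if (t == DType::Int8 && opts.cast_int8_inputs) {
    return DType::Int32;  // assumption: widen Int8 to a type the engine accepts
  }
  return std::nullopt;
}
```

Under these assumptions, `engine_input_cast(DType::Int8, CastOptions{})` yields `DType::Int32`, while turning off the hypothetical `cast_int8_inputs` flag yields no cast and leaves the Int8 tensor to hit the error shown above.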

Fixes #1455

Type of change

Checklist:

gs-olive changed the title from "fix: Properly cast Int8 inputs to TensorRT Engines in Fallback" to "fix: Properly cast intermediate Int8 tensors to TensorRT Engines in Fallback"

Dec 21, 2022

peri044

Comment on lines 233 to 245

    if (partitioning_info.truncate_long_and_double) {
      for (size_t i = 0; i < seg_block.inputs().size(); ++i) {
        if (ivalues_maps[seg_block.raw_inputs()[i]].isTensor()) {
          auto cur_ivalue = ivalues_maps[seg_block.raw_inputs()[i]];
          at::ScalarType t = cur_ivalue.toTensor().scalar_type();
          if (t == at::kLong) {
            // we add a cast operation to cast the type to Int64
            auto cast_node = createCastNode(seg_block, i, true, target_device);
            seg_block.g()->prependNode(cast_node);
            seg_block.inputs()[i]->replaceAllUsesAfterNodeWith(cast_node, cast_node->outputs()[0]);
          }
        }
      }

Are these just linter formatting changes?

I manually made the formatting changes to reduce redundancy in the if statements, but they should be functionally equivalent to the previous version.
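
The previous version of the snippet is not shown in this thread, so the following is only an illustration of the kind of refactor described: testing a loop-invariant flag once outside the loop instead of on every iteration. It is a library-free sketch with hypothetical names (`InputKind`, `count_casts_nested`, `count_casts_hoisted`) that counts how many inputs would receive a cast node in each form.

```cpp
#include <array>
#include <cstddef>

// Hypothetical classification of a segment input for this illustration.
enum class InputKind { NonTensor, Tensor, LongTensor };

// One possible "before" shape: the flag is re-tested for every input.
template <std::size_t N>
constexpr std::size_t count_casts_nested(const std::array<InputKind, N>& inputs, bool truncate) {
  std::size_t casts = 0;
  for (std::size_t i = 0; i < N; ++i) {
    if (inputs[i] != InputKind::NonTensor) {
      if (truncate) {
        if (inputs[i] == InputKind::LongTensor) {
          ++casts;
        }
      }
    }
  }
  return casts;
}

// Refactored shape: the loop-invariant flag is tested once, outside the loop.
template <std::size_t N>
constexpr std::size_t count_casts_hoisted(const std::array<InputKind, N>& inputs, bool truncate) {
  std::size_t casts = 0;
  if (truncate) {
    for (std::size_t i = 0; i < N; ++i) {
      if (inputs[i] == InputKind::LongTensor) {
        ++casts;
      }
    }
  }
  return casts;
}
```

Because the flag does not change during the loop, both functions return the same count for any input list and either flag value, which is the sense in which such a change is functionally equivalent.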

peri044

LGTM