fix: Issue in non-Tensor Input Resolution by gs-olive · Pull Request #1617 · pytorch/TensorRT (original) (raw)
Conversation
Description
- In certain TensorRT-executed blocks, multiple non-Tensor inputs originate from the same parent Tensor in another Torch block, so a single pass of `resolveTRTNonTensorInputs` does not resolve all non-Tensor inputs, causing failures at compile time
- The issue was traced to the use of `prim::Loop` in the TensorRT block and is fixed in PR fix: fix the prim::Loop fallback issue #1691
- Add a function to `SegmentedBlock` to get the ID of a block, so re-inserted blocks retain the same ID as the block being replaced
- Add a regression test case to elicit the behavior
Failure Case
Before Non-Tensor Input Resolution
```
DEBUG: [Torch-TensorRT - Debug Build] - Finalizing in progress TensorRT block
DEBUG: [Torch-TensorRT - Debug Build] - Segment Block @2: Target: TensorRT
Graph: graph(%1 : int,
      %9 : int[],
      %12 : Tensor):
```
After 1 Round of Non-Tensor Input Resolution
```
GRAPH: [Torch-TensorRT - Debug Build] - Running shape analysis on block Segment Block @2: Target: TensorRT
Graph: graph(%1 : int[],
      %3 : Tensor):
```
After 2 Rounds of Non-Tensor Input Resolution
```
INFO: [Torch-TensorRT - Debug Build] - Segment Block @2: Target: TensorRT
Graph: graph(%1 : Tensor):
```
Type of change
- Bug fix (non-breaking change which fixes an issue)
Checklist:
- [x] My code follows the style guidelines of this project (You can use the linters)
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas and hacks
- [x] I have made corresponding changes to the documentation
- [x] I have added tests to verify my fix or my feature
- [x] New and existing unit tests pass locally with my changes
- [x] I have added the relevant labels to my PR so that relevant reviewers are notified
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Expected this PR to make it into the 23.03 release
Actually, this does not make sense.
First, even if two non-Tensor inputs come from the same node, that node will be added to the segmented block; there is no need to run multiple passes.
Second, I suspect a single pass fails because there is a Loop block in the TensorRT segment, which could introduce undefined behavior. The real issue here is the Loop block; it shouldn't be in the TensorRT segment at all. In our last refactoring of the partitioning code, we aggressively set all nodes to run in TensorRT, so LOOP nodes default to running in TensorRT, which can cause issues like this one.
- In certain TensorRT-executed blocks, multiple non-Tensor inputs originate from the same parent Tensor in another Torch block, so a single pass of `resolveTRTNonTensorInputs` does not resolve all non-Tensor inputs, causing failures at compile time
- Add a do-while loop to ensure all input dependencies are resolved by running the dependency-resolution algorithm multiple times, if necessary
- Add a function to `SegmentedBlock` to get the ID of a block, so re-inserted blocks retain the same ID as the block being replaced
- Added a test case in partitioning to elicit the bug arising from multiple non-Tensor inputs to a TensorRT block
gs-olive changed the base branch from main to fix_loop_fallback
Comment on lines +289 to +290
```cpp
cur_partitioned_block[i] =
    SegmentedBlock(cur_partitioned_block[i].get_id(), SegmentedBlock::kTensorRT, dependency_nodes);
```
Added to ensure the block IDs of segmented blocks stay in order despite resolution of non-Tensor inputs.
LGTM, please squash your commits then merge.