fix: Automatically send truncated long ints to cuda at shape analysis time by gs-olive · Pull Request #1541 · pytorch/TensorRT

Description

Uses schema

```
Tensor.to(device : Device, dtype : int, non_blocking : bool=False, copy : bool=False, memory_format : Optional[int]) -> Tensor
```

Instead of

```
Tensor.to(dtype : int, non_blocking : bool=False, copy : bool=False, memory_format : Optional[int]) -> Tensor
```

This switch was made to ensure that truncated tensors are placed on the GPU regardless of their origin, avoiding the need for an additional lowering pass to handle this case. Since an `aten::to` operation is already being inserted, it is a natural point to also set the correct tensor device (GPU).
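As a rough eager-mode sketch of the difference between the two overloads (the PR itself inserts the `aten::to` node into the TorchScript graph during shape analysis, not via eager calls): the dtype-only overload truncates in place on the tensor's current device, while the device-and-dtype overload truncates and moves in one step. The fallback to CPU below is only so the snippet runs without a GPU.

```python
import torch

# A long-int tensor as it might appear before truncation.
x = torch.tensor([7], dtype=torch.int64)

# Target device for the sketch; the PR always targets CUDA, but we fall
# back to CPU here so the example runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Old schema: dtype-only overload; result stays on the source device.
y_old = x.to(torch.int32)

# New schema: device + dtype overload; result is truncated AND placed on
# the target device in a single aten::to call.
y_new = x.to(device, torch.int32)

print(y_old.device.type, y_new.dtype)
```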

Type of change

Checklist: