[torch.compile] Stop assuming 32 bit indexing by zou3519 · Pull Request #33113 · vllm-project/vllm

Jan 26, 2026

We ran into errors with this internally. I previously thought this flag meant we assume the number of tokens fits in 32 bits, but it actually asserts that every Tensor.numel fits in 32 bits, which is not always true.

We should be able to infer this automatically, but until then, stop assuming that Tensor.numel fits in 32 bits.
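To illustrate the distinction, here is a minimal sketch in plain Python (no torch required). The shape and the names `num_tokens`, `vocab_size`, and `fits_in_int32` are hypothetical and chosen only for illustration; they are not taken from the vLLM code. It shows a token count that fits comfortably in 32 bits while the element count of a tensor built from it does not.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def fits_in_int32(shape):
    """Return True if the element count (numel) of `shape` fits in a signed 32-bit int."""
    return math.prod(shape) <= INT32_MAX


# Hypothetical sizes, for illustration only.
num_tokens = 16_384
vocab_size = 262_144
logits_shape = (num_tokens, vocab_size)

# The token count alone is tiny compared to the int32 limit...
print(num_tokens <= INT32_MAX)      # True
# ...but numel = 16384 * 262144 = 2**32 overflows it.
print(math.prod(logits_shape))      # 4294967296
print(fits_in_int32(logits_shape))  # False
```

In other words, a bound on the number of tokens says nothing about whether every intermediate tensor can be addressed with 32-bit indices.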

Signed-off-by: Richard Zou <zou3519@gmail.com>

apd10 pushed a commit to apd10/vllm that referenced this pull request

Jan 31, 2026

mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request

May 10, 2026

my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request

May 15, 2026

0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request

May 19, 2026
