[Bugfix][ROCm] fix the power of 2 exception from triton_unified_attention.py when running llama4 models and unit test fix by hongxiayang · Pull Request #18100 · vllm-project/vllm

@hongxiayang changed the title from "[Bugfix] fix the power of 2 exception when running llama4 model in tr…" to "[Bugfix] fix the power of 2 exception when running llama4 model in triton_unified_attention.py"

May 13, 2025

@hongxiayang changed the title from "[Bugfix] fix the power of 2 exception when running llama4 model in triton_unified_attention.py" to "[Bugfix] fix the power of 2 exception from triton_unified_attention.py when running llama4 models"

May 13, 2025

tdoublep

This was referenced

May 15, 2025

@hongxiayang

…iton_unified_attention.py

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@tjtanaa @hongxiayang

Signed-off-by: tjtanaa tunjian.tan@embeddedllm.com

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang added the "ready" label (ONLY add when PR is ready to merge/full CI is needed)

May 22, 2025

@hongxiayang

Signed-off-by: Hongxia Yang hongxia.yang@amd.com

@hongxiayang changed the title from "[Bugfix] fix the power of 2 exception from triton_unified_attention.py when running llama4 models" to "[Bugfix][ROCm] fix the power of 2 exception from triton_unified_attention.py when running llama4 models and unit test fix"

May 28, 2025

tdoublep

SageMoore

houseroad

ProExpertProg

amitm02 pushed a commit to amitm02/vllm that referenced this pull request

Jun 1, 2025

…tion.py when running llama4 models and unit test fix (vllm-project#18100)

Signed-off-by: Hongxia Yang hongxia.yang@amd.com
Signed-off-by: tjtanaa tunjian.tan@embeddedllm.com
Co-authored-by: tjtanaa tunjian.tan@embeddedllm.com
Signed-off-by: amit amit.man@gmail.com

bringlein added a commit to bringlein/vllm that referenced this pull request

Jul 10, 2025

@bringlein

Signed-off-by: Burkhard Ringlein ngl@zurich.ibm.com

0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request

May 19, 2026

@hongxiayang @tjtanaa

…tion.py when running llama4 models and unit test fix (vllm-project#18100)

Signed-off-by: Hongxia Yang hongxia.yang@amd.com
Signed-off-by: tjtanaa tunjian.tan@embeddedllm.com
Co-authored-by: tjtanaa tunjian.tan@embeddedllm.com
