Enable bitsandbytes quantization on AMD GPUs that use warp size 32 by sstamenk · Pull Request #27307 · vllm-project/vllm
Bot added the rocm label (Related to AMD ROCm)
sstamenk marked this pull request as ready for review
Signed-off-by: sstamenk <strahinja.stamenkovic@amd.com>
Label: ONLY add when PR is ready to merge/full CI is needed
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request
0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request