[Hardware][NVIDIA] FP4 MoE kernel optimization by dubcyfor3 · Pull Request #19110 · vllm-project/vllm

[gemini-code-assist[bot]](/apps/gemini-code-assist)

@dubcyfor3

Signed-off-by: Chiyue Wei <chiyuew@nvidia.com>

mgoin

@mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Jun 4, 2025

@mgoin enabled auto-merge (squash) on June 4, 2025 23:45

leo-li-opus pushed a commit to leo-li-opus/vllm that referenced this pull request on Jul 22, 2025

@dubcyfor3 @leo-li-opus

Signed-off-by: Chiyue Wei <chiyuew@nvidia.com>
Co-authored-by: Chiyue Wei <chiyuew@nvidia.com>

0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request on May 19, 2026

@dubcyfor3

Signed-off-by: Chiyue Wei <chiyuew@nvidia.com>
Co-authored-by: Chiyue Wei <chiyuew@nvidia.com>
