[Misc] Add missing _Backend enums by NickLucche · Pull Request #19081 · vllm-project/vllm
At least two MLA backend enums are missing from _Backend, which causes the following script to print None:
from vllm import LLM
from vllm.attention.selector import backend_name_to_enum, get_attn_backend
# Some MLA model
llm = LLM(model="deepseek-ai/DeepSeek-V2-Lite", trust_remote_code=True)
backend = get_attn_backend(
    llm.llm_engine.model_config.get_head_size(),
    llm.llm_engine.model_config.dtype,
    llm.llm_engine.cache_config.cache_dtype,
    llm.llm_engine.cache_config.block_size,
    llm.llm_engine.cache_config.is_attention_free,
    False,
    use_mla=True,
)
backend = backend_name_to_enum(backend.get_name())
print(backend)
Pre-PR: None
Post-PR: _Backend.FLASHMLA_VLLM_V1 or _Backend.TRITON_MLA_VLLM_V1, depending on whether FA or FlashInfer is used.
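
The None result comes from a name-to-enum lookup that has no matching member. The sketch below reproduces that failure mode with a stand-in enum. `DemoBackend` and `demo_name_to_enum` are hypothetical names made up for illustration and are not vLLM's actual `_Backend` or `backend_name_to_enum`. It assumes the lookup is a plain membership check on the enum's member names.

```python
import enum
from typing import Optional


class DemoBackend(enum.Enum):
    """Hypothetical stand-in for a backend enum (not vLLM's _Backend)."""
    FLASH_ATTN = enum.auto()
    TRITON_MLA = enum.auto()
    # TRITON_MLA_VLLM_V1 is deliberately absent to mimic the pre-PR state.


def demo_name_to_enum(name: str) -> Optional[DemoBackend]:
    """Return the member whose name matches, or None if there is none."""
    # Enum.__members__ is a read-only mapping of member name -> member.
    return DemoBackend.__members__.get(name)


# A name that has a member resolves to that member.
print(demo_name_to_enum("TRITON_MLA"))
# A name reported by a backend class but missing from the enum yields None.
print(demo_name_to_enum("TRITON_MLA_VLLM_V1"))
```

Under this assumption, adding the missing member to the enum is enough for the second lookup to succeed; the lookup function itself does not need to change.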