Supported Models | OpenVINO GenAI

Models Compatibility

Other models with similar architectures may also work successfully even if not explicitly validated. Consider testing any unlisted models to verify compatibility with your specific use case.

Large Language Models (LLMs)

LoRA Support

LLM pipeline supports LoRA adapters.
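
A minimal sketch of attaching a LoRA adapter to the LLM pipeline, assuming a model already exported to OpenVINO IR and an adapter in safetensors format (both paths below are placeholders):

```python
import openvino_genai as ov_genai

# Placeholder paths: an OpenVINO IR directory produced by optimum-cli
# and a LoRA adapter (safetensors) trained for the same base model.
adapter = ov_genai.Adapter("adapter_model.safetensors")
pipe = ov_genai.LLMPipeline(
    "TinyLlama-1.1B-Chat-v1.0-ov",
    "CPU",
    adapters=ov_genai.AdapterConfig(adapter),
)

# Adapters registered in the constructor can also be enabled/blended
# per generate() call via the adapters argument.
print(pipe.generate("What is OpenVINO?", max_new_tokens=64))
```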

| Architecture | Models | Example HuggingFace Models |
|---|---|---|
| AquilaModel | Aquila | BAAI/Aquila-7B, BAAI/AquilaChat-7B, BAAI/Aquila2-7B, BAAI/AquilaChat2-7B |
| ArcticForCausalLM | Snowflake | Snowflake/snowflake-arctic-instruct, Snowflake/snowflake-arctic-base |
| BaichuanForCausalLM | Baichuan2 | baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Chat |
| BloomForCausalLM | Bloom | bigscience/bloom-560m, bigscience/bloom-1b1, bigscience/bloom-1b7, bigscience/bloom-3b, bigscience/bloom-7b1 |
| | Bloomz | bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-1b7, bigscience/bloomz-3b, bigscience/bloomz-7b1 |
| ChatGLMModel | ChatGLM | THUDM/chatglm2-6b, THUDM/chatglm3-6b, THUDM/glm-4-9b, THUDM/glm-4-9b-chat |
| CodeGenForCausalLM | CodeGen | Salesforce/codegen-350m-multi, Salesforce/codegen-2B-multi, Salesforce/codegen-6B-multi, Salesforce/codegen-16B-multi, Salesforce/codegen-350m-mono, Salesforce/codegen-2B-mono, Salesforce/codegen-6B-mono, Salesforce/codegen-16B-mono, Salesforce/codegen2-1B_P, Salesforce/codegen2-3_7B_P, Salesforce/codegen2-7B_P, Salesforce/codegen2-16B_P |
| CohereForCausalLM | Aya | CohereLabs/aya-23-8B, CohereLabs/aya-expanse-8b, CohereLabs/aya-23-35B |
| | C4AI Command R | CohereLabs/c4ai-command-r7b-12-2024, CohereLabs/c4ai-command-r-v01 |
| DbrxForCausalLM | DBRX | databricks/dbrx-instruct, databricks/dbrx-base |
| DeciLMForCausalLM | DeciLM | Deci/DeciLM-7B, Deci/DeciLM-7B-instruct |
| DeepseekForCausalLM | DeepSeek-MoE | deepseek-ai/deepseek-moe-16b-base, deepseek-ai/deepseek-moe-16b-chat |
| DeepseekV2ForCausalLM | DeepSeekV2 | deepseek-ai/DeepSeek-V2-Lite, deepseek-ai/DeepSeek-V2-Lite-Chat, deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct |
| DeepseekV3ForCausalLM | DeepSeekV3 | deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Zero |
| ExaoneForCausalLM | Exaone | LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct, LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct, LGAI-EXAONE/EXAONE-3.5-32B-Instruct, LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct |
| FalconForCausalLM | Falcon | tiiuae/falcon-11B, tiiuae/falcon-7b, tiiuae/falcon-7b-instruct, tiiuae/falcon-40b, tiiuae/falcon-40b-instruct |
| GemmaForCausalLM | Gemma | google/gemma-2b, google/gemma-2b-it, google/gemma-1.1-2b-it, google/codegemma-2b, google/codegemma-1.1-2b, google/gemma-7b, google/gemma-7b-it, google/gemma-1.1-7b-it, google/codegemma-7b, google/codegemma-7b-it, google/codegemma-1.1-7b-it |
| Gemma2ForCausalLM | Gemma2 | google/gemma-2-2b, google/gemma-2-2b-it, google/gemma-2-9b, google/gemma-2-9b-it, google/gemma-2-27b, google/gemma-2-27b-it |
| Gemma3ForCausalLM | Gemma3 | google/gemma-3-270m, google/gemma-3-270m-it, google/gemma-3-1b-it, google/gemma-3-1b-pt |
| GlmForCausalLM | GLM | THUDM/glm-edge-1.5b-chat, THUDM/glm-edge-4b-chat, THUDM/glm-4-9b-hf, THUDM/glm-4-9b-chat-hf, THUDM/glm-4-9b-chat-1m-hf |
| GPT2LMHeadModel | GPT2 | openai-community/gpt2, openai-community/gpt2-medium, openai-community/gpt2-large, openai-community/gpt2-xl, distilbert/distilgpt2 |
| | CodeParrot | codeparrot/codeparrot-small, codeparrot/codeparrot-small-code-to-text, codeparrot/codeparrot-small-text-to-code, codeparrot/codeparrot-small-multi, codeparrot/codeparrot |
| GPTBigCodeForCausalLM | StarCoder | bigcode/starcoderbase-1b, bigcode/starcoderbase-3b, bigcode/starcoderbase-7b, bigcode/starcoderbase, bigcode/starcoder, bigcode/octocoder, HuggingFaceH4/starchat-alpha, HuggingFaceH4/starchat-beta |
| GPTJForCausalLM | GPT-J | EleutherAI/gpt-j-6b, crumb/Instruct-GPT-J |
| GPTNeoForCausalLM | GPT Neo | EleutherAI/gpt-neo-1.3B, EleutherAI/gpt-neo-2.7B |
| GPTNeoXForCausalLM | GPT NeoX | EleutherAI/gpt-neox-20b |
| | Dolly | databricks/dolly-v2-3b, databricks/dolly-v2-7b, databricks/dolly-v2-12b |
| | RedPajama | ikala/redpajama-3b-chat, togethercomputer/RedPajama-INCITE-Chat-3B-v1, togethercomputer/RedPajama-INCITE-Instruct-3B-v1, togethercomputer/RedPajama-INCITE-7B-Chat, togethercomputer/RedPajama-INCITE-7B-Instruct |
| GPTNeoXJapaneseForCausalLM | GPT NeoX Japanese | abeja/gpt-neox-japanese-2.7b |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b |
| GraniteForCausalLM | Granite | ibm-granite/granite-3.2-2b-instruct, ibm-granite/granite-3.2-8b-instruct, ibm-granite/granite-3.1-2b-instruct, ibm-granite/granite-3.1-8b-instruct, ibm-granite/granite-3.0-2b-instruct, ibm-granite/granite-3.0-8b-instruct |
| GraniteMoeForCausalLM | GraniteMoE | ibm-granite/granite-3.1-1b-a400m-instruct, ibm-granite/granite-3.1-3b-a800m-instruct, ibm-granite/granite-3.0-1b-a400m-instruct, ibm-granite/granite-3.0-3b-a800m-instruct |
| InternLMForCausalLM | InternLM | internlm/internlm-chat-7b, internlm/internlm-7b |
| InternLM2ForCausalLM | InternLM2 | internlm/internlm2-chat-1_8b, internlm/internlm2-1_8b, internlm/internlm2-chat-7b, internlm/internlm2-7b, internlm/internlm2-chat-20b, internlm/internlm2-20b, internlm/internlm2_5-1_8b-chat, internlm/internlm2_5-1_8b, internlm/internlm2_5-7b-chat, internlm/internlm2_5-7b, internlm/internlm2_5-20b-chat, internlm/internlm2_5-20b |
| JAISLMHeadModel | Jais | inceptionai/jais-13b-chat, inceptionai/jais-13b |
| LlamaForCausalLM | Llama 3 | meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-3.1-8B, meta-llama/Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Llama-3.3-70B-Instruct, meta-llama/Llama-3.1-70B, meta-llama/Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, deepseek-ai/DeepSeek-R1-Distill-Llama-70B |
| | Llama 2 | meta-llama/Llama-2-13b-chat-hf, meta-llama/Llama-2-13b-hf, meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-7b-hf, meta-llama/Llama-2-70b-chat-hf, meta-llama/Llama-2-70b-hf, microsoft/Llama2-7b-WhoIsHarryPotter |
| | Falcon3 | tiiuae/Falcon3-1B-Instruct, tiiuae/Falcon3-1B-Base, tiiuae/Falcon3-3B-Instruct, tiiuae/Falcon3-3B-Base, tiiuae/Falcon3-7B-Instruct, tiiuae/Falcon3-7B-Base, tiiuae/Falcon3-10B-Instruct, tiiuae/Falcon3-10B-Base |
| | OpenLLaMA | openlm-research/open_llama_13b, openlm-research/open_llama_3b, openlm-research/open_llama_3b_v2, openlm-research/open_llama_7b, openlm-research/open_llama_7b_v2 |
| | TinyLlama | TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
| MPTForCausalLM | MPT | mosaicml/mpt-7b, mosaicml/mpt-7b-instruct, mosaicml/mpt-7b-chat, mosaicml/mpt-30b, mosaicml/mpt-30b-instruct, mosaicml/mpt-30b-chat |
| MiniCPMForCausalLM | MiniCPM | openbmb/MiniCPM-1B-sft-bf16, openbmb/MiniCPM-2B-dpo-fp16, openbmb/MiniCPM-2B-sft-fp32, openbmb/MiniCPM-2B-dpo-fp32, openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM4-0.5B, openbmb/MiniCPM4-8B |
| MiniCPM3ForCausalLM | MiniCPM3 | openbmb/MiniCPM3-4B |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1, mistralai/Mistral-7B-Instruct-v0.2, mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-Nemo-Instruct-2407, mistralai/Mistral-Nemo-Base-2407, mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-v0.3 |
| | Notus | argilla/notus-7b-v1 |
| | Zephyr | HuggingFaceH4/zephyr-7b-beta |
| | Neural Chat | Intel/neural-chat-7b-v3-3, Intel/neural-chat-7b-v3-2, Intel/neural-chat-7b-v3-1, Intel/neural-chat-7b-v3 |
| MixtralForCausalLM | Mixtral | mistralai/Mixtral-8x7B-Instruct-v0.1, mistralai/Mixtral-8x7B-v0.1 |
| OlmoForCausalLM | OLMo | allenai/OLMo-1B-hf, allenai/OLMo-7B-hf, allenai/OLMo-7B-Twin-2T-hf, allenai/OLMo-7B-Instruct-hf, allenai/OLMo-7B-0724-Instruct-hf, allenai/OLMo-7B-0724-SFT-hf |
| OPTForCausalLM | OPT | facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b |
| OrionForCausalLM | Orion | OrionStarAI/Orion-14B-Chat, OrionStarAI/Orion-14B-LongChat, OrionStarAI/Orion-14B-Base |
| PhiForCausalLM | Phi | microsoft/phi-2, microsoft/phi-1_5 |
| Phi3ForCausalLM | Phi3 | microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-3-mini-128k-instruct, microsoft/Phi-3-medium-4k-instruct, microsoft/Phi-3-medium-128k-instruct, microsoft/Phi-3.5-mini-instruct, microsoft/Phi-4-mini-instruct, microsoft/phi-4, microsoft/Phi-4-reasoning |
| PhimoeForCausalLM | Phi-3.5-MoE | microsoft/Phi-3.5-MoE-instruct |
| QWenLMHeadModel | Qwen | Qwen/Qwen-1_8B-Chat, Qwen/Qwen-1_8B-Chat-Int4, Qwen/Qwen-1_8B, Qwen/Qwen-7B-Chat, Qwen/Qwen-7B-Chat-Int4, Qwen/Qwen-7B, Qwen/Qwen-14B-Chat, Qwen/Qwen-14B-Chat-Int4, Qwen/Qwen-14B, Qwen/Qwen-72B-Chat, Qwen/Qwen-72B-Chat-Int4, Qwen/Qwen-72B |
| Qwen2ForCausalLM | Qwen2 | Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B-Instruct, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B-Instruct, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-7B-Chat-GPTQ-Int4, Qwen/QwQ-32B, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B |
| Qwen2MoeForCausalLM | Qwen2MoE | Qwen/Qwen2-57B-A14B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen1.5-MoE-A2.7B-Chat, Qwen/Qwen1.5-MoE-A2.7B |
| Qwen3ForCausalLM | Qwen3 | Qwen/Qwen3-0.6B, Qwen/Qwen3-1.7B, Qwen/Qwen3-4B, Qwen/Qwen3-8B, Qwen/Qwen3-14B, Qwen/Qwen3-32B, Qwen/Qwen3-0.6B-Base, Qwen/Qwen3-1.7B-Base, Qwen/Qwen3-4B-Base, Qwen/Qwen3-8B-Base, Qwen/Qwen3-14B-Base |
| Qwen3MoeForCausalLM | Qwen3MoE | Qwen/Qwen3-30B-A3B, Qwen/Qwen3-30B-A3B-Base |
| StableLmForCausalLM | StableLM | stabilityai/stablelm-zephyr-3b, stabilityai/stablelm-2-1_6b, stabilityai/stablelm-2-12b, stabilityai/stablelm-2-zephyr-1_6b, stabilityai/stablelm-3b-4e1t |
| Starcoder2ForCausalLM | StarCoder2 | bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b |
| XGLMForCausalLM | XGLM | facebook/xglm-564M, facebook/xglm-1.7B, facebook/xglm-2.9B, facebook/xglm-4.5B, facebook/xglm-7.5B |
| XverseForCausalLM | Xverse | xverse/XVERSE-7B, xverse/XVERSE-7B-Chat, xverse/XVERSE-13B, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B, xverse/XVERSE-65B-Chat |

info

The LLM pipeline can work with other similar topologies produced by optimum-intel with the same model signature. After conversion, the model is required to have the following inputs:

  1. input_ids contains the tokens.
  2. attention_mask is filled with 1.
  3. beam_idx selects beams.
  4. position_ids (optional) encodes the position of the currently generated token in the sequence.

The model must also have a single logits output.

note

Models should belong to the same family and have the same tokenizers.
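
As an illustration, a listed model can be exported with optimum-cli; the output directory name below is arbitrary:

```shell
# Export a supported model to OpenVINO IR with INT4 weight compression.
optimum-cli export openvino --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --weight-format int4 TinyLlama-1.1B-Chat-v1.0-ov
```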

Image Generation Models

| Architecture | Example HuggingFace Models |
|---|---|
| Latent Consistency Model | SimianLuo/LCM_Dreamshaper_v7 |
| Stable Diffusion | CompVis/stable-diffusion-v1-1, CompVis/stable-diffusion-v1-2, CompVis/stable-diffusion-v1-3, CompVis/stable-diffusion-v1-4, junnyu/stable-diffusion-v1-4-paddle, jcplus/stable-diffusion-v1-5, stable-diffusion-v1-5/stable-diffusion-v1-5, botp/stable-diffusion-v1-5, dreamlike-art/dreamlike-anime-1.0, stabilityai/stable-diffusion-2, stabilityai/stable-diffusion-2-base, stabilityai/stable-diffusion-2-1, bguisard/stable-diffusion-nano-2-1, justinpinkney/pokemon-stable-diffusion, stablediffusionapi/architecture-tuned-model, IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1, ZeroCool94/stable-diffusion-v1-5, pcuenq/stable-diffusion-v1-4, rinna/japanese-stable-diffusion, benjamin-paine/stable-diffusion-v1-5, philschmid/stable-diffusion-v1-4-endpoints, naclbit/trinart_stable_diffusion_v2, Fictiverse/Stable_Diffusion_PaperCut_Model |
| Stable Diffusion Inpainting | stabilityai/stable-diffusion-2-inpainting, stable-diffusion-v1-5/stable-diffusion-inpainting, botp/stable-diffusion-v1-5-inpainting, parlance/dreamlike-diffusion-1.0-inpainting |
| Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-0.9, stabilityai/stable-diffusion-xl-base-1.0, stabilityai/sdxl-turbo, cagliostrolab/animagine-xl-4.0 |
| Stable Diffusion XL Inpainting | diffusers/stable-diffusion-xl-1.0-inpainting-0.1 |
| Stable Diffusion 3 | stabilityai/stable-diffusion-3-medium-diffusers, stabilityai/stable-diffusion-3.5-medium, stabilityai/stable-diffusion-3.5-large, stabilityai/stable-diffusion-3.5-large-turbo, tensorart/stable-diffusion-3.5-medium-turbo, tensorart/stable-diffusion-3.5-large-TurboX |
| Flux | black-forest-labs/FLUX.1-schnell, Freepik/flux.1-lite-8B-alpha, black-forest-labs/FLUX.1-dev, shuttleai/shuttle-3-diffusion, shuttleai/shuttle-3.1-aesthetic, shuttleai/shuttle-jaguar, Shakker-Labs/AWPortrait-FL, black-forest-labs/FLUX.1-Fill-dev |

Visual Language Models (VLMs)

LoRA Support

VLM pipeline does not support LoRA adapters.

| Architecture | Models | Example HuggingFace Models |
|---|---|---|
| InternVLChatModel | InternVLChat (Notes) | OpenGVLab/InternVL2-1B, OpenGVLab/InternVL2-2B, OpenGVLab/InternVL2-4B, OpenGVLab/InternVL2-8B, OpenGVLab/InternVL2_5-1B, OpenGVLab/InternVL2_5-2B, OpenGVLab/InternVL2_5-4B, OpenGVLab/InternVL2_5-8B, OpenGVLab/InternVL3-1B, OpenGVLab/InternVL3-2B, OpenGVLab/InternVL3-8B, OpenGVLab/InternVL3-9B, OpenGVLab/InternVL3-14B |
| LLaVA | LLaVA-v1.5 | llava-hf/llava-1.5-7b-hf |
| nanoLLaVA | nanoLLaVA | qnguyen3/nanoLLaVA |
| | nanoLLaVA-1.5 | qnguyen3/nanoLLaVA-1.5 |
| LLaVA-NeXT | LLaVA-v1.6 | llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, llava-hf/llama3-llava-next-8b-hf |
| LLaVA-NeXT-Video | LLaVA-Next-Video | llava-hf/LLaVA-NeXT-Video-7B-hf |
| MiniCPMO | MiniCPM-o-2_6 (Notes) | openbmb/MiniCPM-o-2_6 |
| MiniCPMV | MiniCPM-V-2_6 | openbmb/MiniCPM-V-2_6 |
| Phi3VForCausalLM | phi3_v (Notes) | microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct |
| Phi4MMForCausalLM | phi4mm (Notes) | microsoft/Phi-4-multimodal-instruct |
| Qwen2-VL | Qwen2-VL | Qwen/Qwen2-VL-2B-Instruct, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-2B, Qwen/Qwen2-VL-7B |
| Qwen2.5-VL | Qwen2.5-VL | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct |
| Gemma3ForConditionalGeneration | gemma3 | google/gemma-3-4b-it, google/gemma-3-12b-it, google/gemma-3-27b-it |

VLM Models Notes

InternVL2

To convert InternVL2 models, timm and einops are required:

pip install timm einops

MiniCPMO

  1. openbmb/MiniCPM-o-2_6 doesn't support transformers>=4.52, which optimum-cli export requires.
  2. --task image-text-to-text must be passed to optimum-cli export openvino --trust-remote-code because image-text-to-text isn't MiniCPM-o-2_6's native task.
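
Putting both points together, the conversion might look like the following (the output directory name is arbitrary):

```shell
# Pin transformers below 4.52 before exporting MiniCPM-o-2_6.
pip install "transformers<4.52"
optimum-cli export openvino --model openbmb/MiniCPM-o-2_6 --task image-text-to-text --trust-remote-code MiniCPM-o-2_6-ov
```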

phi3_v

The models' configs aren't consistent, so the default eos_token_id must be overridden with the one from the tokenizer:

generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
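
In context, the override might look like the following sketch (the model path and device are placeholders for a phi3_v model already exported to OpenVINO IR):

```python
import openvino_genai as ov_genai

# Placeholder path to an exported phi3_v model directory.
pipe = ov_genai.VLMPipeline("Phi-3.5-vision-instruct-ov", "CPU")

generation_config = ov_genai.GenerationConfig()
generation_config.max_new_tokens = 100
# Override the inconsistent default with the tokenizer's eos_token_id.
generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
```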

phi4mm

Apply the patch from https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/78/files to fix the model export for transformers>=4.50.

Speech Recognition Models (Whisper-based)

LoRA Support

Speech recognition pipeline does not support LoRA adapters.

| Architecture | Models | Example HuggingFace Models |
|---|---|---|
| WhisperForConditionalGeneration | Whisper | openai/whisper-tiny, openai/whisper-tiny.en, openai/whisper-base, openai/whisper-base.en, openai/whisper-small, openai/whisper-small.en, openai/whisper-medium, openai/whisper-medium.en, openai/whisper-large-v3 |
| | Distil-Whisper | distil-whisper/distil-small.en, distil-whisper/distil-medium.en, distil-whisper/distil-large-v3 |

Speech Generation Models

LoRA Support

Speech generation pipeline does not support LoRA adapters.

| Architecture | Models | Example HuggingFace Models |
|---|---|---|
| SpeechT5ForTextToSpeech | SpeechT5 TTS | microsoft/speecht5_tts |

Text Embeddings Models

LoRA Support

Text embeddings pipeline does not support LoRA adapters.

| Architecture | Example HuggingFace Models |
|---|---|
| BertModel | BAAI/bge-small-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5, sentence-transformers/all-MiniLM-L12-v2, mixedbread-ai/mxbai-embed-large-v1, mixedbread-ai/mxbai-embed-xsmall-v1, WhereIsAI/UAE-Large-V1 |
| MPNetForMaskedLM | sentence-transformers/all-mpnet-base-v2, sentence-transformers/multi-qa-mpnet-base-dot-v1 |
| RobertaForMaskedLM | sentence-transformers/all-distilroberta-v1 |
| XLMRobertaModel | mixedbread-ai/deepset-mxbai-embed-de-large-v1, intfloat/multilingual-e5-large-instruct, intfloat/multilingual-e5-large |
| Qwen3ForCausalLM | Qwen/Qwen3-Embedding-0.6B, Qwen/Qwen3-Embedding-4B, Qwen/Qwen3-Embedding-8B |

Text Embeddings Models Notes

Qwen3 Embedding models require --task feature-extraction during the conversion with optimum-cli.
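
For example (the output directory name is arbitrary):

```shell
# Qwen3 Embedding models must be exported with the feature-extraction task.
optimum-cli export openvino --model Qwen/Qwen3-Embedding-0.6B --task feature-extraction Qwen3-Embedding-0.6B-ov
```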

Text Rerank Models

LoRA Support

Text rerank pipeline does not support LoRA adapters.

| Architecture | `optimum-cli` task | Example HuggingFace Models |
|---|---|---|
| BertForSequenceClassification | text-classification | cross-encoder/ms-marco-MiniLM-L2-v2, cross-encoder/ms-marco-MiniLM-L4-v2, cross-encoder/ms-marco-MiniLM-L6-v2, cross-encoder/ms-marco-MiniLM-L12-v2, cross-encoder/ms-marco-TinyBERT-L2-v2, tomaarsen/reranker-MiniLM-L12-gooaq-bce |
| XLMRobertaForSequenceClassification | text-classification | BAAI/bge-reranker-v2-m3 |
| ModernBertForSequenceClassification | text-classification | tomaarsen/reranker-ModernBERT-base-gooaq-bce, tomaarsen/reranker-ModernBERT-large-gooaq-bce, Alibaba-NLP/gte-reranker-modernbert-base |
| Qwen3ForCausalLM | text-generation-with-past | Qwen/Qwen3-Reranker-0.6B, Qwen/Qwen3-Reranker-4B, Qwen/Qwen3-Reranker-8B |

Text Rerank Models Notes

Text rerank models require the appropriate --task to be provided during conversion with optimum-cli. The task for each architecture is listed in the table above.
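
For example, for a BERT-based cross-encoder (the output directory name is arbitrary):

```shell
# Rerank cross-encoders based on BERT use the text-classification task.
optimum-cli export openvino --model cross-encoder/ms-marco-MiniLM-L6-v2 --task text-classification ms-marco-MiniLM-L6-v2-ov
```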


Hugging Face Notes

Some models may require access request submission on the Hugging Face page to be downloaded.

If https://huggingface.co/ is down, the conversion step won't be able to download the models.