Providers | liteLLM (original) (raw)

Learn how to deploy + call models from different providers on LiteLLM

📄️ Integrate as a Model ProviderQuick Start for OpenAI-Compatible Providers

📄️ Add OpenAI-Compatible Provider (JSON)For simple OpenAI-compatible providers (like Hyperbolic, Nscale, etc.), you can add support by editing a single JSON file.

📄️ Add Model Pricing & Context WindowTo add pricing or context window information for a model, simply make a PR to this file:

🗃️ OpenAI4 items

📄️ OpenAI (Text Completion)LiteLLM supports OpenAI text completion models

📄️ OpenAI-Compatible EndpointsSelecting openai as the provider routes your request to an OpenAI-compatible endpoint using the upstream

🗃️ Azure OpenAI5 items

🗃️ Azure AI9 items

🗃️ Vertex AI11 items

🗃️ Google AI Studio6 items

📄️ AnthropicLiteLLM supports all anthropic models.

📄️ Tool SearchTool search enables Claude to dynamically discover and load tools on-demand from large tool catalogs (10,000+ tools). Instead of loading all tool definitions into the context window upfront, Claude searches your tool catalog and loads only the tools it needs.

📄️ AWS SagemakerLiteLLM supports All Sagemaker Huggingface Jumpstart Models

🗃️ Bedrock13 items

📄️ LiteLLM Proxy (LLM Gateway)| Property | Details |

📄️ AbliterationOverview

📄️ AI21LiteLLM supports the following AI21 models:

📄️ AI/ML APIhttps://aimlapi.com/

📄️ Aleph AlphaLiteLLM supports all models from Aleph Alpha.

📄️ Amazon Nova| Property | Details |

📄️ Anyscalehttps://app.endpoints.anyscale.com/

📄️ Apertis AI (Stima API)Overview

📄️ BasetenLiteLLM supports both Baseten Model APIs and dedicated deployments with automatic routing.

📄️ Black Forest Labs Image GenerationBlack Forest Labs provides state-of-the-art text-to-image generation using their FLUX models.

📄️ Black Forest Labs Image EditingBlack Forest Labs provides powerful image editing capabilities using their FLUX models to modify existing images based on text descriptions.

📄️ BytezLiteLLM supports all chat models on Bytez!

📄️ Cerebrashttps://inference-docs.cerebras.ai/api-reference/chat-completions

📄️ ChutesOverview

📄️ ClarifaiAnthropic, OpenAI, Qwen, xAI, Gemini and most of Open soured LLMs are Supported on Clarifai.

📄️ Cloudflare Workers AIhttps://developers.cloudflare.com/workers-ai/models/text-generation/

📄️ Codestral API [Mistral AI]Codestral is available in select code-completion plugins but can also be queried directly. See the documentation for more details.

📄️ CohereAPI KEYS

📄️ CometAPILiteLLM supports all AI models from CometAPI. CometAPI provides access to 500+ AI models through a unified API interface, including cutting-edge models like GPT-5, Claude Opus 4.1, and various other state-of-the-art language models.

📄️ CompactifAIhttps://docs.compactif.ai/

📄️ Custom API Server (Custom Format)Call your custom torch-serve / internal LLM APIs via LiteLLM

📄️ Dashscope API (Qwen models)https://dashscope.console.aliyun.com/

📄️ DatabricksLiteLLM supports all models on Databricks

📄️ DataRobotLiteLLM supports all models from DataRobot. Select datarobot as the provider to route your request through the datarobot OpenAI-compatible endpoint using the upstream official OpenAI Python API library.

📄️ DeepgramLiteLLM supports Deepgram's /listen endpoint.

📄️ DeepInfrahttps://deepinfra.com/

📄️ Deepseekhttps://deepseek.com/

📄️ Docker Model RunnerOverview

📄️ ElevenLabsElevenLabs provides high-quality AI voice technology, including speech-to-text capabilities through their transcription API.

📄️ EmpirioLabs AIOverview

📄️ Fal AIFal AI provides fast, scalable access to state-of-the-art image generation models including FLUX, Stable Diffusion, Imagen, and more.

📄️ Featherless AIhttps://featherless.ai/

📄️ Fireworks AIWe support ALL Fireworks AI models, just set fireworks_ai/ as a prefix when sending completion requests

📄️ FriendliAIWe support ALL FriendliAI models, just set friendliai/ as a prefix when sending completion requests

📄️ Galadrielhttps://docs.galadriel.com/api-reference/chat-completion-API

📄️ Githubhttps://github.com/marketplace/models

📄️ GitHub Copilothttps://docs.github.com/en/copilot

📄️ GMI CloudOverview

📄️ ChatGPT SubscriptionUse ChatGPT Pro/Max subscription models through LiteLLM with OAuth device flow authentication.

📄️ GradientAIhttps://digitalocean.com/products/gradientai

📄️ Groqhttps://groq.com/

📄️ HeliconeOverview

📄️ HerokuProvision a Model

🗃️ HuggingFace2 items

📄️ HyperbolicOverview

📄️ InceptionOverview

📄️ Infinity| Property | Details |

📄️ Jina AIhttps://jina.ai/embeddings/

📄️ Lambda AIOverview

📄️ LangGraphCall LangGraph agents through LiteLLM using the OpenAI chat completions format.

📄️ LemonadeLemonade Server is an OpenAI-compatible local language model inference provider optimized for AMD GPUs and NPUs. The lemonade litellm provider supports standard chat completions with full OpenAI API compatibility.

📄️ LlamafileLiteLLM supports all models on Llamafile.

📄️ LlamaGateOverview

📄️ LM Studiohttps://lmstudio.ai/docs/basics/server

📄️ ManusUse Manus AI agents through LiteLLM's OpenAI-compatible Responses API.

📄️ Meta Llama| Property | Details |

📄️ Milvus - Vector StoreUse Milvus as a vector store for RAG.

📄️ Mistral AI APIhttps://docs.mistral.ai/api/

📄️ MiniMaxOverview

📄️ Moonshot AIOverview

📄️ MorphLiteLLM supports all models on Morph

📄️ Nebius AI Studiohttps://docs.nebius.com/studio/inference/quickstart

📄️ NLP CloudLiteLLM supports all LLMs on NLP Cloud.

📄️ NanoGPTOverview

📄️ Novita AI| Property | Details |

📄️ Nscale (EU Sovereign)https://docs.nscale.com/docs/inference/chat

🗃️ Nvidia NIM2 items

📄️ Nvidia Riva (Speech-to-Text)LiteLLM supports NVIDIA Riva for speech-to-text via /audio/transcriptions. Works with both the NVCF-hosted Riva endpoint (e.g. Parakeet on build.nvidia.com) and self-hosted Riva deployments.

📄️ Oracle Cloud Infrastructure (OCI)LiteLLM supports the following models for OCI on-demand GenAI API.

📄️ OllamaLiteLLM supports all models from Ollama

📄️ OpenRouterLiteLLM supports all the text / chat / vision / embedding models from OpenRouter

📄️ Sarvam.aiLiteLLM supports all the text models from Sarvam ai

📄️ 🆕 OVHCloud AI EndpointsLeading French Cloud provider in Europe with data sovereignty and privacy.

🗃️ Perplexity AI2 items

📄️ PetalsPetals//github.com/bigscience-workshop/petals

📄️ PoeOverview

📄️ PublicAIOverview

📄️ PredibaseLiteLLM supports all models on Predibase

📄️ Pydantic AI AgentsCall Pydantic AI Agents via LiteLLM's A2A Gateway.

📄️ RAGFlowLitellm supports Ragflow's chat completions APIs

📄️ Recrafthttps://www.recraft.ai/

📄️ ReplicateLiteLLM supports all models on Replicate

🗃️ RunwayML2 items

📄️ SambaNovahttps://cloud.sambanova.ai/

📄️ SAP Generative AI HubLiteLLM supports SAP Generative AI Hub's Orchestration Service.

📄️ ScalewayLiteLLM supports all models available on Scaleway Generative APIs ↗.

📄️ Stability AIhttps://stability.ai/

📄️ SyntheticOverview

📄️ Snowflake CortexLiteLLM supports all models on the Snowflake Cortex REST API, including models from Anthropic (Claude), OpenAI (GPT), Meta (Llama), Mistral, DeepSeek, and Snowflake.

📄️ TensormeshOverview

📄️ Together AILiteLLM supports all models on Together AI.

📄️ Topaz| Property | Details |

📄️ Triton Inference ServerLiteLLM supports Embedding Models on Triton Inference Servers

📄️ v0Overview

📄️ Vercel AI GatewayOverview

🗃️ vLLM2 items

📄️ Volcano Engine (Volcengine)https://www.volcengine.com/docs/82379/1263482

📄️ Voyage AIhttps://docs.voyageai.com/embeddings/

📄️ Weights & Biases Inferencehttps://weave-docs.wandb.ai/quickstart-inference

🗃️ WatsonX2 items

🗃️ xAI2 items

📄️ Xiaomi MiMohttps://platform.xiaomimimo.com/#/docs

📄️ Xinference [Xorbits Inference]https://inference.readthedocs.io/en/latest/index.html

📄️ Z.AI (Zhipu AI)https://z.ai/