Google models
Featured Gemini models
Gemini 3 Pro
Designed for comprehensive multimodal understanding and complex problem solving
- Features a 1 million token context window
- Excels in agentic workflows and autonomous coding tasks
- Designed for complex multimodal tasks and advanced reasoning
Gemini 3 Flash
Our best model for complex multimodal understanding, with strong agentic and coding capabilities
- The latest in our workhorse line of Gemini models
- Enhanced multimodal and coding capabilities
- Features our new near-zero thinking level option
Gemini 2.5 Flash Image
Jumpstart your creative workflow with image generation and conversational editing
- Generate high-quality images
- Capable of turn-based conversational editing
- Same balance of speed and price as Gemini 2.5 Flash
Generally available Gemini models
- Gemini 2.5 Pro: Our high-capability model for complex reasoning and coding. Features adaptive thinking capabilities to solve complex agentic and multimodal challenges with a 1 million token context.
- Gemini 2.5 Flash: Lightning-fast and highly capable. Delivers a balance of intelligence and latency with controllable thinking budgets for versatile applications.
- Gemini 2.5 Flash Image: Turn ideas into production-ready assets. Features conversational editing, multi-image fusion, and character consistency for advanced creative workflows.
- Gemini 2.5 Flash-Lite: Built for massive scale. Balances cost and performance for high-throughput tasks, optimized for efficiency without sacrificing multimodal understanding.
- Gemini 2.5 Flash with Gemini Live API: Designed for real-time, bidirectional streaming. Features low-latency built-in audio and affective dialogue capabilities for natural, conversational interactions.
- Gemini 2.0 Flash: Multimodal performance for developers needing a cost-effective model for general-purpose tasks.
- Gemini 2.0 Flash-Lite: Streamlined and ultra-efficient for simple, high-frequency tasks where speed and price are the priority.
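The capability, latency, and cost tradeoffs above are often encoded as a simple routing rule in application code. The sketch below is a minimal, hypothetical example: the function and its workload flags are illustrative assumptions, and only the model IDs (`gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`) correspond to models named on this page.

```python
# Hypothetical model router for the Gemini 2.5 family.
# The tiers and the routing logic are illustrative, not official guidance.

def pick_gemini_model(needs_complex_reasoning: bool, high_throughput: bool) -> str:
    """Return a Gemini model ID for the given workload profile."""
    if needs_complex_reasoning:
        # High-capability tier: complex reasoning and coding.
        return "gemini-2.5-pro"
    if high_throughput:
        # Cost/performance tier for massive-scale, simple tasks.
        return "gemini-2.5-flash-lite"
    # Balanced default: intelligence and latency with controllable thinking budgets.
    return "gemini-2.5-flash"

print(pick_gemini_model(needs_complex_reasoning=True, high_throughput=False))
# prints "gemini-2.5-pro"
```

A router like this keeps model choice in one place, so upgrading a tier (for example, to a Gemini 3 model) is a one-line change.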
Preview Gemini models
- Gemini 3 Pro: Our latest reasoning-first model, optimized for complex agentic workflows and coding. Features adaptive thinking, a 1M token context window, and integrated grounding for sophisticated multimodal problem solving.
- Gemini 3 Flash: Our best model for complex multimodal understanding, designed to tackle the most challenging agentic problems with strong coding and state-of-the-art reasoning capabilities.
- Gemini 3 Pro Image: High-fidelity image generation with reasoning-enhanced composition. Supports legible text rendering, complex multi-turn editing, and character consistency using up to 14 reference inputs.
Gemma models
- Gemma 3n: An open model designed for efficient execution on low-resource devices, supporting multimodal input (text, image, video, and audio) and text output in over 140 languages.
- Gemma 3: An open model featuring text and image input, support for over 140 languages, and a 128K context window.
- Gemma 2: An open model supporting text generation, summarization, and extraction.
- Gemma: A small, lightweight open model supporting text generation, summarization, and extraction.
- ShieldGemma 2: Instruction-tuned models for evaluating text and image safety against defined policies.
- PaliGemma: An open vision-language model combining SigLIP and Gemma.
- CodeGemma: A powerful, lightweight open model for coding tasks, including code completion, generation, and understanding.
- TxGemma: A model that generates predictions, classifications, or text based on therapeutic-related data, for building AI models with less data and compute.
- MedGemma: A collection of Gemma 3 variants trained for performance on medical text and image comprehension.
- MedSigLIP: A SigLIP variant trained to encode medical images and text into a common embedding space.
- T5Gemma: A family of lightweight encoder-decoder research models.
Embeddings models
- Embeddings for Text: Converts text data into vector representations for semantic search, classification, and clustering.
- Multimodal Embeddings: Generates vectors based on images, for tasks such as image classification and search.
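To make "vector representations for semantic search" concrete, here is a minimal sketch that ranks documents by cosine similarity against a query vector. The tiny 3-dimensional vectors are hand-made stand-ins for illustration; a real application would obtain embeddings with hundreds of dimensions from the models above.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings" for illustration only; real text embeddings are much larger.
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [0.9, 0.1, 0.8],   # semantically close to the query
    "doc_b": [0.0, 1.0, 0.0],   # unrelated
}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # prints "doc_a"
```

The same ranking step applies unchanged to multimodal embeddings, since image and text vectors live in a shared space.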
Imagen models
- Imagen 4 for Generation: Use text prompts to generate novel images with higher quality than our previous image generation models.
- Imagen 4 for Fast Generation: Use text prompts to generate novel images with higher quality and lower latency than our previous image generation models.
- Imagen 4 for Ultra Generation: Use text prompts to generate novel images with higher quality and better prompt adherence than our previous image generation models.
- Imagen 3 for Generation 002: Use text prompts to generate novel images.
- Imagen 3 for Generation 001: Use text prompts to generate novel images.
- Imagen 3 for Fast Generation: Use text prompts to generate novel images with lower latency than our other image generation models.
- Imagen 3 for Editing and Customization: Edits existing images or generates new images based on text prompts and provided context.
Preview Imagen models
- Virtual Try-On: Generates images of people wearing clothing products.
- Imagen product recontext on Vertex AI: Edits product images to place them in different scenes or backgrounds based on text prompts.
Veo models
- Veo 2 Generate: Generates videos from text prompts and images.
- Veo 3 Generate: Generates videos from text prompts and images with high quality.
- Veo 3 Fast: Generates videos from text prompts and images with high quality and low latency.
- Veo 3.1 Generate: Generates videos from text prompts and images with high quality.
- Veo 3.1 Fast: Generates videos from text prompts and images with high quality and low latency.
Preview Veo models
- Veo 3 Generate preview: Generates videos from text prompts and images with high quality.
- Veo 3 Fast preview: Generates videos from text prompts and images with high quality and low latency.
- Veo 3.1 Generate preview: Generates videos from text prompts and images with high quality.
- Veo 3.1 Fast preview: Generates videos from text prompts and images with high quality and low latency.
- Veo 2 preview: Generates videos from text prompts and images, supporting inpaint and outpaint.
Experimental Veo models
- Veo 2 Experimental: An experimental model with features under test.
MedLM models
- MedLM-medium: A HIPAA-compliant model for medical question answering and summarization of healthcare documents.
- MedLM-large: A HIPAA-compliant model for medical question answering and summarization of healthcare documents.
Language support
Gemini
All Gemini models can understand and respond in the following languages:
Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Croatian (hr), Czech (cs), Danish (da), Dhivehi (dv), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Krio (kri), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Meiteilon (Manipuri) (mni-Mtei), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Norwegian (no), Nyanja (Chichewa) (ny), Odia (Oriya) (or), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uyghur (ug), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).
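Applications that route requests by locale sometimes validate a user's language tag against a model's supported set before sending a prompt. The sketch below is a hypothetical helper, shown with only a small subset of the ISO 639-1 codes from the list above; the full supported set is much larger.

```python
# Illustrative subset of the Gemini-supported language codes listed above.
GEMINI_LANGUAGES = {"en", "es", "fr", "de", "ja", "ko", "zh", "hi", "pt", "ar"}

def is_supported(language_code: str) -> bool:
    """Check the base language of a tag against the supported set."""
    base = language_code.lower().split("-")[0]  # e.g. "en-US" -> "en"
    return base in GEMINI_LANGUAGES

print(is_supported("en-US"))  # prints "True"
print(is_supported("xx"))     # prints "False"
```

Reducing a full tag like `en-US` to its base subtag matches how the list above identifies languages by two-letter code.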
Gemma
Gemma and Gemma 2 support only the English (en) language. Gemma 3 and Gemma 3n provide multilingual support in over 140 languages.
Embeddings
Multilingual text embedding models support the following languages:
Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Czech (cs), Danish (da), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Nyanja (Chichewa) (ny), Norwegian (no), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).
Imagen 3
Imagen 3 supports the following languages:
English (en), Chinese (Simplified and Traditional) (zh), Hindi (hi), Japanese (ja), Korean (ko), Portuguese (pt), and Spanish (es).
MedLM
The MedLM model supports the English (en) language.
Explore all models in Model Garden
Model Garden is a platform that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.
To learn more about Model Garden, including available models and capabilities, see Explore AI models in Model Garden.
Model versions
To see all model versions, including legacy and retired models, see Model versions and lifecycle.
What's next
- Try a quickstart tutorial using Vertex AI Studio or the Vertex AI API.
- Explore pretrained models in Model Garden.
- Learn how to control access to specific models in Model Garden by using a Model Garden organization policy.
- Learn about pricing.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-16 UTC.