Vertex AI in express mode overview (original) (raw)

With Vertex AI in express mode, you can quickly sign up and begin building generative AI applications on Google Cloud. This simplified setup experience streamlines access to Google Cloud APIs by simplifying organization, billing, and project management.

To learn more about Vertex AI in express mode, seeGoogle Cloud express mode FAQs.

Sign up for express mode

Vertex AI is available in express mode for developers with a @gmail.com Google Account, regardless of whether they are new or existing Google Cloud users:

To sign up for Vertex AI in express mode, access Express Mode from the Google Cloud console:

Go to Express Mode

After you're set up, you can start using Vertex AI in express mode by following one of the tutorials:

Learn about Vertex AI in express mode

When you sign up for express mode, Google Cloud enables APIs on your behalf. This process happens in the background, so you can start using Vertex AI without needing to manually configure resources.

When you sign up for express mode, you get access to the following:

If you are an existing Google Cloud user or after enabling billing, the 90 day free tier is removed. You transition to a paid tier to access extended quota limits and additional Google Cloud features. As your quotas are increased, you only pay for what you use. At any time, you can choose toend express mode and start using all the Google Cloud services and capabilities.

The following table lists the differences between the express mode experiences and the full Vertex AI experience:

Item Vertex AI express mode in the free tier Vertex AI express mode in the paid tier Full Vertex AI
Time limit 90 days Unlimited Unlimited
Available services Basic Generative AI on Vertex AI services. Expanded Vertex AI services and select Google Cloud services. All Google Cloud services, including Vertex AI.
Data sources Google Drive Google Drive Web files YouTube video URLs All data sources available in Google Cloud.
Quota Free tier limits. See Available models and rate limits in express mode. Standard pay-as-you-go limits. See Rate limits. Standard pay-as-you-go limits. See Rate limits.
Service level agreement (SLA) None Vertex AI SLA Vertex AI SLA
Standard format of API endpoints Specify API key instead of project ID and location. For example: https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API\_KEY} Specify API key instead of project ID and location. For example: https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API\_KEY} Specify project ID and location. For example: https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent

Available models and rate limits in express mode

You can try out several models in express mode, including the latest Gemini Flash models. The following table lists the models that are available in express mode, along with their rate limits:

Model category Available models Requests per minute Discontinuation date
Gemini gemini-2.5-flash-image 10
gemini-2.5-flash-preview-09-2025 10
gemini-2.5-flash-lite-preview-09-2025 10
gemini-2.5-flash-lite 10 July 22, 2026
gemini-2.5-pro 10 June 17, 2026
gemini-2.5-flash 10 June 17, 2026
gemini-2.5-flash-image-preview 10 October 31, 2025
gemini-2.0-flash-001 10 February 5, 2026
gemini-2.0-flash-lite-001 10 February 25, 2026

For Gemini 2.0 models, the Multimodal Live API isn't available in the Google Cloud console in express mode. To use the Multimodal Live API in express mode, use the Vertex AI API or the Google Gen AI SDK.

You can start sending requests from your application to Vertex AI APIs in three steps:

  1. After signing up for express mode, use Vertex AI Studio to quickly try Vertex AI features:
    Go to Vertex AI Studio
    For example, select Vertex AI Studio > Create prompt to create and optimize multimodal prompts using a variety of Gemini models.
  2. Get the code for what you implemented with the UI.
    On the prompt page, click Build with code > Get code. A panel opens showing code that programmatically sends the same requests that you implemented in the UI. You can get the code for a programming language or curl. You can use Google Colab to try the Python code.
  3. Use your API key to authenticate with the Vertex AI API.
    In Google Cloud console in express mode, open APIs & Services > Credentials:
    Go to Credentials
    Then, copy your Generative Language API Key into your code where it says"YOUR_API_KEY". For example:

What's different in express mode

Vertex AI in express mode provides a subset of the features for Generative AI on Vertex AI. Therefore, some of the Vertex AI documentation is not relevant if you signed up in express mode. For details on the available API endpoints in express mode, see theVertex AI in express mode REST API reference.

In addition, customers in Google Cloud typically use_organizations_ and projects to work with resources (for example, to call an API endpoint). When using Vertex AI in express mode, you don't need to worry about organizations or projects. However, you might see them mentioned in some of the Google Cloud documentation that you reference while you're using Vertex AI in express mode. You can still use the documentation, but ignore concepts and instructions that refer to organizations and projects. In addition, the location you selected when signing up in express mode is used throughout your experience.

When calling REST API endpoints in express mode, you'll use the endpoint format for express mode and specify your API key. For example:

Standard endpoint URL https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent
Endpoint URL in express mode https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API\_KEY}

View and manage API keys

To authenticate withVertex AI API endpoints that support express mode, use the API key that was created for you during sign-up or any key that you've created in express mode. An API key is an encrypted string that is auto-generated for you when you sign up in express mode. These API keys can be viewed and managed on the APIs & Services page.

To learn more about the best practices for managing API keys, seeBest practices for managing API keys.

To view and manage your API keys, do the following:

  1. In the Google Cloud console, go to the Credentials page:
    Go to Credentials
  2. In the API Keys section, you can manage your API keys.

View quotas

Your free use of Vertex AI in express mode is restricted by quotas. These quotas restrict the rate at which you can use Vertex AI in express mode at no cost. A quota limits how much of a Google Cloud resource you can use.

To view your current usage and quotas, do the following:

  1. Go to the Vertex AI Studio Overview page in express mode.
    Go to Vertex AI Studio
  2. In Google Cloud console in express mode, open Quotas & System Limits:
    Go to Quotas & System Limits
  3. Explore your service quotas.

Enable and manage billing

You can increase your quotas and remove the 90 day limit by enabling billing.

After enabling billing, you pay only for what you use. You can also save your prompts and access additional settings in the Google Cloud console that are that grayed out when billing isn't enabled.

To manage billing, do the following:

  1. Go to the Vertex AI Studio Overview page in express mode.
    Go to Vertex AI Studio
  2. In Google Cloud console in express mode, open Billing:
    Go to Billing
  3. Manage your billing accounts.

Start using all Google Cloud capabilities and services

Express mode is designed to help you get started quickly. When you're ready to use other Google Cloud services or need more control over your environment, you can transition your express mode account to a standard Google Cloud account. This process is sometimes called upgrade.

You can start using all the capabilities and services available in Google Cloud in your project by upgrading your express mode account.

To upgrade from express mode, do the following:

  1. Go to the Vertex AI Studio Overview page in express mode.
    Go to Vertex AI Studio
  2. In Google Cloud console in express mode, open Billing:
    Go to Billing
  3. In the Access all Google Cloud section, click Learn more and get started.

After you upgrade from express mode, specify your project ID and location instead of your API key when you call the REST API endpoints. For example:

https://{location}-aiplatform.googleapis.com/v1/projects/{projectid}/locations/{location}/publishers/google/models/{model}:streamGenerateContent

What's next