The AI.GENERATE function (original) (raw)

This document describes the AI.GENERATE function, which lets you analyze any combination of structured and unstructured data. You can choose to generate text or structured output according to a custom schema that you specify. The function works by sending requests to a Vertex AI Gemini model and returning a STRUCT that contains your generated data, the full model response, and a status.

For example, the following query generates summaries of BBC news articles:

SELECT
  title,
  AI.GENERATE(CONCAT("Summarize in one sentence: ", body)).result AS article_summary
FROM `bigquery-public-data.bbc_news.fulltext`
LIMIT 3;

You can also use the AI.GENERATE function to extract structured output. For example, you can use a query similar to the following to extract a patient's name, age, and phone number from an unstructured description:

SELECT
  AI.GENERATE(patient_description, output_schema => 'name STRING, age INT64, phone_number STRING')
FROM mydataset.patient_data;

Input

Using the AI.GENERATE function, you can use the following types of input:

Text data from standard tables.
ObjectRefRuntimevalues that are generated by theOBJ.GET_ACCESS_URL function. You can useObjectRefvalues from standard tables as input to the OBJ.GET_ACCESS_URL function. (Preview)

When you analyze unstructured data, that data must meet the following requirements:

Content must be in one of the supported formats that are described in the Gemini API modelmimeType parameter.
If you are analyzing a video, the maximum supported length is two minutes. If the video is longer than two minutes, AI.GENERATE only returns results based on the first two minutes.

Prompt design can strongly affect the responses returned by the model. For more information, seeIntroduction to prompting.

Syntax

AI.GENERATE( [ prompt => ] 'PROMPT', [, endpoint => 'ENDPOINT'] [, model_params => MODEL_PARAMS] [, output_schema => 'OUTPUT_SCHEMA'] [, connection_id => 'CONNECTION'] [, request_type => 'REQUEST_TYPE'] )

Arguments

AI.GENERATE takes the following arguments:

PROMPT: a STRING or STRUCT value that specifies the PROMPT value to send to the model. The prompt must be the first argument that you specify. You can provide the value in the following ways:

Specify a STRING value. For example, 'This is a prompt.'

Specify a STRUCT value that contains one or more fields. You can use the following types of fields within theSTRUCT value:

Field type	Description	Examples
STRING or ARRAY	A string literal, array of string literals, or the name of a STRING column.	String literal: 'This is a prompt.' String column name:my_string_column
ObjectRefRuntime or ARRAY	An ObjectRefRuntime value returned by theOBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes anObjectRef value as input, which you can provide by either specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have theaccess_url.read_url anddetails.gcs_metadata.content_type elements of the JSON value populated. Your input can contain at most one video object.	Function call with ObjectRef column:OBJ.GET_ACCESS_URL(my_objectref_column, 'r') Function call with constructed ObjectRef value:OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image.jpg', 'myconnection'), 'r')
The function combines `STRUCT` fields similarly to aCONCAT operation and concatenates the fields in their specified order. The same is true for the elements of any arrays used within the struct. The following table shows some examples of `STRUCT` prompt values and how they are interpreted:
Struct field types	Struct value	Semantic equivalent
--------------------------------	------------------------------------------------------------------------------------	----------------------------------------------------------
STRUCT<STRING, STRING, STRING>	('Describe the city of ', my_city_column, ' in 15 words')	'Describe the city of my_city_column_value in 15 words'
STRUCT<STRING, ObjectRefRuntime>	('Describe the following city', OBJ.GET_ACCESS_URL(image_objectref_column, 'r'))	'Describe the following city image'

ENDPOINT: a STRING value that specifies the Vertex AI endpoint to use for the model. You can specify anygenerally availableorpreviewGemini model. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the model. If you don't specify an ENDPOINT value, BigQuery ML selects a recent stable version of Gemini to use. The default endpoint is gemini-2.5-flash. You can also specify theglobal endpoint. For example, to use gemini-3-pro-preview, specify the following endpoint:

https://aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/global/publishers/google/models/gemini-3-pro-preview

MODEL_PARAMS: a JSON literal that provides additional parameters to the model. The MODEL_PARAMS value must conform to thegenerateContent request body format. You can provide a value for any field in the request body except for thecontents field; the contents field is populated with the PROMPTargument value.
OUTPUT_SCHEMA: a STRING value that specifies the schema of the output as a comma-separated list of fields. Each field consists of a name, a data type, and an optional OPTIONS clause in which you can specify a description of the field. The following example shows how to specify an output schema that contains two string fields, name and state, with a description for the statefield:
OUTPUT_SCHEMA => '''
name STRING,
state STRING OPTIONS(description = 'The 2-letter abbreviation of the state name')
'''
Supported data types include STRING, INT64, FLOAT64, BOOL, ARRAY, and STRUCT. For a STRUCT data type, you can specify the OPTIONS clause on any of its subfields:
OUTPUT_SCHEMA => '''
location STRUCT<city STRING,
state STRING OPTIONS(description = 'The 2-letter abbreviation of the state name')>

'''

CONNECTION: a STRING value specifying the connection to use to communicate with the model, in the format[PROJECT_ID].LOCATION.CONNECTION_ID. For example, myproject.us.myconnection.
If you don't specify a connection, then the query uses yourend-user credentials.
For information about configuring permissions, see Set permissions for BigQuery ML generative AI functions that call Vertex AI models.
REQUEST_TYPE: a STRING value that specifies the type of inference request to send to the Gemini model. The request type determines what quota the request uses. Valid values are as follows:
- SHARED: The function only uses dynamic shared quota (DSQ).
- DEDICATED: The function only uses Provisioned Throughput quota. The function returns an invalid query error if Provisioned Throughput quota isn't available. For more information, see Use Vertex AI Provisioned Throughput.
- UNSPECIFIED: The function uses quota as follows:
  * If you haven't purchased Provisioned Throughput quota, the function uses DSQ quota.
  * If you have purchased Provisioned Throughput quota, the function uses the Provisioned Throughput quota first. If requests exceed the Provisioned Throughput quota, the overflow traffic uses DSQ quota.

The default value is UNSPECIFIED.

Output

AI.GENERATE returns a STRUCT value for each row in the table. The struct contains the following fields:

result: a STRING value containing the model's response to the prompt. The result is NULL if the request fails or is filtered byresponsible AI. If you specify an output schema then result is replaced by your custom schema.
full_response: a JSON value containing theresponsefrom theprojects.locations.endpoints.generateContentcall to the model. The generated text is in the text element.
status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.

Examples

The following examples assume that you have granted theVertex AI User roleto your personal account. For more information, seeRun generative AI queries with end-user credentials.

Translate

The following query translates publicly available BBC news technology articles into French:

SELECT body, AI.GENERATE( CONCAT("Translate into French ", body)).result AS translation FROM bigquery-public-data.bbc_news.fulltext WHERE category = 'tech' LIMIT 3;

The result is similar to the following:

AI_GENERATE_TRANSLATION

Use structured output for entity extraction

The following query extracts information about a person from an unstructured description. The query uses the output_schema argument to set custom fields in the output:

SELECT
  AI.GENERATE(
    input,
    output_schema => '''name STRING,
                        age INT64,
                        address STRUCT<street_address STRING, city STRING, state STRING, zip_code STRING>,
                        is_married BOOL,
                        phone_number ARRAY<STRING>,
                        weight_in_pounds FLOAT64''') AS info
FROM
  (
    SELECT
      '''John Smith is a 20-year old single man living at 1234 NW 45th St, Kirkland WA, 98033.
           He has two phone numbers 123-123-1234, and 234-234-2345. He is 200.5 pounds.'''
        AS input
  );

The result is similar to the following:

+------------+----------+-----------------------------+-------------------+-----+ | info.name | info.age | info.address.street_address | info.address.city | ... | +------------+----------+-----------------------------+-------------------+-----+ | John Smith | 20 | 1234 NW 45th St | Kirkland | ... | +------------+----------+-----------------------------+-------------------+-----+

The following query extracts information about customer complaints. The query uses the output_schema argument to set custom fields in the output:

SELECT
  complaint_id,
  AI.GENERATE(
    CONCAT('Analyze the following complaint: ', consumer_complaint_narrative),
    output_schema => """
    grievance_subject ARRAY<STRING> OPTIONS(description = 'a list of grievance subjects'),
    complaint_type STRING OPTIONS(description = 'classify the complaint type as Billing Dispute, Service Issue, or Reporting Error')
    """
  ).* EXCEPT (full_response, status)
FROM
  `bigquery-public-data.cfpb_complaints.complaint_database`
WHERE
  consumer_complaint_narrative IS NOT NULL
  AND LENGTH(consumer_complaint_narrative) > 100 -- Ensure there's a narrative to analyze
LIMIT 3;

The result is similar to the following:

+--------------+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | complaint_id | complaint_type | grievance_subject | +--------------+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 4767874 | Reporting Error | ["Inaccurate debt reporting on credit report","Enhanced Recovery Company reporting debt they no longer own","Failure to delete debt from credit report after recall"] | | 2091987 | Service Issue | ["Unwanted calls to place of employment","Request for correspondence via mail"] | | 6047403 | Reporting Error | ["Inaccurate derogatory collection on credit report","Disputed debt collection account","Lack of debt validation documentation","Refusal to remove incorrect collection from credit report"] | +--------------+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Process images in a Cloud Storage bucket

The following query creates an external table from images of pet products stored in a publicly available Cloud Storage bucket:

CREATE SCHEMA IF NOT EXISTS bqml_tutorial;

CREATE OR REPLACE EXTERNAL TABLE bqml_tutorial.product_images WITH CONNECTION DEFAULT OPTIONS ( object_metadata = 'SIMPLE', uris = ['gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/images/*.png']);

You can use AI.GENERATE to describe images and what's in them. To do that, construct your prompt from a natural language instruction and an ObjectRefRuntime of the image. The following query asks Gemini what each image is. It specifies anoutput_schema to structure the results with one column to name the items in the image and another column to provide a description of the image.

SELECT uri, STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS signed_url, AI.GENERATE( ("What is this: ", OBJ.GET_ACCESS_URL(ref, 'r')), output_schema => "image_description STRING, entities_in_the_image ARRAY").* FROM bqml_tutorial.product_images WHERE uri LIKE "%aquarium%";

This result is similar to the following:

AI_GENERATE_WITH_IMAGE

Use grounding

The following queries shows how to set the model_params argument to use Google Search or Google Maps grounding for the request. You can only use grounding with Gemini 2.0 or later models.

Set Google Search grounding:

SELECT name, AI.GENERATE( ('Please check the weather of ', name, ' for today.'), model_params => JSON '{"tools": [{"googleSearch": {}}]}' ) FROM UNNEST(['Seattle', 'NYC', 'Austin']) AS name;

Set Google Maps grounding:

SELECT name, AI.GENERATE( ('Please find some tourist attractions in ', name), model_params => JSON '{"tools": [{"googleMaps": {}}]}' ) FROM UNNEST(['Seattle', 'NYC', 'Austin']) AS name;

Disable the thinking budget

The following query shows how to use the model_params argument to set the model's thinking budget to 0 for the request:

SELECT AI.GENERATE( ('What is the capital of Monaco?'), endpoint => 'gemini-2.5-flash', model_params => JSON '{"generation_config":{"thinking_config": {"thinking_budget": 0}}}');

Use a context cache and low thinking level

The following query shows how to use the model_params argument to specify a context cacheand use a low thinking level:

SELECT AI.GENERATE( "Give me a summary of the document in context.", endpoint => "projects/PROJECT_NUMBER/locations/global/publishers/google/models/gemini-3-flash-preview", model_params => JSON '''{"cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHED_CONTENT_ID", "generation_config": {"thinking_config":{"thinking_level": "LOW"}}}''')

Best Practices

This function passes your input to a Gemini model and incurs charges in Vertex AI each time it's called. For information about how to view these charges, seeTrack costs. To minimize Vertex AI charges when you use AI.GENERATE on a subset of data using the LIMIT clause, materialize the selected data to a table first. For example, the first of the following examples is preferable to the second one:

CREATE TABLE mydataset.cities AS ( SELECT city_name from mydataset.customers LIMIT 10 );

SELECT city, AI.GENERATE( ('Give a short, one sentence description of ', city)).result FROM mydataset.cities;

SELECT city, AI.GENERATE( ('Give a short, one sentence description of ', city)).result FROM (SELECT city_name from mydataset.customers LIMIT 10);

Writing the query results to a table beforehand helps you to ensure that you are sending as few rows as possible to the model.

Use Vertex AI Provisioned Throughput

You can useVertex AI Provisioned Throughputwith the AI.GENERATE function to provide consistent high throughput for requests. The remote model that you reference in the AI.GENERATE function must use asupported Gemini modelin order for you to use Provisioned Throughput.

To use Provisioned Throughput,calculate your Provisioned Throughput requirementsand thenpurchase Provisioned Throughputquota before running the AI.GENERATE function. When you purchase Provisioned Throughput, do the following:

For Model, select the same Gemini model as the one used by the remote model that you reference in the AI.GENERATE function.
For Region, select the same region as the dataset that contains the remote model that you reference in the AI.GENERATE function, with the following exceptions:
- If the dataset is in the US multi-region, select the us-central1region.
- If the dataset is in the EU multi-region, select the europe-west4region.

After you submit the order, wait for the order to be approved and appear on theOrders page.

After you have purchased Provisioned Throughput quota, use theREQUEST_TYPE argument to determine how the AI.GENERATE function uses the quota.

Locations

You can run AI.GENERATE in all of theregionsthat support Gemini models, and also in the US and EUmulti-regions.

Quotas

See Vertex AI and Cloud AI service functions quotas and limits.

What's next

For more information about using Vertex AI models to generate text and embeddings, seeGenerative AI overview.
For more information about using Cloud AI APIs to perform AI tasks, seeAI application overview.
For more information about supported SQL statements and functions for generative AI models, seeEnd-to-end user journeys for generative AI models.