Run a query (original) (raw)

This document shows you how to run a query in BigQuery and understand how much data the query will process before execution by performing adry run.

Types of queries

You can query BigQuery databy using one of the following query job types:

Interactive query jobs. By default, BigQuery runs queries as interactive query jobs, which are intended to start executing as quickly as possible.
Batch query jobs. Batch queries have lower priority than interactive queries. When a project or reservation is using all of its available compute resources, batch queries are more likely to be queued and remain in the queue. After a batch query starts running, the batch query runs the same as an interactive query. For more information, see query queues.
Continuous query jobs(Preview). With these jobs, the query runs continuously, letting you analyze incoming data in BigQuery in real time and then write the results to a BigQuery table, or export the results to Bigtable or Pub/Sub. You can use this capability to perform time sensitive tasks, such as creating and immediately acting on insights, applying real time machine learning (ML) inference, and building event-driven data pipelines.

You can run query jobs by using the following methods:

Compose and run a query in the Google Cloud console.
Run the bq query command in the bq command-line tool.
Programmatically call thejobs.queryorjobs.insertmethod in the BigQueryREST API.
Use the BigQuery client libraries.

BigQuery saves query results to either atemporary table (default) or permanent table. When you specify a permanent table as the destination table for the results, you can choose whether to append or overwrite an existing table, or create a new table with a unique name.

Required roles

To get the permissions that you need to run a query job, ask your administrator to grant you the following IAM roles:

BigQuery Job User (roles/bigquery.jobUser) on the project.
BigQuery Data Viewer (roles/bigquery.dataViewer) on all tables and views that your query references. To query views, you also need this role on all underlying tables and views. If you're using authorized views or authorized datasets, you don't need access to the underlying source data.

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to run a query job. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to run a query job:

bigquery.jobs.create on the project from which the query is being run, regardless of where the data is stored.
bigquery.tables.getData on all tables and views that your query references. To query views, you also need this permission on all underlying tables and views. If you're using authorized views or authorized datasets, you don't need access to the underlying source data.

You might also be able to get these permissions with custom roles or other predefined roles.

Troubleshooting

Access Denied: Project [project_id]: User does not have bigquery.jobs.create
permission in project [project_id].

This error occurs when a principal lacks permission to create a query jobs in the project.

Resolution: An administrator must grant you the bigquery.jobs.createpermission on the project you are querying. This permission is required in addition to any permission required to access the queried data.

For more information about BigQuery permissions, seeAccess control with IAM.

Run an interactive query

To run an interactive query, select one of the following options:

Console

Go to the BigQuery page.
Go to BigQuery
Click SQL query.
In the query editor, enter a valid GoogleSQL query.
For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

SELECT  
  name, gender,  
  SUM(number) AS total  
FROM  
  `bigquery-public-data.usa_names.usa_1910_2013`  
GROUP BY  
  name, gender  
ORDER BY  
  total DESC  
LIMIT  
  10;

Optional: To select additional query settings, clickMore, and then click Query settings.
Click Run.
If you don't specify a destination table, the query job writes the output to a temporary (cache) table.
You can now explore the query results in the Results tab of theQuery results pane.
Optional: To sort the query results by column, clickOpen sort menunext to the column name and select a sort order. If the estimated bytes processed for the sort is more than zero, then the number of bytes is displayed at the top of the menu.
Optional: To see visualization of your query results, go to theChart tab. You can zoom in or zoom out of the chart, download the chart as a PNG file, or toggle the legend visibility.
In the Chart configuration pane, you can change the chart type (line, bar, or scatter) and configure the measures and dimensions of the chart. Fields in this pane are prefilled with the initial configuration inferred from the destination table schema of the query. The configuration is preserved between following query runs in the same query editor. Dimensions support INT64, FLOAT64,NUMERIC, BIGNUMERIC, TIMESTAMP, DATE, DATETIME, TIME, andSTRING data types, while measures support INT64,FLOAT64, NUMERIC, and BIGNUMERIC data types.
Optional: In the JSON tab, you can explore the query results in the JSON format, where the key is the column name and the value is the result for that column.

bq

In the Google Cloud console, activate Cloud Shell.
Activate Cloud Shell
At the bottom of the Google Cloud console, aCloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Use thebq query command. In the following example, the --use_legacy_sql=false flag lets you use GoogleSQL syntax.
bq query \
--use_legacy_sql=false \
'QUERY'
Replace QUERY with a valid GoogleSQL query. For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

bq query \  
    --use_legacy_sql=false \  
    'SELECT  
      name, gender,  
      SUM(number) AS total  
    FROM  
      `bigquery-public-data.usa_names.usa_1910_2013`  
    GROUP BY  
      name, gender  
    ORDER BY  
      total DESC  
    LIMIT  
      10;'

The query job writes the output to a temporary (cache) table.
Optionally, you can specify the destination table andlocationfor the query results. To write the results to an existing table, include the appropriate flag to append (--append_table=true) or overwrite (--replace=true) the table.
bq query \
    --location=LOCATION \
    --destination_table=TABLE \
    --use_legacy_sql=false \
    'QUERY'
Replace the following:

LOCATION: the region or multi-region for the destination table—for example, US
In this example, the usa_names dataset is stored in the US multi-region location. If you specify a destination table for this query, the dataset that contains the destination table must also be in the US multi-region. You cannot query a dataset in one location and write the results to a table in another location.
You can set a default value for the location using the.bigqueryrc file.
TABLE: a name for the destination table—for example,myDataset.myTable
If the destination table is a new table, then BigQuery creates the table when you run your query. However, you must specify an existing dataset.
If the table isn't in your current project, then add the Google Cloud project ID using the formatPROJECT_ID:DATASET.TABLE—for example,myProject:myDataset.myTable. If --destination_table is unspecified, a query job is generated that writes the output to a temporary table.

API

To run a query using the API, insert a new joband populate the query job configuration property. Optionally specify your location in the location property in the jobReference section of thejob resource.

Poll for results by callinggetQueryResults. Poll until jobComplete equals true. Check for errors and warnings in theerrors list.

C#

Before trying this sample, follow the C# setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery C# API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Go

Before trying this sample, follow the Go setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Java

Before trying this sample, follow the Java setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

To run a query with a proxy, see Configuring a proxy.

Node.js

Before trying this sample, follow the Node.js setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

PHP

Before trying this sample, follow the PHP setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery PHP API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Python

Before trying this sample, follow the Python setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Ruby

Before trying this sample, follow the Ruby setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Ruby API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Run a batch query

To run a batch query, select one of the following options:

Console

Go to the BigQuery page.
Go to BigQuery
Click SQL query.
In the query editor, enter a valid GoogleSQL query.
For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

SELECT  
  name, gender,  
  SUM(number) AS total  
FROM  
  `bigquery-public-data.usa_names.usa_1910_2013`  
GROUP BY  
  name, gender  
ORDER BY  
  total DESC  
LIMIT  
  10;

Click More, and then click Query settings.
In the Resource management section, select Batch.
Optional: Adjust your query settings.
Click Save.
Click Run.
If you don't specify a destination table, the query job writes the output to a temporary (cache) table.

bq

In the Google Cloud console, activate Cloud Shell.
Activate Cloud Shell
At the bottom of the Google Cloud console, aCloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Use thebq query commandand specify the --batch flag. In the following example, the--use_legacy_sql=false flag lets you use GoogleSQL syntax.
bq query \
    --batch \
    --use_legacy_sql=false \
    'QUERY'
Replace QUERY with a valid GoogleSQL query. For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

bq query \  
    --batch \  
    --use_legacy_sql=false \  
    'SELECT  
      name, gender,  
      SUM(number) AS total  
    FROM  
      `bigquery-public-data.usa_names.usa_1910_2013`  
    GROUP BY  
      name, gender  
    ORDER BY  
      total DESC  
    LIMIT  
      10;'

The query job writes the output to a temporary (cache) table.
Optionally, you can specify the destination table andlocationfor the query results. To write the results to an existing table, include the appropriate flag to append (--append_table=true) or overwrite (--replace=true) the table.
bq query \
    --batch \
    --location=LOCATION \
    --destination_table=TABLE \
    --use_legacy_sql=false \
    'QUERY'
Replace the following:

LOCATION: the region or multi-region for the destination table—for example, US
In this example, the usa_names dataset is stored in the US multi-region location. If you specify a destination table for this query, the dataset that contains the destination table must also be in the US multi-region. You cannot query a dataset in one location and write the results to a table in another location.
You can set a default value for the location using the.bigqueryrc file.
TABLE: a name for the destination table—for example,myDataset.myTable
If the destination table is a new table, then BigQuery creates the table when you run your query. However, you must specify an existing dataset.
If the table isn't in your current project, then add the Google Cloud project ID using the formatPROJECT_ID:DATASET.TABLE—for example,myProject:myDataset.myTable. If --destination_table is unspecified, a query job is generated that writes the output to a temporary table.

API

When you populate the query job properties, include theconfiguration.query.priority property and set the value to BATCH.

Poll for results by callinggetQueryResults. Poll until jobComplete equals true. Check for errors and warnings in theerrors list.

Go

Before trying this sample, follow the Go setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Java

To run a batch query, set the query prioritytoQueryJobConfiguration.Priority.BATCHwhen creating aQueryJobConfiguration.

Before trying this sample, follow the Java setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Node.js

Before trying this sample, follow the Node.js setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Python

Before trying this sample, follow the Python setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Run a continuous query

Running a continuous query job requires additional configuration. For more information, seeCreate continuous queries.

Query settings

When you run a query, you can specify the following settings:

A destination table for the query results.
The priority of the job.
Whether to use cached query results.
The job timeout in milliseconds.
Whether to use session mode.
The type of encryption to use.
The maximum number of bytes billed for the query.
The dialect of SQL to use.
The location in which to run the query. The query must run in the same location as any tables referenced in the query.
The reservation to run your query in (Preview).

Optional job creation mode

Optional job creation mode (formerly Short query optimized mode) can improve the overall latency of queries that run for a short duration, such as those from dashboards or data exploration workloads. This mode executes the query and returns the results inline for SELECT statements without requiring the use ofjobs.getQueryResultsto fetch the results. Queries using optional job creation mode don't create a job when executed unless BigQuery determines that a job creation is necessary to complete the query.

To enable optional job creation mode, set the jobCreationMode field of theQueryRequestinstance to JOB_CREATION_OPTIONAL in thejobs.query request body.

When the value of this field is set to JOB_CREATION_OPTIONAL, BigQuery determines if the query can use the optional job creation mode. If so, BigQuery executes the query and returns all results in the rows field of the response. Since a job isn't created for this query, BigQuery doesn't return a jobReference in the response body. Instead, it returns a queryId field, which you can use to get insights about the query using the INFORMATION_SCHEMA.JOBSview. Since no job is created, there is no jobReference that can be passed tojobs.get andjobs.getQueryResultsAPIs to lookup these queries.

If BigQuery determines that a job is required to complete the query, a jobReference is returned. You can inspect the job_creation_reasonfield in INFORMATION_SCHEMA.JOBSview to determine the reason that a job was created for the query. In this case, you should usejobs.getQueryResultsto fetch the results when the query is complete.

When you use the JOB_CREATION_OPTIONAL value, the jobReference field might not be present in the response. Check if the field exists before accessing it.

When JOB_CREATION_OPTIONAL is specified for multi-statement queries (scripts), BigQuery might optimize the execution process. As part of this optimization, BigQuery might determine that it can complete the script by creating fewer job resources than the number of individual statements, potentially even executing the entire script without creating any job at all. This optimization depends on BigQuery's assessment of the script, and the optimization might not be applied in every case. The optimization is fully automated by the system. No user controls or actions are required.

To run a query using optional job creation mode, select one of the following options:

Console

Go to the BigQuery page.
Go to BigQuery
Click SQL query.
In the query editor, enter a valid GoogleSQL query.
For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

SELECT  
  name, gender,  
  SUM(number) AS total  
FROM  
  `bigquery-public-data.usa_names.usa_1910_2013`  
GROUP BY  
  name, gender  
ORDER BY  
  total DESC  
LIMIT  
  10;

Click More, and then choose the Optional job creation query mode. To confirm this choice, click Confirm.
Click Run.

bq

In the Google Cloud console, activate Cloud Shell.
Activate Cloud Shell
At the bottom of the Google Cloud console, aCloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Use thebq query commandand specify the --job_creation_mode=JOB_CREATION_OPTIONAL flag. In the following example, the --use_legacy_sql=false flag lets you use GoogleSQL syntax.
bq query \
    --rpc=true \
    --use_legacy_sql=false \
    --job_creation_mode=JOB_CREATION_OPTIONAL \
    --location=LOCATION \
    'QUERY'
Replace QUERY with a valid GoogleSQL query, and replace LOCATION with a valid region where the dataset is located. For example, query theBigQuery public dataset usa_namesto determine the most common names in the United States between the years 1910 and 2013:

bq query \  
    --rpc=true \  
    --use_legacy_sql=false \  
    --job_creation_mode=JOB_CREATION_OPTIONAL \  
    --location=us \  
    'SELECT  
      name, gender,  
      SUM(number) AS total  
    FROM  
      `bigquery-public-data.usa_names.usa_1910_2013`  
    GROUP BY  
      name, gender  
    ORDER BY  
      total DESC  
    LIMIT  
      10;'

The query job returns the output inline in the response.

API

To run a query in optional job creation mode using the API, run a query synchronouslyand populate the QueryRequest property. Include the jobCreationMode property and set its value to JOB_CREATION_OPTIONAL.

Check the response. If jobComplete equals true and jobReference is empty, read the results from the rows field. You can also get the queryId from the response.

If jobReference is present, you can check jobCreationReason for why a job was created by BigQuery. Poll for results by callinggetQueryResults. Poll until jobComplete equals true. Check for errors and warnings in theerrors list.

Java

Available version: 2.51.0 and up

Before trying this sample, follow the Java setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

To run a query with a proxy, see Configuring a proxy.

Python

Available version: 3.34.0 and up

Before trying this sample, follow the Python setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Node

Available version: 8.1.0 and up

Before trying this sample, follow the Node.js setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Go

Available version: 1.69.0 and up

Before trying this sample, follow the Go setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

JDBC Driver

Available version: JDBC v1.6.1 and up

Requires setting JobCreationMode=2 in the connection string.

jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;JobCreationMode=2;Location=US;

ODBC Driver

Available version: ODBC v3.0.7.1016 and up

Requires setting JobCreationMode=2 in the .ini file.

[ODBC Data Sources]
Sample DSN=Simba Google BigQuery ODBC Connector 64-bit
[Sample DSN]
JobCreationMode=2

Quotas

For information about quotas regarding interactive and batch queries, seeQuery jobs.

Monitor queries

You can get information about queries as they are executing by using thejobs explorer or by querying theINFORMATION_SCHEMA.JOBS_BY_PROJECT view.

Dry run

A dry run in BigQuery provides the following information:

estimate of charges in on-demand mode
validation of your query
approximate bytes processed by your query in capacity mode

Dry runs don't use query slots, and you are not charged for performing a dry run. You can use the estimate returned by a dry run to calculate query costs in the pricing calculator.

Perform a dry run

To perform a dry run, do the following:

Console

Go to the BigQuery page.
Go to BigQuery
Enter your query in the query editor.
If the query is valid, then a check mark automatically appears along with the amount of data that the query will process. If the query is invalid, then an exclamation point appears along with an error message.

bq

Enter a query like the following using the --dry_run flag.

bq query
--use_legacy_sql=false
--dry_run
'SELECT COUNTRY, AIRPORT, IATA FROM project_id.dataset.airports LIMIT 1000'

For a valid query, the command produces the following response:

Query successfully validated. Assuming the tables are not modified, running this query will process 10918 bytes of data.

API

To perform a dry run by using the API, submit a query job withdryRun set to true in theJobConfigurationtype.

Go

Before trying this sample, follow the Go setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Java

Before trying this sample, follow the Java setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Node.js

Before trying this sample, follow the Node.js setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

PHP

Before trying this sample, follow the PHP setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery PHP API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

Python

Set theQueryJobConfig.dry_runproperty to True.Client.query()always returns a completedQueryJobwhen provided a dry run query configuration.

Before trying this sample, follow the Python setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

What's next

Learn how tomanage query jobs.
Learn how toview query history.
Learn how tosave and share queries.
Learn about query queues.
Learn how to write query results.