Update dataset properties


This document describes how to update dataset properties in BigQuery. After you create a dataset, you can update the following dataset properties:

• Description
• Default table expiration time
• Default partition expiration time
• Default rounding mode
• Time travel window
• Storage billing model

Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required permissions

To update dataset properties, you need the following IAM permissions:

• bigquery.datasets.update
• bigquery.datasets.get

The roles/bigquery.dataOwner predefined IAM role includes the permissions that you need to update dataset properties.

Additionally, if you have the bigquery.datasets.create permission, you can update properties of the datasets that you create.

For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Update dataset descriptions

You can update a dataset's description by using the Google Cloud console, a SQL DDL statement, the bq command-line tool, the BigQuery API, or the client libraries.

To update a dataset's description:

Console

  1. In the Explorer panel, expand your project and select a dataset.
  2. Expand the Actions option and click Open.
  3. In the Details panel, click Edit details to edit the description text.
    In the Edit details dialog that appears, do the following:
    1. In the Description field, enter a description or edit the existing description.
    2. To save the new description text, click Save.

SQL

To update a dataset's description, use the ALTER SCHEMA SET OPTIONS statement to set the description option.

The following example sets the description on a dataset named mydataset:

  1. In the Google Cloud console, go to the BigQuery page.
    Go to BigQuery
  2. In the query editor, enter the following statement:
    ALTER SCHEMA mydataset
    SET OPTIONS (
    description = 'Description of mydataset');
  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

Issue the bq update command with the --description flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update \
--description "string" \
project_id:dataset

Replace the following:

• string: the description text, in quotes
• project_id: your project ID
• dataset: the name of the dataset that you're updating

Examples:

Enter the following command to change the description of mydataset to "Description of mydataset." mydataset is in your default project.

bq update --description "Description of mydataset" mydataset

Enter the following command to change the description of mydataset to "Description of mydataset." The dataset is in myotherproject, not your default project.

bq update \
--description "Description of mydataset" \
myotherproject:mydataset

API

Call datasets.patch and update the description property in the dataset resource. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
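Because datasets.patch modifies only the fields you send, the REST call can be sketched with Python's standard library alone. The project, dataset, and token values below are placeholders, and the final urlopen call is shown but not executed:

```python
import json
import urllib.request

BASE = "https://bigquery.googleapis.com/bigquery/v2"

def build_patch_request(project_id, dataset_id, token, body):
    """Build a PATCH request for datasets.patch. Only the fields in
    `body` are modified, which is why patch is preferred over update."""
    url = f"{BASE}/projects/{project_id}/datasets/{dataset_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )

# Only the description field is sent; all other dataset
# properties are left untouched.
req = build_patch_request(
    "my-project", "mydataset", "ACCESS_TOKEN",
    {"description": "Description of mydataset"},
)
# urllib.request.urlopen(req) would send the request.
```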

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Create a Dataset.Builder instance from an existing Dataset instance with the Dataset.toBuilder() method. Configure the dataset builder object. Build the updated dataset with the Dataset.Builder.build() method, and call the Dataset.update() method to send the update to the API.

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Configure the Dataset.description property and call Client.update_dataset() to send the update to the API.

Update default table expiration times

You can update a dataset's default table expiration time in the following ways:

You can set a default table expiration time at the dataset level, or you can set a table's expiration time when the table is created. If you set the expiration when the table is created, the dataset's default table expiration is ignored. If you don't set a default table expiration at the dataset level, and you don't set a table expiration when the table is created, the table never expires and you must delete the table manually. When a table expires, it's deleted along with all of the data it contains.

When you update a dataset's default table expiration setting, the new value applies only to tables created after the change. The expiration of existing tables is not affected.

The value for default table expiration is expressed differently depending on where the value is set. Use the method that gives you the appropriate level of granularity:

• Google Cloud console and SQL: the expiration is expressed in days
• bq command-line tool: the expiration is expressed in seconds
• API: the expiration is expressed in milliseconds
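As a quick sanity check on these units, a small illustrative Python helper converts a day-based value (the console and SQL unit) into the seconds the bq tool expects and the milliseconds the API expects:

```python
def expiration_forms(days: float) -> tuple[int, int]:
    """Convert a default table expiration given in days (console/SQL
    unit) to seconds (bq flag) and milliseconds (API field)."""
    seconds = int(days * 24 * 60 * 60)
    return seconds, seconds * 1000

# 3.75 days is 324,000 seconds, or 324,000,000 milliseconds.
seconds, millis = expiration_forms(3.75)
```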

To update the default expiration time for a dataset:

Console

  1. In the Explorer panel, expand your project and select a dataset.
  2. Expand the Actions option and click Open.
  3. In the details panel, click the pencil icon next to Dataset info to edit the expiration.
  4. In the Dataset info dialog, in the Default table expiration section, enter a value for Number of days after table creation.
  5. Click Save.

SQL

To update the default table expiration time, use the ALTER SCHEMA SET OPTIONS statement to set the default_table_expiration_days option.

The following example updates the default table expiration for a dataset named mydataset.

  1. In the Google Cloud console, go to the BigQuery page.
    Go to BigQuery
  2. In the query editor, enter the following statement:
    ALTER SCHEMA mydataset
    SET OPTIONS(
    default_table_expiration_days = 3.75);
  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

To update the default expiration time for newly created tables in a dataset, enter the bq update command with the --default_table_expiration flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update \
--default_table_expiration integer \
project_id:dataset

Replace the following:

• integer: the default lifetime, in seconds, for newly created tables
• project_id: your project ID
• dataset: the name of the dataset that you're updating

Examples:

Enter the following command to set the default table expiration for new tables created in mydataset to two hours (7200 seconds) from the current time. The dataset is in your default project.

bq update --default_table_expiration 7200 mydataset

Enter the following command to set the default table expiration for new tables created in mydataset to two hours (7200 seconds) from the current time. The dataset is in myotherproject, not your default project.

bq update --default_table_expiration 7200 myotherproject:mydataset

API

Call datasets.patch and update the defaultTableExpirationMs property in the dataset resource. The expiration is expressed in milliseconds in the API. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
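Because the API takes milliseconds, a value used with the bq tool in seconds must be multiplied by 1000. A minimal sketch of building the patch body, assuming the REST convention that int64 fields are serialized as JSON strings:

```python
def default_table_expiration_body(seconds: int) -> dict:
    """Build a datasets.patch body for defaultTableExpirationMs.
    The API expects milliseconds; int64 fields travel as strings
    in the JSON representation."""
    return {"defaultTableExpirationMs": str(seconds * 1000)}

# Two hours, matching the bq examples above:
body = default_table_expiration_body(7200)
```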

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Create a Dataset.Builder instance from an existing Dataset instance with the Dataset.toBuilder() method. Configure the dataset builder object. Build the updated dataset with the Dataset.Builder.build() method, and call the Dataset.update() method to send the update to the API.

Configure the default expiration time with the Dataset.Builder.setDefaultTableLifetime() method.

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

Configure the Dataset.default_table_expiration_ms property and call Client.update_dataset() to send the update to the API.

Update default partition expiration times

You can update a dataset's default partition expiration by using a SQL DDL statement, the bq command-line tool, or the BigQuery API.

Setting or updating a dataset's default partition expiration isn't currently supported by the Google Cloud console.

You can set a default partition expiration time at the dataset level that affects all newly created partitioned tables, or you can set a partition expiration time for individual tables when the partitioned tables are created. If both the default partition expiration and the default table expiration are set at the dataset level, new partitioned tables have only a partition expiration: the default partition expiration overrides the default table expiration.

If you set the partition expiration time when the partitioned table is created, that value overrides the dataset-level default partition expiration if it exists.

If you do not set a default partition expiration at the dataset level, and you do not set a partition expiration when the table is created, the partitions never expire and you must delete the partitions manually.

When you set a default partition expiration on a dataset, the expiration applies to all partitions in all partitioned tables created in the dataset. When you set the partition expiration on a table, the expiration applies to all partitions created in the specified table. Currently, you cannot apply different expiration times to different partitions in the same table.

When you update a dataset's default partition expiration setting, the new value applies only to partitioned tables created after the change. The expiration of partitions in existing tables is not affected.

The value for default partition expiration is expressed differently depending on where the value is set. Use the method that gives you the appropriate level of granularity:

• SQL: the expiration is expressed in days
• bq command-line tool: the expiration is expressed in seconds
• API: the expiration is expressed in milliseconds

To update the default partition expiration time for a dataset:

Console

Updating a dataset's default partition expiration is not currently supported by the Google Cloud console.

SQL

To update the default partition expiration time, use the ALTER SCHEMA SET OPTIONS statement to set the default_partition_expiration_days option.

The following example updates the default partition expiration for a dataset named mydataset:

  1. In the Google Cloud console, go to the BigQuery page.
    Go to BigQuery
  2. In the query editor, enter the following statement:
    ALTER SCHEMA mydataset
    SET OPTIONS(
    default_partition_expiration_days = 3.75);
  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

To update the default expiration time for a dataset, enter the bq update command with the --default_partition_expiration flag. If you are updating a dataset in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

bq update \
--default_partition_expiration integer \
project_id:dataset

Replace the following:

• integer: the default lifetime, in seconds, for partitions in newly created partitioned tables
• project_id: your project ID
• dataset: the name of the dataset that you're updating

Examples:

Enter the following command to set the default partition expiration for new partitioned tables created in mydataset to 26 hours (93,600 seconds). The dataset is in your default project.

bq update --default_partition_expiration 93600 mydataset

Enter the following command to set the default partition expiration for new partitioned tables created in mydataset to 26 hours (93,600 seconds). The dataset is in myotherproject, not your default project.

bq update --default_partition_expiration 93600 myotherproject:mydataset

API

Call datasets.patch and update the defaultPartitionExpirationMs property in the dataset resource. The expiration is expressed in milliseconds. Because the datasets.update method replaces the entire dataset resource, the datasets.patch method is preferred.
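The patch body can be built the same way as for the table expiration; a sketch, assuming the REST convention that int64 fields are serialized as JSON strings:

```python
def default_partition_expiration_body(seconds: int) -> dict:
    """Build a datasets.patch body for defaultPartitionExpirationMs
    (milliseconds; int64 fields are JSON strings in the REST API)."""
    return {"defaultPartitionExpirationMs": str(seconds * 1000)}

# 26 hours, matching the bq examples above:
body = default_partition_expiration_body(93600)
```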

Update rounding mode

You can update a dataset's default rounding mode by using the ALTER SCHEMA SET OPTIONS DDL statement. The following example updates the default rounding mode for mydataset to ROUND_HALF_EVEN.

ALTER SCHEMA mydataset
SET OPTIONS (
default_rounding_mode = "ROUND_HALF_EVEN");

This sets the default rounding mode for new tables created in the dataset. It has no impact on new columns added to existing tables. Setting the default rounding mode on a table in the dataset overrides this option.

Update time travel windows

You can update a dataset's time travel window by using the Google Cloud console, a SQL DDL statement, the bq command-line tool, or the BigQuery API.

For more information on the time travel window, see Configure the time travel window.

To update the time travel window for a dataset:

Console

  1. In the Explorer panel, expand your project and select a dataset.
  2. Expand the Actions option and click Open.
  3. In the Details panel, click Edit details.
  4. Expand Advanced options, then select the Time travel window to use.
  5. Click Save.

SQL

Use the ALTER SCHEMA SET OPTIONS statement with the max_time_travel_hours option to specify the time travel window when altering a dataset. The max_time_travel_hours value must be an integer expressed in multiples of 24 (48, 72, 96, 120, 144, 168) between 48 (2 days) and 168 (7 days).

  1. In the Google Cloud console, go to the BigQuery page.
    Go to BigQuery
  2. In the query editor, enter the following statement:
    ALTER SCHEMA DATASET_NAME
    SET OPTIONS(
    max_time_travel_hours = HOURS);
    Replace the following:
    • DATASET_NAME: the name of the dataset that you're updating
    • HOURS: the time travel window's duration, in hours
  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

Use the bq update command with the --max_time_travel_hours flag to specify the time travel window when altering a dataset. The --max_time_travel_hours value must be an integer expressed in multiples of 24 (48, 72, 96, 120, 144, 168) between 48 (2 days) and 168 (7 days).

bq update \
--dataset=true --max_time_travel_hours=HOURS \
PROJECT_ID:DATASET_NAME

Replace the following:

• HOURS: the time travel window's duration, in hours
• PROJECT_ID: your project ID
• DATASET_NAME: the name of the dataset that you're updating

API

Call the datasets.patch or datasets.update method with a defined dataset resource in which you have specified a value for the maxTimeTravelHours field. The maxTimeTravelHours value must be an integer expressed in multiples of 24 (48, 72, 96, 120, 144, 168) between 48 (2 days) and 168 (7 days).
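The multiples-of-24 constraint can be checked client-side before calling the API; a hypothetical validator:

```python
def validate_time_travel_hours(hours: int) -> int:
    """Reject values that aren't a multiple of 24 between
    48 (2 days) and 168 (7 days)."""
    if hours % 24 != 0 or not 48 <= hours <= 168:
        raise ValueError(f"invalid time travel window: {hours} hours")
    return hours
```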

Update storage billing models

You can alter the storage billing model for a dataset. Set the storage_billing_model value to PHYSICAL to use physical bytes when calculating storage charges, or to LOGICAL to use logical bytes. LOGICAL is the default.

When you change a dataset's billing model, it takes 24 hours for the change to take effect.

After you change a dataset's storage billing model, you must wait 14 days before you can change the storage billing model again.
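The 14-day waiting period can be illustrated with a small helper (the dates here are placeholders):

```python
from datetime import datetime, timedelta

def next_allowed_change(last_change: datetime) -> datetime:
    """Earliest time the storage billing model can be changed again,
    given the 14-day waiting period after the previous change."""
    return last_change + timedelta(days=14)

# A model changed on 2024-01-01 can next change on 2024-01-15.
```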

Console

  1. In the Explorer panel, expand your project and select a dataset.
  2. Expand the Actions option and click Open.
  3. In the Details panel, click Edit details.
  4. Expand Advanced options, then select Enable physical storage billing model to use physical storage billing, or deselect it to use logical storage billing.
  5. Click Save.

SQL

To update the billing model for a dataset, use the ALTER SCHEMA SET OPTIONS statement and set the storage_billing_model option:

  1. In the Google Cloud console, go to the BigQuery page.
    Go to BigQuery
  2. In the query editor, enter the following statement:
    ALTER SCHEMA DATASET_NAME
    SET OPTIONS(
    storage_billing_model = 'BILLING_MODEL');
    Replace the following:
    • DATASET_NAME: the name of the dataset that you're changing
    • BILLING_MODEL: the type of storage you want to use, either LOGICAL or PHYSICAL
  3. Click Run.

For more information about how to run queries, see Run an interactive query.

To update the storage billing model for all datasets in a project, run the following SQL query for each region where your datasets are located:

FOR record IN (
  SELECT CONCAT(catalog_name, '.', schema_name) AS dataset_path
  FROM `PROJECT_ID`.`region-REGION`.INFORMATION_SCHEMA.SCHEMATA
)
DO
  EXECUTE IMMEDIATE
    "ALTER SCHEMA " || record.dataset_path || " SET OPTIONS(storage_billing_model = 'BILLING_MODEL')";
END FOR;

Replace the following:

• PROJECT_ID: your project ID
• REGION: the region where your datasets are located
• BILLING_MODEL: the type of storage you want to use, either LOGICAL or PHYSICAL

bq

To update the billing model for a dataset, use the bq update command and set the --storage_billing_model flag:

bq update -d --storage_billing_model=BILLING_MODEL PROJECT_ID:DATASET_NAME

Replace the following:

• BILLING_MODEL: the type of storage you want to use, either LOGICAL or PHYSICAL
• PROJECT_ID: your project ID
• DATASET_NAME: the name of the dataset that you're updating

API

Call the datasets.update method with a defined dataset resource where the storageBillingModel field is set.

The following example shows how to call datasets.update using curl:

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-L -X PUT \
https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET_ID \
-d '{"datasetReference": {"projectId": "PROJECT_ID", "datasetId": "DATASET_NAME"}, "storageBillingModel": "BILLING_MODEL"}'

Replace the following:

• PROJECT_ID: your project ID
• DATASET_ID and DATASET_NAME: the name of the dataset that you're updating
• BILLING_MODEL: the type of storage you want to use, either LOGICAL or PHYSICAL
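The same PUT call can be sketched in Python with only the standard library. The values passed in are placeholders, and the final urlopen call is shown but not executed:

```python
import json
import urllib.request

def build_update_request(project_id, dataset_id, token, billing_model):
    """Build the PUT request shown in the curl example. datasets.update
    replaces the whole dataset resource, so the datasetReference is
    required in the body."""
    url = (
        "https://bigquery.googleapis.com/bigquery/v2"
        f"/projects/{project_id}/datasets/{dataset_id}"
    )
    body = {
        "datasetReference": {"projectId": project_id, "datasetId": dataset_id},
        "storageBillingModel": billing_model,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )

req = build_update_request("my-project", "mydataset", "ACCESS_TOKEN", "PHYSICAL")
# urllib.request.urlopen(req) would send the request.
```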

Update access controls

To control access to datasets in BigQuery, see Controlling access to datasets. For information about data encryption, see Encryption at rest.

What's next