Authorized datasets (original) (raw)
This document describes how to use authorized datasets in BigQuery. An authorized dataset lets you authorize all of the views in a specified dataset to access the data in a second dataset. With an authorized dataset, you don't need to configure individualauthorized views.
Overview
A view in BigQuery is a virtual table defined by a SQL query. For example, a view's query might return only a subset of the columns of a table, excluding columns that contain personal identifiable information (PII). To query a view, a user needs to have access to the resources that are accessed by the view's query.
Authorized views
If you want to let users query a view, without giving them direct access to the resources referenced by the view, you can use anauthorized view. When you create an authorized view, you can share either a logical view or a materialized view. When you authorize a materialized view, it's referred to as an_authorized materialized view_.
An authorized view lets you, for example, share more limited data in a view with specified groups or users (principals), without giving the principals access to all of the underlying data. Principals can view the data you share and run queries on it, but they can't access the source dataset directly. Instead, the authorized view has access to the source data.
Authorized datasets
If you want to give a collection of views access to a dataset, without having to authorize each individual view, you can group the views together into a dataset, and then give the dataset that contains the views access to the dataset that contains the data. You can then give principals access to the dataset with the group of views, or to individual views in the dataset, as needed.
A dataset that has access to another dataset is called an authorized dataset. The dataset that authorizes another dataset to access its data is called the shared dataset.
Required permissions and roles
To authorize a dataset, or to revoke a dataset's authorization, you must have the followingIdentity and Access Management (IAM) permissions, which let you update the access control list of the dataset you are sharing.
After a dataset is authorized, you need these same permissions if you want to create or update views in the authorized dataset. For more information, seeCreate or update a view in an authorized dataset.
Permission | Resource |
---|---|
bigquery.datasets.get | The dataset you are sharing. |
bigquery.datasets.update | The dataset you are sharing. |
The following predefinedIAM rolesprovide the required permissions.
Role | Description |
---|---|
bigquery.dataOwner | BigQuery Data Owner |
bigquery.admin | BigQuery Admin |
Quotas and limits
Authorized datasets are subject to dataset limits. For more information, see Dataset limits.
Authorize a dataset
You can authorize a dataset's current and future views to access another dataset by adding the dataset you want to authorize to the access list of the dataset you want to share, as follows:
Console
- In the Google Cloud console, go to the BigQuery page.
Go to BigQuery - In the Explorer panel, expand your project and select the dataset that contains the data you want to share.
- Click theActions icon and select Open.
- In the details pane that appears, click Sharing and select theAuthorize Datasets option.
- In the Authorized dataset pane that appears, enter the Dataset IDof the dataset that you want to authorize, in the following format:
PROJECT.AUTHORIZED_DATASET
For example:myProject.myDataset
- Click Add Authorization and then click Close.
bq
- Open the Cloud Shell:
Go to Cloud Shell - Write the existing metadata (including the access control list) for the dataset you want to share into a JSON file by using thebq show command.
bq show --format=prettyjson PROJECT:SHARED_DATASET > FILE_PATH - Use a text editor to add the dataset that you want to authorize into the existing
access
section of the JSON file that was created atFILE_PATH.
For example:
"access": [
...
{
"dataset": {
"dataset": {
"project_id": "PROJECT",
"dataset_id": "AUTHORIZED_DATASET"
},
"target_types": "VIEWS"
}
}
] - Update the shared dataset by using thebq updatecommand. For example:
bq update --source FILE_PATH PROJECT:SHARED_DATASET - To verify that the authorized dataset has been added, enter the
bq show
command again. For example:
bq show --format=prettyjson PROJECT:SHARED_DATASET
API
- Get the current metadata for the dataset you want to share by calling thedatasets.getmethod, as follows:
GET https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
The response body returns a Datasetresource that contains JSON metadata for the dataset. - Add the dataset that you want authorize into the
access
section of the JSON metadata that was returned in theDataset
resource as follows:
"access": [
...
{
"dataset": {
"dataset": {
"project_id": "PROJECT",
"dataset_id": "AUTHORIZED_DATASET"
},
"target_types": "VIEWS"
}
}
] - Use the datasets.updatemethod to update the dataset with the added authorization:
PUT https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
Include the updatedDataset
resource in the request body. - You can verify that the authorized dataset has been added by calling thedatasets.getmethod again.
Revoke a dataset's authorization
When you delete a dataset authorized to access another source dataset, it can take up to 24 hours for the change to fully reflect in the source dataset'saccess control lists (ACLs). During this time:
- You won't be able to access the source data through the deleted dataset.
- The deleted dataset might still appear in the source dataset's ACL and count towards any authorized dataset limits. This could prevent you from creating new authorized datasets until the ACL is updated.
To revoke the access granted to the views in an authorized dataset, remove the authorized dataset from the shared dataset's access list, as follows:
Console
- In the Google Cloud console, go to the BigQuery page.
Go to BigQuery - In the Explorer panel, expand your project and select the shared dataset.
- Click theActions icon and select Open.
- In the details pane that appears, click Sharing and select theAuthorize Datasets option.
- In the Authorized dataset pane that appears, find the entry for the authorized dataset in the Currently authorized datasets section.
- Click the delete icon next to the authorized dataset you want to remove, and then click Close.
bq
- Open the Cloud Shell:
Go to Cloud Shell - Write the existing metadata (including the access control list) for the shared dataset into a JSON file by using thebq show command.
bq show --format=prettyjson PROJECT:SHARED_DATASET > FILE_PATH - Use a text editor to remove the authorized dataset from the
access
section of the JSON file that was created at FILE_PATH, as follows:
{
"dataset": {
"dataset": {
"project_id": "PROJECT",
"dataset_id": "AUTHORIZED_DATASET"
},
"target_types": "VIEWS"
}
} - Update the shared dataset by using thebq updatecommand. For example:
bq update --source FILE_PATH PROJECT:SHARED_DATASET - To verify that the authorized dataset has been removed, enter the
bq show
command again. For example:
bq show --format=prettyjson PROJECT:SHARED_DATASET
API
- Get the current metadata for the shared dataset by calling thedatasets.getmethod, as follows:
GET https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
The response body returns a Datasetresource that contains JSON metadata for the dataset. - Remove the authorized dataset from the
access
section of the JSON that was returned in theDataset
resource, for example:
{
"dataset": {
"dataset": {
"project_id": "PROJECT",
"dataset_id": "AUTHORIZED_DATASET"
},
"target_types": "VIEWS"
}
} - Use the datasets.updatemethod to update the dataset with the removed authorization:
PUT https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
Include the updatedDataset
resource in the request body. - You can verify that the authorized dataset has been removed by calling thedatasets.getmethod again.
Create or update a view in an authorized dataset
To create or update a view that is in an authorized dataset, you must have the permissions for the shared dataset that are listed inRequired permissions and roles, in addition to the permissions that are required tocreate orupdate a view in a standard dataset.
The following table summarizes the necessaryIdentity and Access Management (IAM) permissionsto create or update a view that is in an authorized dataset:
Permission | Resource |
---|---|
bigquery.datasets.get | The dataset you are sharing. |
bigquery.tables.getData | Any tables or views from the shared dataset that are referenced in the new view you are creating or updating. |
bigquery.tables.create | The authorized dataset in which you are creating a view. |
bigquery.tables.update | The authorized dataset in which you are updating a view. |
You don't need any additional permissions todelete a view from an authorized dataset.
Query a view in an authorized dataset
To query a view in an authorized dataset, a user needs to have access to the view, but access to the shared dataset is not required.
For more information, seeAuthorized views.
Authorized dataset example
The following example describes how to create and use an authorized dataset.
Assume you have two datasets, named private_dataset
and public_dataset
. The private_dataset
dataset contains a table named private_table
. Thepublic_dataset
dataset contains a view named private_table_filtered
. Theprivate_table_filtered
view is based on a query that returns some, but not all, of the fields in the private_table
table.
You can give a user access to the data returned by the private_table_filtered
view, but not all of the data in the private_table
table, as follows:
- Grant the
bigquery.dataViewer
role to the user for thepublic_dataset
dataset. This role includes thebigquery.tables.getData
permission, which lets the user query the views in thepublic_dataset
dataset. For information about how to grant a role to a user for a dataset, seeControlling access to datasets.
The user now has permission to query views in thepublic_dataset
, but they still cannot access theprivate_table
table inprivate_dataset
. If the user tries to query theprivate_table
table directly, or if they try to access theprivate_table
table indirectly by querying theprivate_table_filtered
view, they get an error message similar to the following:Access Denied: Table PROJECT:private_dataset.private_table: User does not have permission to query tablePROJECT:private_dataset.private_table.
- In the BigQuery page of the Google Cloud console, open the
private_dataset
dataset, clickSharing, and then select Authorize Datasets. - In the Authorized dataset pane that appears, enter
PROJECT.public_dataset
in the Dataset ID field, and then click Add Authorization.
Thepublic_dataset
dataset is added to the access control list of theprivate_dataset
dataset, authorizing the views in thepublic_dataset
dataset to query the data in theprivate_dataset
dataset.
The user can now query theprivate_table_filtered
view in thepublic_dataset
dataset, which indirectly accesses theprivate_dataset
dataset, without having any permissions to directly access data in theprivate_dataset
dataset.
Limitations
- You can create authorized datasets in different regions, but BigQuery doesn't support cross-region queries. Therefore, we recommend that you create datasets in the same region.
What's next
- For information about authorizing an individual view to access data in a dataset, see Authorized views.
- For information about authorizing a table function or a user-defined function to access data in a dataset, seeAuthorized functions.