Best practices for using service accounts in pipelines

Deployment pipelines let you automate the process of taking code or pre-built artifacts and deploying them to a Google Cloud environment, and they can be an alternative to using interactive tools like the Google Cloud console or the Google Cloud CLI.

Deployment pipelines differ from interactive tools like the Google Cloud console or the gcloud CLI in the way they interact with Identity and Access Management (IAM), and you must take these differences into consideration when securing your Google Cloud resources.

Before Google Cloud lets you access a resource, it performs an access check. To perform this check, IAM typically considers:

In a deployment pipeline, you rarely call Google Cloud APIs directly. Instead, you use tools to access Google Cloud resources. Tools like the Google Cloud console or the gcloud CLI require that you first authorize the tool to access resources on your behalf. By providing this authorization, you give the tool permission to use your identity when making API calls.

Like the Google Cloud console or the gcloud CLI, a deployment pipeline acts on your behalf: it takes your changes, expressed as source code, and deploys them to Google Cloud. But unlike the Google Cloud console or the gcloud CLI, a deployment pipeline typically doesn't use your identity to perform the deployment:

  1. As a user, you typically don't interact with a deployment pipeline directly. Instead, you interact with a source control system (SCM) by pushing code changes to a source repository, or approving code reviews.
  2. The deployment pipeline reads submitted code changes from the SCM system and deploys them to Google Cloud.
    To perform the deployment, the deployment pipeline typically can't use your identity because:
    1. The source code and its metadata might not indicate that you were the author, or the author information isn't tamper-proof (as in the case of unsigned Git commits)
    2. The identity you used to submit source code might be different from your identity for Google Cloud, and the two identities can't be mapped
      Most deployment pipelines therefore perform deployments under their own identity by using a service account.
  3. When the deployment pipeline accesses Google Cloud, IAM allows or denies access solely based on the identity of the service account used by the pipeline, not the identity of your user account.

[Figure: Deployment pipeline]

Letting a deployment pipeline use a service account to access Google Cloud has some advantages:

However, using a service account also introduces new threats. These include:

Protect against spoofing threats

To grant a deployment pipeline access to Google Cloud, you typically do the following:

  1. Create a service account
  2. Grant the service account access to the required resources
  3. Configure the deployment pipeline to use the service account
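
The first two steps can be scripted. The following is a minimal sketch, not a definitive implementation: it only builds the `gcloud` argument lists and doesn't run them. The service account name `deploy-frontend`, the project ID `example-project`, and the role are hypothetical placeholders. Step 3 depends on your CI/CD system and isn't shown.

```python
from typing import List


def service_account_email(name: str, project_id: str) -> str:
    """Return the email address of a user-managed service account."""
    return f"{name}@{project_id}.iam.gserviceaccount.com"


def provisioning_commands(name: str, project_id: str, roles: List[str]) -> List[List[str]]:
    """Build gcloud argument lists for steps 1 and 2 (nothing is executed here)."""
    email = service_account_email(name, project_id)
    commands = [
        # Step 1: create the service account.
        ["gcloud", "iam", "service-accounts", "create", name, f"--project={project_id}"],
    ]
    for role in roles:
        # Step 2: grant the service account a role on the project.
        commands.append([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=serviceAccount:{email}",
            f"--role={role}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical names, used for illustration only.
    for argv in provisioning_commands("deploy-frontend", "example-project", ["roles/run.developer"]):
        print(" ".join(argv))
```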

From an IAM perspective, the service account represents the deployment pipeline, but the deployment pipeline and the service account are two separate entities. If not secured properly, a bad actor might be able to use the same service account, which lets them "spoof" the identity of the deployment pipeline.

The following section describes best practices that can help you reduce the risk of such threats.

Avoid attaching service accounts to VM instances used by CI/CD systems

For applications deployed on Compute Engine that need access to Google Cloud resources, it's typically best to attach a service account to the underlying VM instance. For CI/CD systems that use Compute Engine VMs to run different deployment pipelines, this practice can be problematic if the same VM instance might be used to run different deployment pipelines that each require access to different resources.

Instead of using attached service accounts to let deployment pipelines access resources, let each deployment pipeline use a separate service account. Avoid attaching a service account to VM instances used by CI/CD systems, or attach a service account that's limited to accessing essential services such as Cloud Logging only.

Use dedicated service accounts per deployment pipeline

When you let multiple deployment pipelines use the same service account, IAM can't differentiate between the pipelines. The pipelines have access to the same resources, and audit logs might not contain sufficient information to determine which deployment pipeline triggered a resource to be accessed or changed.

To avoid such ambiguity, maintain a 1:1 relationship between deployment pipelines and service accounts. Create a dedicated service account for each deployment pipeline and make sure to do the following:
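
Separately, if you keep an inventory of which service account each pipeline uses, you can check the 1:1 relationship automatically. The following sketch assumes such an inventory exists as a simple mapping. The pipeline names and service account addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def shared_service_accounts(pipeline_accounts: Dict[str, str]) -> Dict[str, List[str]]:
    """Return each service account used by more than one pipeline, with the pipelines sharing it."""
    pipelines_by_account: Dict[str, List[str]] = defaultdict(list)
    for pipeline, account in pipeline_accounts.items():
        pipelines_by_account[account].append(pipeline)
    return {
        account: sorted(pipelines)
        for account, pipelines in pipelines_by_account.items()
        if len(pipelines) > 1
    }


if __name__ == "__main__":
    # Hypothetical inventory: two pipelines share one service account.
    inventory = {
        "frontend-deploy": "deploy@example-project.iam.gserviceaccount.com",
        "backend-deploy": "deploy@example-project.iam.gserviceaccount.com",
        "infra-deploy": "infra@example-project.iam.gserviceaccount.com",
    }
    print(shared_service_accounts(inventory))
```

An empty result means that every pipeline in the inventory has its own service account.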

Use Workload Identity Federation whenever possible

Some CI/CD systems like GitHub Actions or GitLab let deployment pipelines obtain OpenID Connect-compliant tokens that assert the identity of the deployment pipeline. You can let deployment pipelines use these tokens to impersonate a service account by using Workload Identity Federation.

Using Workload Identity Federation helps you avoid the risks associated with using service account keys.
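
To let a federated identity impersonate a service account, you typically grant it the Workload Identity User role (`roles/iam.workloadIdentityUser`) on that service account. The following sketch builds such an allow policy binding. It assumes that your workload identity pool provider maps a `repository` attribute from the token. The project number, pool ID, and repository name are hypothetical.

```python
from typing import Dict, List, Union


def federated_principal_set(project_number: str, pool_id: str, attribute: str, value: str) -> str:
    """Build the principalSet identifier for all identities that carry a given attribute value."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/attribute.{attribute}/{value}"
    )


def impersonation_binding(
    project_number: str, pool_id: str, attribute: str, value: str
) -> Dict[str, Union[str, List[str]]]:
    """Build an allow policy binding to add to the service account's own policy."""
    return {
        "role": "roles/iam.workloadIdentityUser",
        "members": [federated_principal_set(project_number, pool_id, attribute, value)],
    }


if __name__ == "__main__":
    # Hypothetical values; the "repository" attribute must exist in your attribute mapping.
    print(impersonation_binding("123456789012", "ci-pool", "repository", "example-org/example-repo"))
```

Scoping the binding to a single attribute value, rather than to the whole pool, keeps other pipelines that use the same pool from impersonating the service account.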

Use VPC Service Controls to reduce the impact of leaked credentials

If a bad actor manages to steal an access token or service account key from one of your deployment pipelines, they might attempt to use this credential and access your resources from a different machine that they control.

By default, IAM doesn't take the geolocation, source IP address, or origin Google Cloud project into account when making access decisions. A stolen credential might therefore be usable from anywhere.

You can impose restrictions on the sources from where your Google Cloud resources can be accessed by placing your projects in a VPC service perimeter and using ingress rules:

Protect against tampering threats

For some data that you store on Google Cloud, you might find it particularly important to prevent unauthorized modification or deletion. If unauthorized modification or deletion is of particular concern, then you can characterize the data as high-integrity data.

To maintain the integrity of your data, you must ensure that the Google Cloud resources that you use to store and manage that data are configured securely, and you must maintain the integrity of those resources.

Deployment pipelines can help you maintain the integrity of your data and resources, but they can also pose a risk: If the pipeline or one of its components doesn't meet the integrity requirements of the resources it manages, then the deployment pipeline turns into a weak spot that might enable bad actors to tamper with your data or resources.

The following section describes best practices that can help you reduce the risk of tampering threats.

Limit access to security controls

To ensure the security and integrity of your data and resources on Google Cloud, you use security controls such as:

These security controls are resources by themselves. Tampering with security controls endangers the integrity of the resources that the security controls apply to. As a result, you must consider the integrity of security controls to be at least as important as the integrity of the resources they apply to.

If you let a deployment pipeline manage security controls, then it's up to the deployment pipeline to maintain the integrity of security controls. As a result, you must consider the integrity of the deployment pipeline itself to be at least as important as the integrity of the security controls it manages, and the resources these controls apply to.

You can limit a deployment pipeline's impact on the integrity of your resources by doing the following:

If your deployment pipeline, its components, and underlying infrastructure can't meet the integrity demands of certain security controls, it's best to avoid letting deployment pipelines manage these security controls.

Protect against non-repudiation threats

At some point, you might notice suspicious activity affecting one of your resources on Google Cloud. In that event, you must be able to find out more about the activity and, ideally, be able to reconstruct the chain of events that led to it.

Cloud Audit Logs let you find out when resources were accessed or modified, and which users were involved. Although Cloud Audit Logs provide a starting point for analyzing suspicious activity, the information provided by these logs might not be sufficient: if you use deployment pipelines, you must also be able to correlate Cloud Audit Logs with logs produced by your deployment pipeline.

This section contains best practices that can help you maintain an audit trail across Google Cloud and your deployment pipelines.

Ensure that you can correlate deployment pipeline logs with Cloud Audit Logs

Cloud Audit Logs contain timestamps and information about the user that initiated an activity. If you use a dedicated service account for each deployment pipeline, then this information lets you identify the deployment pipeline that initiated the activity and might also help you narrow down which code changes and pipeline runs could have been responsible. But identifying the exact pipeline run and code change that led to the activity can be difficult without more information that lets you correlate Cloud Audit Logs with the logs of your deployment pipeline.
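
Without additional correlation data, the best you can do is narrow down candidate pipeline runs by service account and time window. The following sketch illustrates that idea. It reads the `timestamp` and `protoPayload.authenticationInfo.principalEmail` fields of an audit log entry. The `PipelineRun` record and the 30-second slack are assumptions made for illustration, not part of any CI/CD system's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class PipelineRun:
    """Hypothetical record exported from your CI/CD system."""
    run_id: str
    commit: str
    service_account: str
    started: datetime
    finished: datetime


def candidate_runs(
    audit_entry: dict, runs: List[PipelineRun], slack: timedelta = timedelta(seconds=30)
) -> List[PipelineRun]:
    """Return runs whose service account and time window match the audit log entry."""
    principal = audit_entry["protoPayload"]["authenticationInfo"]["principalEmail"]
    # Assumes an RFC 3339 UTC timestamp with at most microsecond precision.
    when = datetime.fromisoformat(audit_entry["timestamp"].replace("Z", "+00:00"))
    return [
        run for run in runs
        if run.service_account == principal
        and run.started - slack <= when <= run.finished + slack
    ]


if __name__ == "__main__":
    sa = "deploy-frontend@example-project.iam.gserviceaccount.com"
    runs = [
        PipelineRun("run-1", "abc123", sa,
                    datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc),
                    datetime(2024, 5, 1, 10, 5, tzinfo=timezone.utc)),
    ]
    entry = {
        "timestamp": "2024-05-01T10:02:30Z",
        "protoPayload": {"authenticationInfo": {"principalEmail": sa}},
    }
    print([run.run_id for run in candidate_runs(entry, runs)])
```

If pipeline runs overlap or follow each other closely, this approach returns several candidates, which is why additional correlation data is valuable.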

You can enrich Cloud Audit Logs to contain more information in multiple ways, including:

You can also enrich the logs emitted by your deployment pipeline:

Align the retention periods of deployment pipeline logs and Cloud Audit Logs

To analyze suspicious activity related to a deployment pipeline, you typically need multiple types of logs, including Admin Activity audit logs, Data Access audit logs, and the logs of your deployment pipeline.

Cloud Logging only retains logs for a certain period of time. By default, this retention period is shorter for Data Access audit logs than for Admin Activity audit logs. The system that runs your deployment pipeline might also discard its logs after a certain time period. Depending on the nature of your deployment pipeline, and the importance of the resources that the deployment pipeline manages, these default retention periods might be insufficient or misaligned.

To ensure that logs are available when you need them, make sure that the log retention periods used by the different systems are aligned and sufficiently long.

If necessary, customize the retention period for Data Access audit logs, or set up a custom sink to route logs to a custom storage location.
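
A simple check can reveal misaligned retention periods. In the following sketch, the 400-day and 30-day values reflect what are, to the best of our knowledge, the Cloud Logging defaults for Admin Activity and Data Access audit logs. Verify them against your own configuration. The 90-day pipeline value and the 365-day requirement are hypothetical.

```python
from typing import Dict


def retention_gaps(retention_days: Dict[str, int], required_days: int) -> Dict[str, int]:
    """Return, for each log source that falls short, the number of missing retention days."""
    return {
        source: required_days - days
        for source, days in retention_days.items()
        if days < required_days
    }


if __name__ == "__main__":
    current = {
        "admin_activity_audit_logs": 400,  # assumed default; verify for your project
        "data_access_audit_logs": 30,      # assumed default; verify for your project
        "pipeline_logs": 90,               # hypothetical CI/CD system setting
    }
    # Hypothetical requirement: keep all logs for at least one year.
    print(retention_gaps(current, required_days=365))
```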

Protect against information disclosure threats

When a deployment pipeline's service account has access to confidential data, then a bad actor might attempt to use the deployment pipeline to exfiltrate that data. A deployment pipeline's access to data can be direct or indirect:

This section contains best practices that can help you limit the risk of disclosing confidential data.

Avoid granting direct access to confidential data

To deploy infrastructure, configuration, or new software versions, a deployment pipeline often doesn't require access to existing data. Instead, it's often sufficient to limit access to resources that don't contain any data, or at least don't contain confidential data.

Ways to minimize access to existing, potentially confidential data include:

Use VPC Service Controls to help prevent data exfiltration

You can reduce the risk of indirect data exfiltration by deploying your Google Cloud resources in a VPC Service Controls perimeter.

If your deployment pipeline runs outside of Google Cloud, or is part of a different perimeter, you can grant the pipeline's service account access to the perimeter by configuring an ingress rule. If possible, configure the ingress rule so that it only allows access from the IP addresses used by the deployment pipeline, and only permits access to the services that the deployment pipeline really needs.
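
The following sketch shows what such an ingress rule might look like, expressed as a Python dictionary that mirrors the ingress rule structure. It assumes that the IP address restriction is defined separately as an access level, which the rule then references. The access level name, project number, service account, and service list are hypothetical.

```python
import json
from typing import Dict, List


def ingress_rule(
    service_account_email: str, access_level: str, project_number: str, services: List[str]
) -> Dict[str, dict]:
    """Build an ingress rule that admits one service account from one access level."""
    return {
        "ingressFrom": {
            # Only the pipeline's service account may enter the perimeter...
            "identities": [f"serviceAccount:{service_account_email}"],
            # ...and only from sources that satisfy the access level (for example, IP ranges).
            "sources": [{"accessLevel": access_level}],
        },
        "ingressTo": {
            # Only the listed services are reachable through this rule.
            "operations": [
                {"serviceName": service, "methodSelectors": [{"method": "*"}]}
                for service in services
            ],
            "resources": [f"projects/{project_number}"],
        },
    }


if __name__ == "__main__":
    rule = ingress_rule(
        "deploy-frontend@example-project.iam.gserviceaccount.com",
        "accessPolicies/1234567890/accessLevels/ci_runner_ips",  # hypothetical access level
        "123456789012",
        ["run.googleapis.com"],
    )
    print(json.dumps(rule, indent=2))
```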

Protect against privilege escalation threats

When a deployment pipeline uses a service account to access Google Cloud resources, it does so irrespective of the developer or user who authored a code or configuration change. The disconnect between the pipeline's service account and the developer's identity makes deployment pipelines prone to confused deputy attacks, in which a bad actor tricks the pipeline into performing an action that the bad actor isn't allowed to do themselves, and that the pipeline might not even be supposed to perform.

This section contains best practices that can help you reduce the risk of your deployment pipeline being abused for privilege escalation.

Limit access to the deployment pipeline and all inputs

Most deployment pipelines use a source code repository as their main source of input and might trigger automatically as soon as they detect a code change in certain branches (for example, the main branch). Deployment pipelines typically can't verify whether the code and configuration they find in the source code repository is authentic and trustworthy. The security of this architecture therefore depends on:

For these controls to be effective, you must also ensure that bad actors can't sidestep them by:

When managed by a deployment pipeline, your resources on Google Cloud can only be as secure as your deployment pipeline, its configuration, infrastructure, and inputs. Therefore, you must protect these components as well as you want your Google Cloud resources to be protected.

Avoid letting a deployment pipeline modify policies

For most types of resources, IAM defines a RESOURCE_TYPE.setIamPolicy permission. This permission enables a user to modify a resource's allow policy, either to grant other users access or to modify and extend their own access. Unless constrained by a deny policy, granting a user or service account a *.setIamPolicy permission has the effect of granting them full access to the resource.

Whenever possible, avoid letting a deployment pipeline modify access to resources. When granting the pipeline's service account access to Google Cloud resources, use roles that don't include any *.setIamPolicy permission and avoid using the basic roles Editor and Owner.
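
You can review an allow policy for such grants automatically. The following sketch scans the bindings of an allow policy for a given member. It assumes that you supply the mapping from roles to permissions yourself (for example, from role metadata that you looked up beforehand). The mapping and policy shown here are illustrative only.

```python
from typing import Dict, List, Set

BASIC_ROLES = {"roles/owner", "roles/editor"}


def risky_bindings(policy: dict, member: str, role_permissions: Dict[str, Set[str]]) -> List[str]:
    """List roles granted to the member that are basic roles or include a setIamPolicy permission."""
    findings = []
    for binding in policy.get("bindings", []):
        if member not in binding.get("members", []):
            continue
        role = binding["role"]
        if role in BASIC_ROLES:
            findings.append(f"{role}: basic role")
        elif any(p.endswith(".setIamPolicy") for p in role_permissions.get(role, set())):
            findings.append(f"{role}: includes a setIamPolicy permission")
    return findings


if __name__ == "__main__":
    member = "serviceAccount:deploy-frontend@example-project.iam.gserviceaccount.com"
    policy = {
        "bindings": [
            {"role": "roles/editor", "members": [member]},
            {"role": "roles/storage.admin", "members": [member]},
            {"role": "roles/storage.objectViewer", "members": [member]},
        ]
    }
    # Illustrative, incomplete permission lists supplied by the caller.
    permissions = {
        "roles/storage.admin": {"storage.buckets.get", "storage.buckets.setIamPolicy"},
        "roles/storage.objectViewer": {"storage.objects.get", "storage.objects.list"},
    }
    for finding in risky_bindings(policy, member, permissions):
        print(finding)
```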

For some deployment pipelines, granting permission to modify allow policies or deny policies might be unavoidable: For example, a deployment pipeline's purpose might be to create new resources or manage access to existing resources. In these scenarios, you can still limit the extent to which the deployment pipeline can modify access by:

Don't reveal service account credentials in logs

The logs generated by a deployment pipeline are often accessible to a larger group of users, including users that don't have permission to modify the pipeline's configuration. It's possible that these logs accidentally reveal credentials by echoing the following:

If logs accidentally reveal credentials such as access tokens, then these credentials could be abused by bad actors to escalate their privileges. Ways to prevent logs from revealing credentials include the following:
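
As one illustration, a pipeline could filter its output before writing it to a log. The following sketch is a heuristic, not a complete safeguard: it assumes that Google OAuth 2.0 access tokens start with `ya29.` and that service account keys contain a PEM-encoded private key block. Other credential formats pass through unredacted.

```python
import re

# Heuristic: Google OAuth 2.0 access tokens commonly start with "ya29." (assumption).
ACCESS_TOKEN = re.compile(r"ya29\.[0-9A-Za-z_.\-]+")
# PEM private key block, as found in service account key files.
PRIVATE_KEY = re.compile(
    r"-----BEGIN PRIVATE KEY-----.*?-----END PRIVATE KEY-----", re.DOTALL
)


def redact(text: str) -> str:
    """Replace anything that looks like an access token or private key with a placeholder."""
    text = ACCESS_TOKEN.sub("[REDACTED ACCESS TOKEN]", text)
    return PRIVATE_KEY.sub("[REDACTED PRIVATE KEY]", text)


if __name__ == "__main__":
    # Hypothetical log line containing a made-up token.
    print(redact("Authorization: Bearer ya29.a0Abc-123_x"))
```

Pattern-based redaction is a last line of defense. It's more reliable to avoid writing credentials to standard output in the first place.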
