Set maximum instances for services
This page describes how to set the maximum number of instances that can be used for your Cloud Run service using the default Cloud Run autoscaling behavior. If you use manual scaling, you should also consult the documentation for manual scaling for information on how these billing settings work with manually scaled services.
Specifying maximum instances in Cloud Run lets you limit the scaling of your service in response to incoming requests, although this maximum setting can be exceeded for a brief period due to circumstances such as traffic spikes.
You can use this setting as a way to control your costs or to limit the number of connections to a backing service, such as to a database.
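As a rough illustration of the connection-limiting use case, the following sketch derives a maximum instances value from a database connection limit. The function name, the connection limit of 100, and the per-instance pool size of 5 are hypothetical values chosen for the example, and the sketch assumes that each instance opens at most a fixed number of connections.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): largest instance count that keeps
# the total number of database connections within a limit, assuming every
# instance opens at most POOL_SIZE connections.
max_instances_for_db() {
  local db_max_connections="$1"   # assumed connection limit of the database
  local pool_size="$2"            # assumed connections opened per instance
  # Integer division rounds down, so the result stays within the budget.
  echo $(( db_max_connections / pool_size ))
}

max_instances_for_db 100 5   # prints 20
```

Because the maximum can be exceeded for a brief period during traffic spikes, it may be prudent to choose a value somewhat below the computed result.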
For information about the maximum instance limits that might apply to your service, refer to Maximum instances limits.
For more information on the way Cloud Run autoscales container instances, refer to Instance autoscaling.
Required roles
To get the permissions that you need to configure and deploy Cloud Run services, ask your administrator to grant you the following IAM roles:
- Cloud Run Developer (roles/run.developer) on the Cloud Run service
- Service Account User (roles/iam.serviceAccountUser) on the service identity
For a list of IAM roles and permissions that are associated with Cloud Run, see Cloud Run IAM roles and Cloud Run IAM permissions. If your Cloud Run service interfaces with Google Cloud APIs, such as Cloud Client Libraries, see the service identity configuration guide. For more information about granting roles, see deployment permissions and manage access.
Set and update maximum instances
Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.
By default, Cloud Run services are configured to scale out to a maximum of 100 instances.
You can change the maximum instances setting using the Google Cloud console, the Google Cloud CLI, or a YAML file when you create a new service or deploy a new revision.
Console
- In the Google Cloud console, go to Cloud Run.
- Click Deploy container and select Service to configure a new service. If you are configuring an existing service, click the service, then click Edit and deploy new revision.
- If you are configuring a new service, fill out the initial service settings page, then click Container(s), volumes, networking, security to expand the service configuration page.
- Click the Container tab.
- In the field labelled Maximum number of instances, specify the desired maximum number of instances, using any integer value from 1 to the maximum limit.
- Click Create or Deploy.
gcloud
You can update the maximum number of instances of a given service by using the following command:
gcloud run services update SERVICE --max-instances MAX-VALUE
Replace
- SERVICE with the name of your service.
- MAX-VALUE with the desired maximum number of container instances, using any integer value from 1 to the maximum limit. Specify default to clear any maximum instance setting and restore the default of 100 instances.
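A script that wraps the update command might check MAX-VALUE before calling gcloud. The following is a minimal sketch of such a check; the function name is hypothetical, and it does not verify the upper limit, because that limit depends on your service.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the MAX-VALUE argument (not part of gcloud).
is_valid_max_value() {
  local value="$1"
  # "default" clears the setting and restores the default maximum.
  [[ "$value" == "default" ]] && return 0
  # Otherwise require a positive integer without leading zeros (1, 2, 3, ...).
  [[ "$value" =~ ^[1-9][0-9]*$ ]]
}

# Example use, left commented out so that the sketch has no side effects:
#   if is_valid_max_value "$MAX_VALUE"; then
#     gcloud run services update "$SERVICE" --max-instances "$MAX_VALUE"
#   fi
```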
You can also set the maximum number of instances during deployment using the command:
gcloud run deploy --image IMAGE_URL --max-instances MAX-VALUE
Replace
- IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG.
- MAX-VALUE with the desired maximum number of container instances.
YAML
- If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
  gcloud run services describe SERVICE --format export > service.yaml
- Update the autoscaling.knative.dev/maxScale: attribute:
  apiVersion: serving.knative.dev/v1
  kind: Service
  metadata:
    name: SERVICE
  spec:
    template:
      metadata:
        annotations:
          autoscaling.knative.dev/maxScale: 'MAX-INSTANCE'
        name: REVISION
  Replace
  - SERVICE with the name of your Cloud Run service.
  - MAX-INSTANCE with the required maximum number.
  - REVISION with a new revision name or delete it (if present). If you supply a new revision name, it must meet the following criteria:
    * Starts with SERVICE-
    * Contains only lowercase letters, numbers and -
    * Does not end with a -
    * Does not exceed 63 characters
- Create or update the service using the following command:
  gcloud run services replace service.yaml
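The revision name criteria in the YAML steps above can be checked before you run the replace command. The following is a minimal sketch; the function name and the example names are hypothetical, and the check only mirrors the four criteria listed above.

```shell
#!/usr/bin/env bash
# Hypothetical check of a revision name against the listed criteria.
is_valid_revision_name() {
  local service="$1" revision="$2"
  # Use the C locale so that [a-z] matches ASCII lowercase letters only.
  local LC_ALL=C
  # Must start with "SERVICE-".
  [[ "$revision" == "$service"-* ]] || return 1
  # Must contain only lowercase letters, numbers, and hyphens.
  [[ "$revision" =~ ^[a-z0-9-]+$ ]] || return 1
  # Must not end with a hyphen.
  [[ "$revision" != *- ]] || return 1
  # Must not exceed 63 characters.
  (( ${#revision} <= 63 ))
}

# Example use: is_valid_revision_name hello hello-v2 && echo "name accepted"
```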
Terraform
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
The following google_cloud_run_v2_service resource specifies a maximum number of instances of 10 under template.scaling. Replace 10 with your required maximum number of instances.
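The resource itself did not survive in this copy of the page. The following is a reconstruction under stated assumptions, not the original sample: the resource label, the service name, the location, and the container image are placeholders chosen for illustration, and only the max_instance_count setting under template.scaling reflects the text above.

```hcl
resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-max-instances" # assumed service name
  location = "us-central1"                    # assumed region

  template {
    containers {
      # Assumed sample image; replace with your own container image.
      image = "us-docker.pkg.dev/cloudrun/container/hello:latest"
    }
    scaling {
      # Maximum number of instances for revisions of this service.
      max_instance_count = 10
    }
  }
}
```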
View maximum instances settings
To view the current maximum instances settings for your Cloud Run service:
Console
- In the Google Cloud console, go to Cloud Run.
- Click the service you are interested in to open the Service details page.
- Click the Revisions tab.
- In the details panel at the right, the maximum instances setting is listed under the Container tab.
gcloud
- Use the following command:
  gcloud run services describe SERVICE
- Locate the maximum instances setting in the returned configuration.
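To avoid scanning the full configuration by eye, you could filter the exported YAML for the maxScale annotation. The following is a minimal sketch; the function name is hypothetical, and it assumes that the annotation appears on a single line in the form shown in the YAML steps earlier on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an exported service configuration on stdin and
# print the value of the maxScale annotation. Prints nothing if it is absent.
max_scale_from_yaml() {
  awk -F': *' '/autoscaling\.knative\.dev\/maxScale:/ {
    gsub(/[^0-9]/, "", $2)   # strip quotes and whitespace around the number
    print $2
  }'
}

# Example use, left commented out because it requires gcloud and a service:
#   gcloud run services describe SERVICE --format export | max_scale_from_yaml
```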