Network bandwidth

Google Cloud accounts for bandwidth per compute instance, not per virtual network interface (vNIC) or IP address. An instance's machine type defines its maximum possible egress rate; however, you can achieve that maximum only in specific situations.

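This page outlines the network bandwidth limits, which are useful when planning your deployments. It categorizes bandwidth using two dimensions: the traffic direction (egress or ingress) and how the packet is routed (within a VPC network or outside a VPC network).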

All of the information on this page is applicable to Compute Engine compute instances, as well as products that depend on Compute Engine instances. For example, a Google Kubernetes Engine node is a Compute Engine instance.

Configurations that impact network bandwidth

Neither additional virtual network interfaces (vNICs) nor additional IP addresses per vNIC increase ingress or egress bandwidth for a compute instance. For example, a C3 VM with 22 vCPUs is limited to 23 Gbps total egress bandwidth. If you configure the C3 VM with two vNICs, the VM is still limited to 23 Gbps total egress bandwidth, not 23 Gbps bandwidth per vNIC.

The following sections describe how other compute instance configurations can impact network bandwidth.

Use per VM Tier_1 networking performance

To get the highest possible ingress and egress bandwidth, configure Tier_1 networking for your compute instance.

Dynamic Network Interfaces

Dynamic Network Interfaces use the bandwidth of their parent vNIC. There is no traffic isolation within a parent vNIC, so network traffic from one Dynamic NIC can starve the other Dynamic NICs associated with the same parent vNIC. To avoid this conflict, you can use Linux traffic control (TC) to craft application-specific traffic shaping policies. These policies help implement either fairness or certain types of priority. For prioritization, you map traffic (for example, traffic for Dynamic NICs) to a traffic class, and then map that traffic class to a quality of service. For an example of this approach, see Traffic shaping with Red Hat.

Dynamic NICs aren't supported for Compute Engine instances that run a Windows OS.

Bandwidth sharing with Cloud RDMA

Applications using Cloud RDMA typically use both TCP and RDMA traffic. For example, during high performance computing (HPC) jobs, applications commonly use TCP for communication during load and store phases, and RDMA for compute stages. H4D instances using gVNIC for TCP traffic provide up to 200 Gbps network bandwidth. If you also configure an H4D instance to use Cloud RDMA, then the network bandwidth is shared among the configured network interfaces.

Network bandwidth allocation between Cloud RDMA traffic and TCP traffic is done dynamically. Instead of limiting both Cloud RDMA and TCP traffic to 100 Gbps bandwidth each, the gVNIC network interface can use all of the available bandwidth when the Cloud RDMA network interface is not being used. Similarly, the Cloud RDMA network interface can use all of the available bandwidth when the gVNIC network interface is not being used. When both network interface types are in use, the bandwidth is shared between Cloud RDMA traffic and TCP traffic.

Bandwidth summary

The following table illustrates the maximum possible bandwidth based on whether a packet is sent from (egress) or received by (ingress) a compute instance and the packet routing method.

Egress bandwidth limits

Routing within a VPC network
    • Primarily defined by a per-instance maximum egress bandwidth based on the sending instance's machine type and whether Tier_1 networking is enabled.
    • N2, N2D, C2, C2D, M3, and C4A VMs with Tier_1 networking support egress bandwidth limits up to 100 Gbps.
    • H3 VMs support VM-to-VM egress bandwidth limits up to 200 Gbps.
    • H4D instances support VM-to-VM egress bandwidth of up to 200 Gbps for Cloud RDMA and gVNIC combined.
    • X4, M4, A2, and G2 instances support egress bandwidth limits up to 100 Gbps.
    • G4 instances support egress bandwidth limits up to 400 Gbps.
    • A4X Max instances support egress bandwidth limits up to 3,600 Gbps.
    • A4X instances support egress bandwidth limits up to 2,000 Gbps.
    • A4 and A3 instances support egress bandwidth limits up to 3,600 Gbps.
    • C4, C4D, C3, C3D, and Z3 instances support egress bandwidth limits up to 200 Gbps with Tier_1 networking.
    For other factors, definitions, and scenarios, see Egress to destinations routable within a VPC network.

Routing outside a VPC network
    • Primarily defined by a per-instance maximum egress bandwidth based on the sending instance's machine type and whether Tier_1 networking is enabled.
    • A sending instance's maximum possible egress to a destination outside of its VPC network cannot exceed the following:
        • 3 Gbps per flow
        • When Tier_1 networking is enabled: 25 Gbps total
        • When Tier_1 networking isn't enabled or isn't supported, the following total bandwidth is available per machine series:
            • For G4 instances: 7 Gbps total for machine types with fewer than 48 vCPUs, and 28 Gbps total for machine types with 48 or more vCPUs
            • For the H4D and H3 machine series: 1 Gbps total
            • For machine series that support multiple physical NICs, such as A3, A4, A4X, and A4X Max instances: 7 Gbps per NIC
            • For all other machine series: 7 Gbps total
    For other factors, definitions, and caveats, see Egress to destinations outside of a VPC network.
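The external egress limits in the summary above can be sketched as a small lookup. This is an illustrative model, not an official calculator: the function names, argument names, and series labels are assumptions, the caller is assumed to pass tier1=True only for machine types that support Tier_1 networking, and treating "7 Gbps per NIC" as an aggregate of 7 Gbps times the number of physical NICs is an interpretation of the summary.

```python
# Illustrative model of the "routing outside a VPC network" egress limits.
# All names here are hypothetical; the numbers come from the summary above.

MULTI_NIC_SERIES = {"A3", "A4", "A4X", "A4X Max"}
PER_FLOW_EXTERNAL_GBPS = 3  # a single flow cannot exceed this rate


def external_egress_cap_gbps(series: str, vcpus: int, tier1: bool,
                             physical_nics: int = 1) -> int:
    """Return the modeled total cap (Gbps) for egress outside the VPC network."""
    if tier1:
        # Assumes the machine type actually supports Tier_1 networking.
        return 25
    if series == "G4":
        return 7 if vcpus < 48 else 28
    if series in {"H4D", "H3"}:
        return 1
    if series in MULTI_NIC_SERIES:
        # Assumption: the per-NIC limit adds up across physical NICs.
        return 7 * physical_nics
    return 7


def effective_external_egress_gbps(machine_max_gbps: int, series: str,
                                   vcpus: int, tier1: bool,
                                   physical_nics: int = 1) -> int:
    """The external cap never raises the per-instance maximum; take the lower."""
    cap = external_egress_cap_gbps(series, vcpus, tier1, physical_nics)
    return min(machine_max_gbps, cap)


if __name__ == "__main__":
    # A VM whose per-instance maximum egress bandwidth is 32 Gbps.
    print(effective_external_egress_gbps(32, "C3", 44, tier1=True))
    print(effective_external_egress_gbps(32, "C3", 44, tier1=False))
```

Keeping the per-instance maximum as an explicit input avoids hard-coding per-machine-type values, which vary within a machine series.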

Ingress bandwidth limits

Routing within a VPC network
    • Generally, ingress rates are similar to the egress rates for a machine type.
    • To get the highest possible ingress bandwidth, enable Tier_1 networking.
    • The size of your compute instance, the capacity of the server NIC, the traffic coming into other guest VMs running on the same host hardware, your guest OS network configuration, and the number of disk reads performed by your instance can all impact the ingress rate.
    • Google Cloud doesn't impose any additional limitations on ingress rates within a VPC network.
    For other factors, definitions, and scenarios, see Ingress to destinations routable within a VPC network.

Routing outside a VPC network
    • Google Cloud protects each compute instance by limiting ingress traffic routed outside a VPC network. The limit is the first of the following rates encountered:
        • 1,800,000 pps (packets per second)
        • 30 Gbps
    • For a machine series that supports multiple physical NICs, such as A3, A4, A4X, and A4X Max instances, the limit is the first of the following rates encountered:
        • 1,800,000 pps (packets per second) per physical NIC
        • 30 Gbps per physical NIC
    For other factors, definitions, and scenarios, see Ingress to destinations outside of a VPC network.
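Because the external ingress limit is whichever of the packet rate or the bit rate is reached first, the binding limit depends on packet size. The following sketch estimates which one applies for a given average packet size. The function names are hypothetical, and how packet bytes are counted against the 30 Gbps limit (for example, which headers are included) is an assumption.

```python
# Which external ingress limit is reached first for a given packet size?
# Limits are taken from the summary above; everything else is illustrative.

PPS_LIMIT = 1_800_000      # packets per second
GBPS_LIMIT = 30.0          # gigabits per second


def pps_bound_gbps(avg_packet_bytes: int) -> float:
    """Throughput (Gbps) reached when traffic arrives exactly at the pps limit."""
    return PPS_LIMIT * avg_packet_bytes * 8 / 1e9


def external_ingress_ceiling_gbps(avg_packet_bytes: int) -> float:
    """Modeled ceiling: the lower of the pps-derived rate and the Gbps limit."""
    return min(pps_bound_gbps(avg_packet_bytes), GBPS_LIMIT)


def limiting_factor(avg_packet_bytes: int) -> str:
    """Return 'pps' if the packet rate binds first, otherwise 'bandwidth'."""
    return "pps" if pps_bound_gbps(avg_packet_bytes) < GBPS_LIMIT else "bandwidth"


if __name__ == "__main__":
    for size in (64, 1500, 8896):
        print(size, limiting_factor(size), external_ingress_ceiling_gbps(size))
```

Under these assumptions, small packets reach the packet-rate limit well before 30 Gbps, while large packets reach the 30 Gbps limit first.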

Egress bandwidth

Google Cloud limits outbound (egress) bandwidth using per-instance maximum egress rates. These rates are based on the machine type of the compute instance that is sending the packet and whether the packet's destination is accessible using routes within a VPC network or routes outside of a VPC network. Outbound bandwidth includes packets emitted by all of the instance's NICs and data transferred to all Hyperdisk and Persistent Disk volumes connected to the instance.

Per-instance maximum egress bandwidth

Per-instance maximum egress bandwidth is generally 2 Gbps per vCPU, but there are some differences and exceptions, depending on the machine series.

The following table summarizes the maximum egress bandwidth for each machine series, for traffic routed within a VPC network. You can find the per-instance maximum egress bandwidth for every machine type on its specific machine family page.

Maximum per-instance egress limit

| Machine series | Standard | Tier_1 networking |
| --- | --- | --- |
| C4 and C4D | 100 Gbps | 200 Gbps |
| C4A | 50 Gbps | 100 Gbps |
| C3 and C3D | 100 Gbps | 200 Gbps |
| C2 and C2D | 32 Gbps | 100 Gbps |
| E2 | 16 Gbps | N/A |
| E2 shared-core | 2 Gbps | N/A |
| H4D and H3 | 200 Gbps | N/A |
| M4 | 100 Gbps | N/A |
| M3 | 32 Gbps | 100 Gbps |
| M2 | 32 Gbps on Intel Cascade Lake or later CPUs; 16 Gbps on other CPU platforms | N/A |
| M1 | 32 Gbps | N/A |
| N4, N4A, and N4D | 50 Gbps | N/A |
| N2 and N2D | 32 Gbps | 100 Gbps |
| N1 (excluding VMs with 1 vCPU) | 32 Gbps on Intel Skylake and later CPUs; 16 Gbps on earlier CPU platforms | N/A |
| N1 machine types with 1 vCPU | 2 Gbps | N/A |
| N1 shared-core (f1-micro and g1-small) | 1 Gbps | N/A |
| T2A and T2D | 32 Gbps | N/A |
| X4 | 100 Gbps | N/A |
| Z3 | 100 Gbps | 200 Gbps |

For network bandwidth information for accelerator-optimized machine series, see Networking and GPU machines.

Per-instance maximum egress bandwidth is not a guarantee. The actual egress bandwidth can be lowered according to factors such as the following non-exhaustive list:

To get the largest possible per-instance maximum egress bandwidth:

Egress to destinations routable within a VPC network

From the perspective of a sending instance and for destination IP addresses accessible by means of routes within a VPC network, Google Cloud limits outbound traffic using these rules:

Destinations routable within a VPC network include all of the following destinations, each of which is accessible from the perspective of the sending instance by a route whose next hop is not the default internet gateway:

The following list ranks traffic from sending instances to internal destinations, from highest possible bandwidth to lowest:

Egress to destinations outside of a VPC network

From the perspective of a sending instance and for destination IP addresses outside of a VPC network, Google Cloud limits outbound traffic to whichever of the following rates is reached first:

For example, even though a c3-standard-44 VM has a per-instance maximum egress bandwidth of 32 Gbps, the per-instance egress bandwidth from a c3-standard-44 VM to external destinations is either 25 Gbps or 7 Gbps, depending on whether Tier_1 networking is enabled.

Destinations outside of a VPC network include all of the following destinations, each of which is accessible by a route in the sending instance's VPC network whose next hop is the default internet gateway:

For details about which Google Cloud resources use what types of external IP addresses, see External IP addresses.

Ingress bandwidth

Google Cloud handles inbound (ingress) bandwidth depending on how the incoming packet is routed to a receiving compute instance.

Ingress to destinations routable within a VPC network

A receiving compute instance can handle as many incoming packets as its machine type, operating system, and other network conditions permit. Google Cloud doesn't implement any purposeful bandwidth restriction on incoming packets delivered to an instance if the incoming packet is delivered using routes within a VPC network:

Destinations for packets that are routed within a VPC network include:

Ingress to destinations outside of a VPC network

Google Cloud implements the following bandwidth limits for incoming packets delivered to a receiving instance using routes outside a VPC network. When load balancing is involved, the bandwidth limits are applied individually to each receiving instance.

For machine series that don't support multiple physical NICs, the applicable inbound bandwidth restriction applies collectively to all virtual network interfaces (vNICs). The limit is the first of the following rates encountered:

    • 1,800,000 packets per second
    • 30 Gbps

For machine series that support multiple physical NICs, such as A3, A4, A4X, and A4X Max, the applicable inbound bandwidth restriction applies individually to each physical NIC. The limit is the first of the following rates encountered:

    • 1,800,000 packets per second per physical NIC
    • 30 Gbps per physical NIC

Destinations for packets that are routed using routes outside of a VPC network include:

Use jumbo frames to maximize network bandwidth

To receive and send jumbo frames, configure the VPC network used by your compute instances: set the maximum transmission unit (MTU) to a larger value, up to 8896.

Higher MTU values increase the packet size and reduce the packet-header overhead, which increases payload data throughput.
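The effect of MTU on header overhead can be estimated with a short calculation. This is a rough sketch under stated assumptions: each packet carries a 20-byte IPv4 header and a 20-byte TCP header with no options, link-layer framing is ignored, and 1460 is used only as a smaller comparison value that isn't stated on this page. The function name is hypothetical.

```python
# Rough estimate of how much of each packet is payload at a given MTU.
# Assumption: 40 bytes of headers per packet (IPv4 + TCP, no options).

HEADER_BYTES = 40


def payload_fraction(mtu: int, header_bytes: int = HEADER_BYTES) -> float:
    """Fraction of each MTU-sized packet that carries application data."""
    if mtu <= header_bytes:
        raise ValueError("MTU must be larger than the header size")
    return (mtu - header_bytes) / mtu


def packets_per_gigabyte(mtu: int, header_bytes: int = HEADER_BYTES) -> float:
    """Packets needed to carry 10**9 payload bytes at this MTU."""
    return 1e9 / (mtu - header_bytes)


if __name__ == "__main__":
    for mtu in (1460, 8896):
        print(f"MTU {mtu}: {payload_fraction(mtu):.2%} payload, "
              f"{packets_per_gigabyte(mtu):,.0f} packets per GB")
```

The larger MTU carries the same payload in far fewer packets, which also reduces the per-packet processing work on the sender and receiver.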

You can use jumbo frames with the gVNIC driver version 1.3 or later on VM instances, or with the IDPF driver on bare metal instances. Not all Google Cloud public images include these drivers. For more information about operating system support for jumbo frames, see the Networking features tab on the Operating system details page.

If you are using an OS image that doesn't have full support for jumbo frames, you can manually install gVNIC driver version v1.3.0 or later. Google recommends installing the gVNIC driver version marked Latest to benefit from additional features and bug fixes. You can download the gVNIC drivers from GitHub.

To manually update the gVNIC driver version in your guest OS, see Use on non-supported operating systems.

Jumbo frames and GPU machines

For GPU machine types, use the recommended MTU settings for jumbo frames. For more information, see Recommended MTU settings for Jumbo frames.

Receive and transmit queues

Each NIC or vNIC for a compute instance is assigned a number of receive and transmit queues for processing packets from the network.

Default queue allocation

Unless you explicitly assign queue counts for NICs, you can model the algorithm Google Cloud uses to assign a fixed number of RX and TX queues per vNIC in this way:

Bare metal instances

For bare metal instances, there is only one vNIC, so the maximum queue count is 16.

VM instances that use the gVNIC network interface

For C4 instances, to improve performance, the following configurations use a fixed number of queues:

For the other machine series, the queue count depends on whether the machine series uses Titanium or not.

To finish the default queue count calculation:

  1. If the calculated number is less than 1, assign each vNIC one queue instead.
  2. Determine if the calculated number is greater than the maximum number of queues per vNIC, which is 16. If the calculated number is greater than 16, ignore the calculated number, and assign each vNIC 16 queues instead.

VM instances using the VirtIO network interface or a custom driver

Divide the number of vCPUs by the number of vNICs, and discard any remainder: ⌊number of vCPUs / number of vNICs⌋.

  1. If the calculated number is less than 1, assign each vNIC one queue instead.
  2. Determine if the calculated number is greater than the maximum number of queues per vNIC, which is 32. If the calculated number is greater than 32, ignore the calculated number, and assign each vNIC 32 queues instead.
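The VirtIO calculation above amounts to an integer division followed by clamping the result to the range 1 to 32. A minimal sketch follows; the function and parameter names are hypothetical. The same clamping, with a maximum of 16, matches the two finishing steps listed for gVNIC, but the initial gVNIC number is not modeled here.

```python
# Model of the default per-vNIC queue count for VirtIO or custom drivers:
# floor(vCPUs / vNICs), then clamp to [1, max_per_vnic].

def default_queues_per_vnic(vcpus: int, vnics: int, max_per_vnic: int = 32) -> int:
    """Return the modeled default queue count assigned to each vNIC."""
    if vcpus < 1 or vnics < 1:
        raise ValueError("vcpus and vnics must both be at least 1")
    calculated = vcpus // vnics          # discard any remainder
    return max(1, min(calculated, max_per_vnic))


if __name__ == "__main__":
    print(default_queues_per_vnic(16, 3))   # 16 // 3 = 5
    print(default_queues_per_vnic(4, 8))    # 4 // 8 = 0, raised to the minimum of 1
    print(default_queues_per_vnic(64, 1))   # 64 exceeds the cap, so 32
```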

H4D instances with Cloud RDMA

For H4D instances that use Cloud RDMA, each physical host runs a single compute instance. Thus the instance gets all the available queue pairs. H4D instances have 16 queues for the gVNIC network interface and 16 queues for the IRDMA network interface.

Examples

The following examples show how to calculate the default number of queues for a VM instance:

On Linux systems, you can use ethtool to configure a vNIC with fewer queues than the number of queues Google Cloud assigns per vNIC.
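As a sketch of the ethtool approach, the following helper builds an ethtool -L (set channels) command and runs it only when explicitly asked to. The device name eth0 and the queue counts are hypothetical. Whether a driver accepts separate rx and tx counts or a combined count depends on the driver, so check the output of ethtool -l for the interface first. Running the command typically requires root privileges.

```python
# Build (and optionally run) an ethtool command that lowers the number of
# RX/TX queues on an interface. Names and values here are illustrative.
import subprocess
from typing import List


def ethtool_set_channels_cmd(device: str, rx: int, tx: int) -> List[str]:
    """Return the argument list for 'ethtool -L DEVICE rx N tx N'."""
    if rx < 1 or tx < 1:
        raise ValueError("each vNIC needs at least one RX and one TX queue")
    return ["ethtool", "-L", device, "rx", str(rx), "tx", str(tx)]


def set_channels(device: str, rx: int, tx: int) -> None:
    """Apply the setting; raises CalledProcessError if ethtool fails."""
    subprocess.run(ethtool_set_channels_cmd(device, rx, tx), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(" ".join(ethtool_set_channels_cmd("eth0", 4, 4)))
```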

Queue counts when using Dynamic Network Interface

If you use Dynamic Network Interfaces with your network interfaces, the queue counts don't change. A Dynamic NIC doesn't have its own receive and transmit queues; it uses the same queues as the parent vNIC.

Custom queue allocation for VM instances

Instead of the default queue allocation, you can assign a custom queue count (total of both RX and TX) to each vNIC when you create a new compute instance by using the Compute Engine API.

The number of custom queues you specify must adhere to the following rules:

With queue oversubscription, you can assign up to 16 queues for each vNIC of a VM instance, even if the total queue count for the VM exceeds the number of vCPUs. To oversubscribe the custom queue count, the VM instance must satisfy the following conditions:

With queue oversubscription, the maximum queue count for the VM instance is 16 times the number of vNICs. So, if you have 6 vNICs configured for an instance with 30 vCPUs, you can configure a maximum of (16 * 6), or 96 custom queues for your VM instance.

Examples

It's also possible to assign a custom queue count for only some vNICs, letting Compute Engine assign queues to the remaining vNICs. The number of queues that you can assign per vNIC is still subject to the rules mentioned previously. You can model the feasibility of your configuration, and the number of queues that Compute Engine assigns to the remaining vNICs, with this process:

  1. Calculate the sum of queues for the vNICs that have a custom queue assignment.
  2. Subtract the sum of custom-assigned queues from the number of vCPUs. If the difference is less than the number of remaining vNICs for which Compute Engine must assign queues, then Compute Engine returns an error because each vNIC must have at least one queue.
  3. Divide the difference from the previous step by the number of vNICs without a custom queue count and discard any remainder:
     ⌊(number of vCPUs - sum of assigned queues) / (number of remaining vNICs)⌋
     This calculation always results in a whole number (not a fraction) that is at least equal to one because each vNIC must have at least one queue.
  4. Compute Engine assigns each remaining vNIC a queue count as follows:

Example

Assume you have a VM with 20 vCPUs and 6 NICs, and that you assigned 5 queues to nic0, 6 queues to nic1, 4 queues to nic2, and let Compute Engine assign queue counts for nic3, nic4, and nic5.

  1. The sum of the custom-assigned queues is 5 + 6 + 4 = 15.
  2. Compute Engine has 20 - 15 = 5 queues left to assign to the remaining three vNICs (nic3, nic4, nic5). The difference (5) is greater than the number of vNICs that don't have a custom queue count (3).
  3. The difference of 5 is divided by 3, and the remainder discarded. This leaves a value of 1.
  4. Because the calculated number (1) is less than the maximum number of queues per vNIC, the queue count for the remaining vNICs is set to 1.
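The process and example above can be modeled in a few lines. This is a sketch rather than the exact algorithm that Compute Engine runs: the function name is hypothetical, and capping the result at the per-vNIC maximum (16 is used as the default here) is an assumption inferred from the last step of the example.

```python
# Model of how queues are assigned to vNICs that have no custom queue count.
from typing import List


def remaining_vnic_queues(vcpus: int, custom_counts: List[int],
                          remaining_vnics: int, max_per_vnic: int = 16) -> int:
    """Return the modeled queue count for each vNIC without a custom count."""
    if remaining_vnics < 1:
        raise ValueError("there must be at least one remaining vNIC")
    left = vcpus - sum(custom_counts)            # steps 1 and 2
    if left < remaining_vnics:
        # Each vNIC must have at least one queue, so this is an error.
        raise ValueError("not enough queues left for the remaining vNICs")
    per_vnic = left // remaining_vnics           # step 3: discard remainder
    return min(per_vnic, max_per_vnic)           # step 4 (assumed cap)


if __name__ == "__main__":
    # 20 vCPUs; nic0, nic1, nic2 get 5, 6, and 4 queues; three vNICs remain.
    print(remaining_vnic_queues(20, [5, 6, 4], 3))
```

With the values from the example, 5 queues are left for three vNICs, so each remaining vNIC gets 1 queue.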

Configure custom queue counts

To create a compute instance that uses a custom queue count for one or more NICs or vNICs, complete the following steps.

In the following code examples, the VM is created with the network interface type set to GVNIC and per VM Tier_1 networking performance enabled. You can use these code examples to specify the maximum queue counts and queue oversubscription that is available for the supported machine types.

gcloud

  1. If you don't already have a VPC network with a subnet for each vNIC interface you plan to configure, create them.

  2. Use the gcloud compute instances create command to create the compute instance. Repeat the --network-interface flag for each vNIC that you want to configure for the instance, and include the queue-count option.

    gcloud compute instances create INSTANCE_NAME \
        --zone=ZONE \
        --machine-type=MACHINE_TYPE \
        --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
        --network-interface=network=NETWORK_NAME_1,subnet=SUBNET_1,nic-type=GVNIC,queue-count=QUEUE_SIZE_1 \
        --network-interface=network=NETWORK_NAME_2,subnet=SUBNET_2,nic-type=GVNIC,queue-count=QUEUE_SIZE_2

Replace the following:

Terraform

  1. If you don't already have a VPC network with a subnet for each vNIC interface you plan to configure, create them.
  2. Create a compute instance with specific queue counts for vNICs using the google_compute_instance resource. Repeat the network_interface block for each vNIC you want to configure for the compute instance, and include the queue_count argument.

Queue oversubscription instance

    resource "google_compute_instance" "INSTANCE_NAME" {
      project = "PROJECT_ID"
      boot_disk {
        auto_delete = true
        device_name = "DEVICE_NAME"
        initialize_params {
          image = "IMAGE_NAME"
          size  = DISK_SIZE
          type  = "DISK_TYPE"
        }
      }
      machine_type = "MACHINE_TYPE"
      name         = "INSTANCE_NAME"
      zone         = "ZONE"
      # Enable per VM Tier_1 networking performance.
      network_performance_config {
        total_egress_bandwidth_tier = "TIER_1"
      }
      # One network_interface block per vNIC, each with its own queue count.
      network_interface {
        nic_type           = "GVNIC"
        queue_count        = QUEUE_COUNT_1
        subnetwork_project = "PROJECT_ID"
        subnetwork         = "SUBNET_1"
      }
      network_interface {
        nic_type           = "GVNIC"
        queue_count        = QUEUE_COUNT_2
        subnetwork_project = "PROJECT_ID"
        subnetwork         = "SUBNET_2"
      }
      network_interface {
        nic_type           = "GVNIC"
        queue_count        = QUEUE_COUNT_3
        subnetwork_project = "PROJECT_ID"
        subnetwork         = "SUBNET_3"
      }
      network_interface {
        nic_type           = "GVNIC"
        queue_count        = QUEUE_COUNT_4
        subnetwork_project = "PROJECT_ID"
        subnetwork         = "SUBNET_4"
      }
    }

Replace the following:

REST

  1. If you don't already have a VPC network with a subnet for each vNIC interface you plan to configure, create them.
  2. Create a compute instance with specific queue counts for vNICs using the instances.insert method. To configure multiple network interfaces, add one object per interface to the networkInterfaces array.
    POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
    {
      "name": "INSTANCE_NAME",
      "machineType": "machineTypes/MACHINE_TYPE",
      "networkPerformanceConfig": {
        "totalEgressBandwidthTier": "TIER_1"
      },
      "networkInterfaces": [
        {
          "nicType": "GVNIC",
          "subnetwork": "regions/region/subnetworks/SUBNET_1",
          "queueCount": "QUEUE_COUNT_1"
        },
        {
          "nicType": "GVNIC",
          "subnetwork": "regions/region/subnetworks/SUBNET_2",
          "queueCount": "QUEUE_COUNT_2"
        }
      ]
    }
    Replace the following:
    • PROJECT_ID: ID of the project to create the compute instance in
    • ZONE: zone to create the compute instance in
    • INSTANCE_NAME: name of the new compute instance
    • MACHINE_TYPE: machine type, predefined or custom, for the new compute instance. To oversubscribe the queue count, you must use a machine type from the N2, N2D, C2, or C2D machine series that uses gVNIC and Tier_1 networking.
    • SUBNET_*: the name of the subnet that the network interface connects to
    • QUEUE_COUNT_*: number of queues for the vNIC, subject to the rules discussed in Custom queue allocation.
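When generating the request body programmatically, building it as a data structure and serializing it helps keep the JSON valid, with all interfaces in one networkInterfaces array. The following sketch only constructs and prints the body; it doesn't send the request or handle authentication. The helper name, instance name, machine type, region, subnet names, and queue counts are hypothetical placeholders.

```python
# Build the instances.insert request body with one entry per vNIC.
# Field names follow the REST example above; all values are illustrative.
import json
from typing import Dict, List, Tuple


def instance_body(name: str, machine_type: str,
                  interfaces: List[Tuple[str, int]]) -> Dict:
    """interfaces is a list of (subnetwork path, queue count) pairs."""
    return {
        "name": name,
        "machineType": f"machineTypes/{machine_type}",
        "networkPerformanceConfig": {"totalEgressBandwidthTier": "TIER_1"},
        "networkInterfaces": [
            {"nicType": "GVNIC", "subnetwork": subnet, "queueCount": count}
            for subnet, count in interfaces
        ],
    }


body = instance_body(
    "example-vm",
    "n2-standard-32",
    [
        ("regions/us-central1/subnetworks/subnet-a", 8),
        ("regions/us-central1/subnetworks/subnet-b", 8),
    ],
)

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```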

Queue allocations and changing the machine type

Compute instances are created with a default queue allocation, or you can assign a custom queue count to each virtual network interface card (vNIC) when you create a new compute instance by using the Compute Engine API. The default or custom vNIC queue assignments are only set when creating a compute instance. If your instance has vNICs that use default queue counts, you can change its machine type. If the machine type that you are changing to has a different number of vCPUs, the default queue counts for your instance are recalculated based on the new machine type.

If your VM has vNICs that use custom, non-default queue counts, then you can change the machine type by using the Google Cloud CLI or Compute Engine API to update the instance properties. The conversion succeeds if the resulting VM supports the same queue count per vNIC as the original instance. For VMs that use the VirtIO-Net interface and have a custom queue count that is higher than 16 per vNIC, you can't change the machine type to a third generation or later machine type, because they use only gVNIC. Instead, you can migrate your VM to a third generation or later machine type by following the instructions in Move your workload to a new compute instance.

What's next