RDMA network profiles
This page provides an overview of Remote Direct Memory Access (RDMA) network profiles in Google Cloud.
Overview
RDMA network profiles let you create Virtual Private Cloud (VPC) networks that provide low-latency, high-bandwidth RDMA communication between the memory or GPUs of Compute Engine instances that are created in the network.
RDMA network profiles are useful for running AI workloads. For more information about running AI workloads in Google Cloud, see AI Hypercomputer overview.
You can create the following types of VPC networks by using RDMA network profiles:
| VPC network type | Protocol | Network profile resource name | Instance types | NIC type |
|---|---|---|---|---|
| Falcon VPC network | RDMA over Falcon transport | ZONE-vpc-falcon | H4D | IRDMA |
| RoCE VPC network | RDMA over Converged Ethernet v2 (RoCE v2) | VM instances: ZONE-vpc-roce; bare metal instances: ZONE-vpc-roce-metal | VM: A3 Ultra, A4, A4X; bare metal: A4X Max | MRDMA |
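The resource names in the table follow a zone-prefixed pattern. As a minimal sketch, the following Python helper builds a profile name from a zone and a profile type. The function name `network_profile_name` and the example zone are illustrative assumptions, not part of any Google Cloud client library, and a name built this way only refers to a real profile if that profile is available in the zone.

```python
# Suffixes copied from the table above; the helper itself is hypothetical.
PROFILE_SUFFIXES = {
    "falcon": "vpc-falcon",          # Falcon VPC network (IRDMA NICs)
    "roce": "vpc-roce",              # RoCE VPC network for VM instances
    "roce-metal": "vpc-roce-metal",  # RoCE VPC network for bare metal instances
}


def network_profile_name(zone: str, profile_type: str) -> str:
    """Return the ZONE-vpc-... resource name for a profile type."""
    if profile_type not in PROFILE_SUFFIXES:
        raise ValueError(f"unknown profile type: {profile_type!r}")
    return f"{zone}-{PROFILE_SUFFIXES[profile_type]}"


# Example with an assumed zone name.
print(network_profile_name("us-central1-a", "roce"))
```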
Supported zones
RDMA network profiles are available in a limited set of zones. You can only create a Falcon VPC network or RoCE VPC network in a zone where the corresponding network profile is available.
To view the supported zones, see List network profiles.
Alternatively, you can view the supported zones for the machine type that you intend to create in the network. RDMA network profiles are available in the same zones as their supported machine types. For more information, see the following:
- For GPU machine types, see GPU locations.
- For other machine types, see Available regions and zones.
Specifications
VPC networks created with an RDMA network profile have the following specifications:
- Zonal constraint. Resources that use a VPC network with an RDMA network profile are limited to the zone of the RDMA network profile that was associated with the VPC network when the network was created. This zonal limit has the following effects:
  - All instances that have network interfaces in the VPC network must be created in the zone that matches the zone of the RDMA network profile used by the VPC network.
  - All subnets created in the VPC network must be located in the region that contains the zone of the RDMA network profile used by the VPC network.
- RDMA network interfaces only. A VPC network with an RDMA network profile supports attachments only from specific network interfaces (NICs):
  - Falcon VPC networks support only IRDMA NICs.
  - RoCE VPC networks support only MRDMA NICs.

  All non-RDMA NICs of an instance must be attached to a regular VPC network.
- 8896 byte MTU. For best performance, we recommend a maximum transmission unit (MTU) of 8896 bytes for VPC networks with an RDMA network profile. This allows the RDMA driver in the instance's guest operating system to use smaller MTUs if needed.

  If you create a VPC network with an RDMA network profile, then 8896 bytes is the default MTU.
- Firewall differences. See the following information about firewall differences in VPC networks with an RDMA network profile:
  - VPC networks with an RDMA network profile use the following implied firewall rules, which are different from the implied firewall rules used by regular VPC networks:
    * Implied allow egress
    * Implied allow ingress
  - Cloud NGFW support depends on the type of VPC network:
    * For RoCE VPC networks, see the following:
      * RoCE VPC networks for VM instances support only regional network firewall policies that have a RoCE firewall policy type. The set of parameters for rules within a supported regional network firewall policy is limited. For more information, see Cloud NGFW for RoCE VPC networks.
      * RoCE VPC networks for bare metal instances don't support configuring Cloud NGFW rules or policies.
    * Falcon VPC networks don't support configuring Cloud NGFW rules or policies.
- No Connectivity Tests support. Connectivity Tests doesn't support VPC networks with an RDMA network profile.
- Other VPC features. VPC networks with an RDMA network profile support a limited set of other VPC features. For more information, see the Supported and unsupported features section.
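The zonal constraint in the specifications can be sketched as a pre-flight check. The following Python example is only an illustration: the function names are hypothetical, and it assumes that zone names take the form REGION-LETTER (for example, `us-central1-a`), so that the region is the zone name without its final segment.

```python
def region_of(zone: str) -> str:
    """Return the region part of a zone name such as us-central1-a.

    Assumption: the region is everything before the last hyphen.
    """
    region, _, _ = zone.rpartition("-")
    return region


def placement_problems(profile_zone: str, instance_zone: str, subnet_region: str) -> list[str]:
    """List violations of the zonal constraint for one planned instance and subnet."""
    problems = []
    if instance_zone != profile_zone:
        problems.append(
            f"instance zone {instance_zone} must match profile zone {profile_zone}"
        )
    if subnet_region != region_of(profile_zone):
        problems.append(
            f"subnet region {subnet_region} must be {region_of(profile_zone)}"
        )
    return problems


# Example with assumed zone and region names.
for problem in placement_problems("us-central1-a", "us-central1-b", "us-east1"):
    print(problem)
```

An empty list means that the planned instance and subnet satisfy the zonal constraint; the check doesn't confirm that the profile is available in that zone.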
Supported and unsupported features
The following sections describe which VPC features are supported and unsupported for each type of VPC network that you can create by using an RDMA network profile:
- Falcon VPC networks (falcon)
- RoCE VPC networks for VM instances (roce)
- RoCE VPC networks for bare metal instances (roce-metal)
Falcon VPC networks (falcon)
The following table lists the features that are supported by Falcon VPC networks, which are created by using the falcon network profile.
| Feature | Supported | Network profile property | Network profile property value | Details |
|---|---|---|---|---|
| RDMA NICs | Yes | interfaceTypes | IRDMA | Falcon VPC networks support only IRDMA NICs. Other NIC types, such as MRDMA, GVNIC, and VIRTIO_NET, aren't supported. |
| Multi-NIC in the same network | Yes | allowMultiNicInSameNetwork | MULTI_NIC_IN_SAME_NETWORK_ALLOWED | Falcon VPC networks support multi-NIC instances, allowing two or more RDMA NICs of the same instance to be in the same VPC network. Each NIC must attach to a unique subnet in the VPC network. |
| IPv4-only subnets | Yes | subnetworkStackTypes | SUBNET_STACK_TYPE_IPV4_ONLY | Falcon VPC networks support IPv4-only subnets, including the same valid IPv4 ranges as regular VPC networks. Falcon VPC networks don't support dual-stack or IPv6-only subnets. For more information, see Types of subnets. |
| PRIVATE subnet purpose | Yes | subnetworkPurposes | SUBNET_PURPOSE_PRIVATE | Falcon VPC networks support regular subnets, which have a purpose attribute value of PRIVATE. Falcon VPC networks don't support Private Service Connect subnets, proxy-only subnets, or Private NAT subnets. For more information, see Purposes of subnets. |
| GCE_ENDPOINT address purpose | Yes | addressPurposes | GCE_ENDPOINT | Falcon VPC networks support IP addresses with a purpose attribute value of GCE_ENDPOINT, which is used by internal IP addresses of instance NICs. Falcon VPC networks don't support special purpose IP addresses, such as the SHARED_LOADBALANCER_VIP purpose. For more information, see the addresses resource reference. |
| Dynamic Network Interfaces | No | allowSubInterfaces | SUBINTERFACES_BLOCKED | Falcon VPC networks don't support Dynamic NICs. |
| Attachments from nic0 | No | allowDefaultNicAttachment | DEFAULT_NIC_ATTACHMENT_BLOCKED | Falcon VPC networks don't support attaching the nic0 network interface of an instance to the network. Each RDMA NIC attached to the VPC network must not be nic0. |
| External IP addresses for instances | No | allowExternalIpAccess | EXTERNAL_IP_ACCESS_BLOCKED | Falcon VPC networks don't support assigning external IP addresses to RDMA NICs. Consequently, RDMA NICs don't have internet access. |
| Alias IP ranges | No | allowAliasIpRanges | ALIAS_IP_RANGE_BLOCKED | Falcon VPC networks don't support assigning alias IP ranges to RDMA NICs. |
| IP forwarding | No | allowIpForwarding | IP_FORWARDING_BLOCKED | Falcon VPC networks don't support IP forwarding. |
| Instance network migration | No | allowNetworkMigration | NETWORK_MIGRATION_BLOCKED | Falcon VPC networks don't support migrating instance NICs between networks. |
| Auto mode | No | allowAutoModeSubnet | AUTO_MODE_SUBNET_BLOCKED | Falcon VPC networks can't be auto mode networks. For more information, see subnet creation mode. |
| VPC Network Peering | No | allowVpcPeering | VPC_PEERING_BLOCKED | Falcon VPC networks don't support connecting to other VPC networks using VPC Network Peering. Consequently, Falcon VPC networks don't support connecting to services using private services access. |
| Static routes | No | allowStaticRoutes | STATIC_ROUTES_BLOCKED | Falcon VPC networks don't support static routes. |
| Packet Mirroring | No | allowPacketMirroring | PACKET_MIRRORING_BLOCKED | Falcon VPC networks don't support Packet Mirroring. |
| Cloud NAT | No | allowCloudNat | CLOUD_NAT_BLOCKED | Falcon VPC networks don't support Cloud NAT. |
| Cloud Router | No | allowCloudRouter | CLOUD_ROUTER_BLOCKED | Falcon VPC networks don't support Cloud Routers and dynamic routes. |
| Cloud Interconnect | No | allowInterconnect | INTERCONNECT_BLOCKED | Falcon VPC networks don't support Cloud Interconnect VLAN attachments. |
| Cloud VPN | No | allowVpn | VPN_BLOCKED | Falcon VPC networks don't support Cloud VPN tunnels. |
| Network Connectivity Center | No | allowNcc | NCC_BLOCKED | Falcon VPC networks don't support NCC. You can't add a Falcon VPC network as a VPC spoke to an NCC hub. |
| Cloud Load Balancing | No | allowLoadBalancing | LOAD_BALANCING_BLOCKED | Falcon VPC networks don't support Cloud Load Balancing. Consequently, Falcon VPC networks don't support load balancer features, including Google Cloud Armor. |
| Private Google Access | No | allowPrivateGoogleAccess | PRIVATE_GOOGLE_ACCESS_BLOCKED | Falcon VPC networks don't support Private Google Access. |
| Private Service Connect | No | allowPsc | PSC_BLOCKED | Falcon VPC networks don't support Private Service Connect. |
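To show how a few of these properties constrain a NIC, the following Python sketch checks a planned NIC against a hand-copied subset of the Falcon table. The `FALCON_RULES` dictionary, the `nic_problems` function, and the keys of the `nic` dictionary are all hypothetical names chosen for this example; they don't mirror the shape of any API response.

```python
# Hand-copied subset of the Falcon table above (illustrative, not API output).
FALCON_RULES = {
    "interfaceTypes": ("IRDMA",),
    "allowDefaultNicAttachment": "DEFAULT_NIC_ATTACHMENT_BLOCKED",
    "allowExternalIpAccess": "EXTERNAL_IP_ACCESS_BLOCKED",
    "allowAliasIpRanges": "ALIAS_IP_RANGE_BLOCKED",
}


def nic_problems(nic: dict, rules: dict = FALCON_RULES) -> list[str]:
    """Return the reasons why a planned NIC can't attach to the network."""
    problems = []
    if nic["nic_type"] not in rules["interfaceTypes"]:
        problems.append(f"NIC type {nic['nic_type']} isn't supported")
    if nic["name"] == "nic0" and rules["allowDefaultNicAttachment"].endswith("_BLOCKED"):
        problems.append("nic0 can't attach to this network")
    if nic.get("external_ip") and rules["allowExternalIpAccess"].endswith("_BLOCKED"):
        problems.append("external IP addresses aren't supported")
    if nic.get("alias_ip_ranges") and rules["allowAliasIpRanges"].endswith("_BLOCKED"):
        problems.append("alias IP ranges aren't supported")
    return problems


# A second interface of type IRDMA with no extras passes every check.
print(nic_problems({"name": "nic1", "nic_type": "IRDMA"}))
```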
RoCE VPC networks for VM instances (roce)
The following table lists the features that are supported by RoCE VPC networks for VM instances, which are created by using the roce network profile.
| Feature | Supported | Network profile property | Network profile property value | Details |
|---|---|---|---|---|
| RDMA NICs | Yes | interfaceTypes | MRDMA | RoCE VPC networks for VM instances support only MRDMA NICs. Other NIC types, such as IRDMA, GVNIC, and VIRTIO_NET, aren't supported. |
| Multi-NIC in the same network | Yes | allowMultiNicInSameNetwork | MULTI_NIC_IN_SAME_NETWORK_ALLOWED | RoCE VPC networks for VM instances support multi-NIC instances, allowing two or more RDMA NICs of the same instance to be in the same VPC network. Each NIC must attach to a unique subnet in the VPC network. See also RoCE VPC network multi-NIC considerations. |
| IPv4-only subnets | Yes | subnetworkStackTypes | SUBNET_STACK_TYPE_IPV4_ONLY | RoCE VPC networks for VM instances support IPv4-only subnets, including the same valid IPv4 ranges as regular VPC networks. RoCE VPC networks for VM instances don't support dual-stack or IPv6-only subnets. For more information, see Types of subnets. |
| PRIVATE subnet purpose | Yes | subnetworkPurposes | SUBNET_PURPOSE_PRIVATE | RoCE VPC networks for VM instances support regular subnets, which have a purpose attribute value of PRIVATE. RoCE VPC networks for VM instances don't support Private Service Connect subnets, proxy-only subnets, or Private NAT subnets. For more information, see Purposes of subnets. |
| GCE_ENDPOINT address purpose | Yes | addressPurposes | GCE_ENDPOINT | RoCE VPC networks for VM instances support IP addresses with a purpose attribute value of GCE_ENDPOINT, which is used by internal IP addresses of instance NICs. RoCE VPC networks for VM instances don't support special purpose IP addresses, such as the SHARED_LOADBALANCER_VIP purpose. For more information, see the addresses resource reference. |
| Attachments from nic0 | No | allowDefaultNicAttachment | DEFAULT_NIC_ATTACHMENT_BLOCKED | RoCE VPC networks for VM instances don't support attaching the nic0 network interface of an instance to the network. Each RDMA NIC attached to the VPC network must not be nic0. |
| External IP addresses for instances | No | allowExternalIpAccess | EXTERNAL_IP_ACCESS_BLOCKED | RoCE VPC networks for VM instances don't support assigning external IP addresses to RDMA NICs. Consequently, RDMA NICs don't have internet access. |
| Dynamic Network Interfaces | No | allowSubInterfaces | SUBINTERFACES_BLOCKED | RoCE VPC networks for VM instances don't support Dynamic NICs. |
| Alias IP ranges | No | allowAliasIpRanges | ALIAS_IP_RANGE_BLOCKED | RoCE VPC networks for VM instances don't support assigning alias IP ranges to RDMA NICs. |
| IP forwarding | No | allowIpForwarding | IP_FORWARDING_BLOCKED | RoCE VPC networks for VM instances don't support IP forwarding. |
| Instance network migration | No | allowNetworkMigration | NETWORK_MIGRATION_BLOCKED | RoCE VPC networks for VM instances don't support migrating instance NICs between networks. |
| Auto mode | No | allowAutoModeSubnet | AUTO_MODE_SUBNET_BLOCKED | RoCE VPC networks for VM instances can't be auto mode networks. For more information, see subnet creation mode. |
| VPC Network Peering | No | allowVpcPeering | VPC_PEERING_BLOCKED | RoCE VPC networks for VM instances don't support connecting to other VPC networks using VPC Network Peering. Consequently, RoCE VPC networks for VM instances don't support connecting to services using private services access. |
| Static routes | No | allowStaticRoutes | STATIC_ROUTES_BLOCKED | RoCE VPC networks for VM instances don't support static routes. |
| Packet Mirroring | No | allowPacketMirroring | PACKET_MIRRORING_BLOCKED | RoCE VPC networks for VM instances don't support Packet Mirroring. |
| Cloud NAT | No | allowCloudNat | CLOUD_NAT_BLOCKED | RoCE VPC networks for VM instances don't support Cloud NAT. |
| Cloud Router | No | allowCloudRouter | CLOUD_ROUTER_BLOCKED | RoCE VPC networks for VM instances don't support Cloud Routers and dynamic routes. |
| Cloud Interconnect | No | allowInterconnect | INTERCONNECT_BLOCKED | RoCE VPC networks for VM instances don't support Cloud Interconnect VLAN attachments. |
| Cloud VPN | No | allowVpn | VPN_BLOCKED | RoCE VPC networks for VM instances don't support Cloud VPN tunnels. |
| Network Connectivity Center | No | allowNcc | NCC_BLOCKED | RoCE VPC networks for VM instances don't support NCC. You can't add a VPC network with an RDMA network profile as a VPC spoke to an NCC hub. |
| Cloud Load Balancing | No | allowLoadBalancing | LOAD_BALANCING_BLOCKED | RoCE VPC networks for VM instances don't support Cloud Load Balancing. Consequently, RoCE VPC networks for VM instances don't support load balancer features, including Google Cloud Armor. |
| Private Google Access | No | allowPrivateGoogleAccess | PRIVATE_GOOGLE_ACCESS_BLOCKED | RoCE VPC networks for VM instances don't support Private Google Access. |
| Private Service Connect | No | allowPsc | PSC_BLOCKED | RoCE VPC networks for VM instances don't support Private Service Connect. |
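Most property values in this table end in `_ALLOWED` or `_BLOCKED`, so a script can sort features by that suffix. The following Python sketch assumes a hand-copied subset of the roce profile properties; the `ROCE_PROPERTIES` dictionary and the `blocked_features` function are illustrative names, not output from any API.

```python
# Hand-copied subset of the roce table above (illustrative only).
ROCE_PROPERTIES = {
    "allowMultiNicInSameNetwork": "MULTI_NIC_IN_SAME_NETWORK_ALLOWED",
    "allowExternalIpAccess": "EXTERNAL_IP_ACCESS_BLOCKED",
    "allowVpcPeering": "VPC_PEERING_BLOCKED",
    "allowCloudNat": "CLOUD_NAT_BLOCKED",
}


def blocked_features(properties: dict[str, str]) -> list[str]:
    """Return the property names whose value marks the feature as blocked."""
    return sorted(
        name for name, value in properties.items() if value.endswith("_BLOCKED")
    )


print(blocked_features(ROCE_PROPERTIES))
```

Properties whose values don't use either suffix, such as interfaceTypes or subnetworkStackTypes, list the supported options directly and need separate handling.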
RoCE VPC networks for bare metal instances (roce-metal)
The following table lists the features that are supported by RoCE VPC networks for bare metal instances, which are created by using the roce-metal network profile.
In addition to the properties described in the table, see the following limitations that apply to GPU bare metal instances:
- GPU bare metal instances can't use the following UDP ports in RoCE VPC networks because they are reserved by Google Cloud: 3882-3895 and 13882-13895.
- GPU bare metal instances can't use the following IPv6 link-local address range because it is reserved by Google Cloud: fe80::badd:c0de:cafe:0/120.
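A workload can check its own port and address choices against the reserved ranges listed above before it starts. The following Python sketch uses only the standard `ipaddress` module; the function names are hypothetical, and the port ranges are treated as inclusive on both ends, which is an assumption about how the ranges are written.

```python
import ipaddress

# Reserved values copied from the limitations above.
# Assumption: both port ranges are inclusive, so range() ends one past the last port.
RESERVED_UDP_PORT_RANGES = (range(3882, 3896), range(13882, 13896))
RESERVED_LINK_LOCAL = ipaddress.ip_network("fe80::badd:c0de:cafe:0/120")


def udp_port_is_reserved(port: int) -> bool:
    """Return True if the UDP port falls in a reserved range."""
    return any(port in ports for ports in RESERVED_UDP_PORT_RANGES)


def address_is_reserved(address: str) -> bool:
    """Return True if the address falls in the reserved link-local range."""
    return ipaddress.ip_address(address) in RESERVED_LINK_LOCAL


print(udp_port_is_reserved(3890))
print(address_is_reserved("fe80::badd:c0de:cafe:7f"))
```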
| Feature | Supported | Network profile property | Network profile property value | Details |
|---|---|---|---|---|
| RDMA NICs | Yes | interfaceTypes | MRDMA | RoCE VPC networks for bare metal instances support only MRDMA NICs. Other NIC types, such as IRDMA, GVNIC, and VIRTIO_NET, aren't supported. |
| Multi-NIC in the same network | Yes | allowMultiNicInSameNetwork | MULTI_NIC_IN_SAME_NETWORK_ALLOWED | RoCE VPC networks for bare metal instances support multi-NIC instances, allowing two or more RDMA NICs of the same instance to be in the same VPC network. See also RoCE VPC network multi-NIC considerations. |
| Multi-NIC in the same subnet | Yes | allowMultiNicInSameSubnetwork | MULTI_NIC_IN_SAME_SUBNETWORK_ALLOWED | RoCE VPC networks for bare metal instances support multi-NIC instances, allowing two or more RDMA NICs of the same instance to be in the same subnet of a VPC network. See also RoCE VPC network multi-NIC considerations. |
| Predefined network ULA IPv6 range | Yes | predefinedNetworkInternalIpv6Range | ULA_IPV6_RANGE | RoCE VPC networks for bare metal instances support internal IPv6 and have a predefined /48 unique local address (ULA) range. The predefined range is provided by Google Cloud for each network profile instance and isn't configurable. |
| One IPv6-only subnet | Yes | subnetworkStackTypes | SUBNET_STACK_TYPE_IPV6_ONLY | RoCE VPC networks for bare metal instances support a single, automatically provided IPv6-only subnet that uses a predefined IPv6 subnet range and naming convention as described in the following table row. RoCE VPC networks for bare metal instances don't support dual-stack or IPv4-only subnets, or the creation of additional IPv6-only subnets. For more information, see Types of subnets. |
| Predefined IPv6 subnet range and name prefix | Yes | predefinedSubnetworkRanges | ipv6Range: ULA_IPV6_RANGE; namePrefix: default-subnet-1 | RoCE VPC networks for bare metal instances have a single subnet that is automatically provided and has the following properties: the subnet uses a predefined /48 IPv6 range that exactly matches the network's /48 range, and the subnet uses the naming format default-subnet-1-NETWORK_NAME, where NETWORK_NAME is the name that you provided when you created the network. |
| PRIVATE subnet purpose | Yes | subnetworkPurposes | SUBNET_PURPOSE_PRIVATE | RoCE VPC networks for bare metal instances support regular subnets, which have a purpose attribute value of PRIVATE. RoCE VPC networks for bare metal instances don't support Private Service Connect subnets, proxy-only subnets, or Private NAT subnets. For more information, see Purposes of subnets. |
| GCE_ENDPOINT address purpose | Yes | addressPurposes | GCE_ENDPOINT | RoCE VPC networks for bare metal instances support IP addresses with a purpose attribute value of GCE_ENDPOINT, which is used by internal IP addresses of instance NICs. RoCE VPC networks for bare metal instances don't support special purpose IP addresses, such as the SHARED_LOADBALANCER_VIP purpose. For more information, see the addresses resource reference. |
| Subnet creation | No | allowSubnetworkCreation | SUBNETWORK_CREATION_BLOCKED | RoCE VPC networks for bare metal instances don't support creating subnets in the network. |
| Attachments from nic0 | No | allowDefaultNicAttachment | DEFAULT_NIC_ATTACHMENT_BLOCKED | RoCE VPC networks for bare metal instances don't support attaching the nic0 network interface of an instance to the network. Each RDMA NIC attached to the VPC network must not be nic0. |
| Static IP addresses for instances | No | allowAddressCreation | ADDRESS_CREATION_BLOCKED | RoCE VPC networks for bare metal instances don't support reserving static internal or external IP addresses, or configuring static internal or external IP addresses for instances. Consequently, instances in the network can use only ephemeral internal IP addresses. |
| External IP addresses for instances | No | allowExternalIpAccess | EXTERNAL_IP_ACCESS_BLOCKED | RoCE VPC networks for bare metal instances don't support assigning external IP addresses, either static or ephemeral, to RDMA NICs. Consequently, RDMA NICs don't have internet access. |
| Dynamic Network Interfaces | No | allowSubInterfaces | SUBINTERFACES_BLOCKED | RoCE VPC networks for bare metal instances don't support Dynamic NICs. |
| Alias IP ranges | No | allowAliasIpRanges | ALIAS_IP_RANGE_BLOCKED | RoCE VPC networks for bare metal instances don't support assigning alias IP ranges to RDMA NICs. |
| IP forwarding | No | allowIpForwarding | IP_FORWARDING_BLOCKED | RoCE VPC networks for bare metal instances don't support IP forwarding. |
| Instance network migration | No | allowNetworkMigration | NETWORK_MIGRATION_BLOCKED | RoCE VPC networks for bare metal instances don't support migrating instance NICs between networks. |
| Auto mode | No | allowAutoModeSubnet | AUTO_MODE_SUBNET_BLOCKED | RoCE VPC networks for bare metal instances can't be auto mode networks. For more information, see subnet creation mode. |
| VPC Network Peering | No | allowVpcPeering | VPC_PEERING_BLOCKED | RoCE VPC networks for bare metal instances don't support connecting to other VPC networks using VPC Network Peering. Consequently, RoCE VPC networks for bare metal instances don't support connecting to services using private services access. |
| Static routes | No | allowStaticRoutes | STATIC_ROUTES_BLOCKED | RoCE VPC networks for bare metal instances don't support static routes. |
| VPC firewall rules | No | allowVpcFirewallRules | VPC_FIREWALL_RULES_BLOCKED | RoCE VPC networks for bare metal instances don't support VPC firewall rules. |
| Firewall policies | No | allowFirewallPolicy | FIREWALL_POLICY_BLOCKED | RoCE VPC networks for bare metal instances don't support firewall policies. |
| Packet Mirroring | No | allowPacketMirroring | PACKET_MIRRORING_BLOCKED | RoCE VPC networks for bare metal instances don't support Packet Mirroring. |
| Cloud NAT | No | allowCloudNat | CLOUD_NAT_BLOCKED | RoCE VPC networks for bare metal instances don't support Cloud NAT. |
| Cloud Router | No | allowCloudRouter | CLOUD_ROUTER_BLOCKED | RoCE VPC networks for bare metal instances don't support Cloud Routers and dynamic routes. |
| Cloud Interconnect | No | allowInterconnect | INTERCONNECT_BLOCKED | RoCE VPC networks for bare metal instances don't support Cloud Interconnect VLAN attachments. |
| Cloud VPN | No | allowVpn | VPN_BLOCKED | RoCE VPC networks for bare metal instances don't support Cloud VPN tunnels. |
| Network Connectivity Center | No | allowNcc | NCC_BLOCKED | RoCE VPC networks for bare metal instances don't support NCC. You can't add a RoCE VPC network as a VPC spoke to an NCC hub. |
| Cloud Load Balancing | No | allowLoadBalancing | LOAD_BALANCING_BLOCKED | RoCE VPC networks for bare metal instances don't support Cloud Load Balancing. Consequently, RoCE VPC networks for bare metal instances don't support load balancer features, including Google Cloud Armor. |
| Private Google Access | No | allowPrivateGoogleAccess | PRIVATE_GOOGLE_ACCESS_BLOCKED | RoCE VPC networks for bare metal instances don't support Private Google Access. |
| Private Service Connect | No | allowPsc | PSC_BLOCKED | RoCE VPC networks for bare metal instances don't support Private Service Connect. |
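The predefined subnet described in the table has a predictable name and range, which a script can derive or verify. The following Python sketch is illustrative only: the function names are hypothetical, and the network name and ULA range in the example are made-up values, because the real /48 range is assigned by Google Cloud.

```python
import ipaddress


def default_subnet_name(network_name: str) -> str:
    """Return the automatically provided subnet's name for a network name."""
    return f"default-subnet-1-{network_name}"


def subnet_matches_network(network_range: str, subnet_range: str) -> bool:
    """Return True if both ranges are the same /48 IPv6 range."""
    network = ipaddress.ip_network(network_range)
    subnet = ipaddress.ip_network(subnet_range)
    return network.version == 6 and network.prefixlen == 48 and network == subnet


# Example with an assumed network name and a made-up ULA range.
print(default_subnet_name("my-roce-metal-net"))
print(subnet_matches_network("fd20:1234:5678::/48", "fd20:1234:5678::/48"))
```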
Multi-NIC considerations for RoCE VPC networks
To support workloads that benefit from cross-rail GPU-to-GPU communication, RoCE VPC networks support instances that have multiple MRDMA NICs in the network. Placing two or more MRDMA NICs in the same RoCE VPC network might affect network performance, including increased latency. MRDMA NICs use NCCL. NCCL attempts to align all network transfers, even for cross-rail communication. For example, it uses PXN to copy data through NVLink to a rail-aligned GPU before transferring it over the network.