Troubleshoot configuration

This guide can help you solve common issues with Cloud NAT.

Common issues

VMs can reach the internet unexpectedly, without Cloud NAT

If your virtual machine (VM) instances or container instances can reach the internet without Cloud NAT, but you don't want them to, check for the following issues:

No logs are generated

Certain logs are excluded

Packets dropped with reason: out of resources

If you see packet loss from VMs that use Cloud NAT, this might be because there are not enough available NAT source IP address and source port tuples for the VM to use at the time of the packet loss (port exhaustion). A five-tuple (NAT source IP address, source port, and destination 3-tuple) cannot be reused within the TCP TIME_WAIT timeout.
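As a rough illustration of why short-lived connections to one destination exhaust ports quickly, the following sketch estimates an upper bound on new connections per minute. It assumes every connection closes immediately and then holds its 5-tuple for the full TCP TIME_WAIT timeout. The function name and the 120-second timeout are illustrative assumptions; check the timeout configured on your gateway.

```shell
# Rough upper bound on new TCP connections per minute that one VM can open
# to a single destination 3-tuple (IP address, port, protocol).
# $1: NAT source IP address and source port tuples allocated to the VM
# $2: TCP TIME_WAIT timeout in seconds (120 below is an assumed value)
max_new_connections_per_minute() {
  echo $(( $1 * 60 / $2 ))
}

max_new_connections_per_minute 64 120   # prints 32
```

Under these assumptions, a VM with 64 ports can sustain only about 32 new connections per minute to the same destination before packets start to drop. Real workloads hold connections open for some time, so the practical limit is usually lower.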

If there aren't enough available NAT tuples, the dropped_sent_packets_count reason is OUT_OF_RESOURCES. For more information about metrics, see Using VM instance metrics.

See Reduce your port usage for ways to reduce port usage.

If you use dynamic port allocation, see the following section for ways to reduce packet drops when dynamic port allocation is used.

Packets dropped when dynamic port allocation is configured

Dynamic port allocation detects when a VM is close to being out of ports, and doubles the number of ports that are allocated to the VM. This helps ensure that ports aren't wasted, but can result in dropped packets while the number of allocated ports is increasing.
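The doubling behavior described above can be sketched as a simple loop. This is a simplified model, not the gateway's actual algorithm: it assumes the allocation doubles at each step and never exceeds the configured maximum. The function name and sample values are hypothetical.

```shell
# Count how many doublings take a VM from its minimum port allocation to at
# least the number of ports it needs, without exceeding the maximum.
# $1: minimum ports per VM, $2: ports needed, $3: maximum ports per VM
doublings_needed() {
  ports=$1
  steps=0
  while [ "$ports" -lt "$2" ] && [ "$ports" -lt "$3" ]; do
    ports=$(( ports * 2 ))
    # Cap the allocation at the configured maximum.
    if [ "$ports" -gt "$3" ]; then
      ports=$3
    fi
    steps=$(( steps + 1 ))
  done
  echo "$steps doublings, $ports ports allocated"
}

doublings_needed 64 1000 4096   # prints "4 doublings, 1024 ports allocated"
```

Each doubling is a window in which packets can be dropped, so a very low minimum combined with bursty traffic means more such windows before the VM has the ports it needs.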

To reduce the number of dropped packets, consider the following:

If needed, you can increase the net.ipv4.tcp_syn_retries setting on the VM, replacing NUM with the number of SYN retries that you want:
sudo sysctl -w net.ipv4.tcp_syn_retries=NUM
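To get a feel for how this setting affects the time a connection attempt keeps retrying, the following sketch sums the retry intervals. It assumes a Linux VM with a 1-second initial retransmission timeout that doubles after every unanswered SYN; the function name is hypothetical.

```shell
# Approximate seconds before a TCP connection attempt gives up, assuming a
# 1-second initial retransmission timeout that doubles after each unanswered SYN.
# $1: value of net.ipv4.tcp_syn_retries
syn_timeout_seconds() {
  total=0
  rto=1
  attempt=0
  # One wait for the initial SYN plus one for each retry.
  while [ "$attempt" -le "$1" ]; do
    total=$(( total + rto ))
    rto=$(( rto * 2 ))
    attempt=$(( attempt + 1 ))
  done
  echo "$total"
}

syn_timeout_seconds 6   # prints 127
```

You can read the current value with `sysctl net.ipv4.tcp_syn_retries`. A higher value gives the connection attempt more time to succeed while additional ports are being allocated, at the cost of slower failure when the destination is really unreachable.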

Packets dropped with reason: endpoint independence conflict

If you see packet loss from VMs that use Public NAT, and you have Endpoint-Independent Mapping turned on, the packet loss might be caused by an endpoint independent conflict. If it is, the dropped_sent_packets_count reason is ENDPOINT_INDEPENDENCE_CONFLICT. For more information about metrics, see Using VM instance metrics.

You can reduce the chances of endpoint independent conflicts by using the following techniques:

Dropped received packets

A Cloud NAT gateway maintains a connection tracking table to store active connection details and mappings of VM IP addresses and ports to NAT IP addresses and ports. A Cloud NAT gateway drops an ingress packet if the connection tracking table doesn't contain any entry for the connection.
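The lookup described above can be pictured with a toy model. This sketch is only an illustration of the idea, not how Cloud NAT is implemented: it keeps entries as text lines, and all names and addresses in it are hypothetical.

```shell
# Toy connection tracking table kept as "key value" lines in a variable.
conntrack=""

# Record an egress connection.
# $1: protocol, $2: NAT ip:port, $3: destination ip:port, $4: VM ip:port
track_egress() {
  conntrack="${conntrack}$1|$2|$3 $4
"
}

# Decide what happens to an ingress packet.
# $1: protocol, $2: remote source ip:port, $3: NAT ip:port it was sent to
handle_ingress() {
  key="$1|$3|$2"
  match=$(printf '%s' "$conntrack" | awk -v k="$key" '$1 == k { print $2; exit }')
  if [ -n "$match" ]; then
    echo "forward to $match"
  else
    echo "drop"   # no matching entry in the table
  fi
}

track_egress tcp 203.0.113.5:1024 198.51.100.7:443 10.0.0.2:51000
handle_ingress tcp 198.51.100.7:443 203.0.113.5:1024   # forward to 10.0.0.2:51000
handle_ingress tcp 198.51.100.9:443 203.0.113.5:1024   # drop
```

The second ingress packet comes from a remote address that the VM never contacted, so no entry matches and the packet is dropped.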

A connection entry might be missing due to any of the following reasons:

Address application impact

Before troubleshooting, confirm whether ingress packet drops are affecting your application by checking for application errors that coincide with spikes in dropped ingress packets. If the application is impacted, use one or more of the following methods to address the issue:

Need to allocate more IP addresses

Sometimes your VMs are unable to reach the internet because you don't have enough NAT IP addresses. Multiple factors can cause this problem. For more information, see the following table.

Root cause | Symptom | Solution
You've manually allocated addresses, but you haven't allocated enough of them, given your current port usage. | The Google Cloud console displays an error that reads "You need to allocate at least 'X' more IP addresses to allow all instances to access the internet." The value of the nat_allocation_failed metric is true. | Do one of the following: (1) Minimize your port usage, as described in Reduce your port usage. (2) Manually add more IP addresses, as described in Update external IP addresses associated with NAT.
You have surpassed a hard limit for NAT IP addresses. | The value of the nat_allocation_failed metric is true. | Minimize your port usage, as described in Reduce your port usage.

To monitor failures caused by an insufficient number of IP addresses, create an alert for the nat_allocation_failed metric. This metric is set to true if Google Cloud is unable to allocate sufficient IP addresses for any VM in your NAT gateway. For information about alert policies, see Defining alerting policies.
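For a quick estimate of how many NAT IP addresses a gateway needs with a fixed number of ports per VM, the arithmetic can be sketched as follows. The sketch assumes that each NAT IP address offers 64,512 usable source ports per protocol; treat that figure, the function name, and the sample numbers as assumptions to verify against the current Cloud NAT documentation.

```shell
# Estimate how many NAT IP addresses are needed when every VM reserves a
# fixed number of ports.
# Assumes each NAT IP address offers 64,512 usable source ports per protocol.
# $1: number of VMs, $2: ports reserved per VM
nat_ips_needed() {
  ports_per_ip=64512
  total=$(( $1 * $2 ))
  # Integer ceiling division.
  echo $(( (total + ports_per_ip - 1) / ports_per_ip ))
}

nat_ips_needed 2000 64   # prints 2
```

Under the same assumption, 1,008 VMs at 64 ports each exactly fill one address, and one more VM requires a second address.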

Reduce your port usage

You can minimize the number of ports that each VM uses in situations where allocating more NAT IP addresses is not possible or desirable.

To reduce port usage, complete the following steps:

  1. Disable Endpoint-Independent Mapping.
  2. Enable dynamic port allocation. To use dynamic port allocation, you set a minimum number of ports per VM and a maximum number of ports per VM. Cloud NAT automatically allocates a number of NAT source IP address and source port tuples between the minimum and maximum number of ports, inclusive. Using a low number for the minimum number of ports reduces wasting NAT source IP address and source port tuples on VMs with fewer active connections. If you encounter connection timeouts while ports are being allocated, see Reduce packet drops with dynamic port allocation.
  3. Determine the lowest possible minimum number of ports to meet your needs. There are several methods to do this, and most rely on reviewing the number of used ports (compute.googleapis.com/nat/port_usage) as input to the decision-making process. For information about how to find port usage, see View port usage. The following are two example methods to determine a minimum number of ports:
    • Consider the average value of compute.googleapis.com/nat/port_usage over a representative time period for a representative number of VMs.
    • Consider the most frequently occurring value of compute.googleapis.com/nat/port_usage over a representative time period for a representative number of VMs.
  4. Determine the lowest possible maximum number of ports to meet your needs. Once again, review compute.googleapis.com/nat/port_usage as input to your decision-making process. Consider the maximum value of compute.googleapis.com/nat/port_usage over a representative time period for a representative number of VMs as a starting point for the maximum number of ports. Keep in mind that setting the maximum number too high can prevent other VMs from receiving NAT source IP address and source port tuples.
  5. Finding the right values for minimum and maximum ports involves iterative testing. For steps to change minimum and maximum port numbers, see Change minimum or maximum ports when dynamic port allocation is configured.
  6. Review the NAT timeouts, their meanings, and their default values. If you need to rapidly create a series of TCP connections to the same destination 3-tuple, consider reducing the TCP TIME_WAIT timeout so that Cloud NAT can more quickly reuse NAT source IP address and source port tuples. This allows Cloud NAT to more quickly use the same 5-tuple instead of needing to use a unique 5-tuple, which might require allocation of additional NAT source IP address and source port tuples for each sending VM. For steps to change NAT timeouts, see Change NAT timeouts.
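The average, most frequent, and maximum values mentioned in steps 3 and 4 can be computed from a list of port_usage samples. The following is a minimal sketch that assumes you have already exported the samples as one integer per line; the function name and the sample data are hypothetical.

```shell
# Summarize port_usage samples read from standard input, one integer per line.
# Prints the mean (rounded down), the most frequent value, and the maximum.
summarize_port_usage() {
  awk '
    { sum += $1; count[$1]++; if (NR == 1 || $1 > max) max = $1 }
    END {
      if (NR == 0) exit 1
      mode = ""; best = 0
      for (v in count) {
        # Prefer the higher count; break ties toward the smaller value.
        if (count[v] > best || (count[v] == best && v + 0 < mode + 0)) {
          best = count[v]; mode = v
        }
      }
      printf "mean=%d mode=%d max=%d\n", int(sum / NR), mode, max
    }'
}

printf '%s\n' 30 30 45 30 120 45 | summarize_port_usage
# prints "mean=50 mode=30 max=120"
```

The mean or mode is a starting point for the minimum number of ports, and the maximum is a starting point for the maximum number of ports. The chosen values can then be applied with `gcloud compute routers nats update` by using the `--enable-dynamic-port-allocation`, `--min-ports-per-vm`, and `--max-ports-per-vm` flags. Check which values the gateway accepts before you apply them.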

Frequently asked questions

Regional restriction for Cloud NAT

Can I use the same Cloud NAT gateway in more than one region?

No. A Cloud NAT gateway cannot be associated with more than one region, VPC network, or Cloud Router.

If you need to provide connectivity for other regions or VPC networks, create additional Cloud NAT gateways for them.

Are the external NAT IP addresses used by Cloud NAT gateways global or regional?

Cloud NAT gateways use regional external IP addresses as NAT IP addresses. Even though they are regional, they are publicly routable. For information about different ways that NAT IP addresses can be allocated or assigned, see NAT IP addresses.

When Cloud NAT can and cannot be used

Does Cloud NAT apply to instances, including GKE node VMs, that have external IP addresses?

Generally, no. If the network interface of a VM has an external IP address, Google Cloud always performs 1-to-1 NAT for packets sent from the primary internal IP address of the network interface without using Cloud NAT. However, Cloud NAT could still provide NAT services to packets sent from alias IP address ranges of that same network interface. For additional details, see Cloud NAT specifications and Compute Engine interaction.

Does Public NAT let a source VM whose network interface lacks an external IP address send traffic to a destination VM or load balancer that has an external IP address, even when the source and destination are in the same VPC network?

Yes. The network path involves sending traffic out of the VPC network through a default internet gateway, and then receiving it in the same network.

When the source VM sends a packet to the destination, Public NAT performs source NAT (SNAT) before delivering the packet to the destination. Public NAT performs destination NAT (DNAT) for responses from the destination to the source VM. For a step-by-step example, see Basic Public NAT configuration and workflow.

Can I use Private NAT for communication between VMs in the same VPC network?

No, Private NAT doesn't perform NAT on traffic between VMs in the same VPC network.

Unsolicited incoming connections not supported

Does Cloud NAT allow for inbound connections (for example, SSH) to instances without external IP addresses?

No, Cloud NAT does not support unsolicited incoming connections. For more information, see Cloud NAT specifications. However, Google Cloud's network edge might respond to pings if the destination IP address is a Cloud NAT gateway external IP address that has active port mappings to at least one VM instance. To see IP addresses assigned to a Cloud NAT gateway, use the gcloud compute routers get-nat-ip-info command. External IP addresses marked as IN_USE might respond to pings.
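As a minimal sketch, the command mentioned above can be wrapped in a small helper. The function name, router name, and region are placeholders, and running the command requires the gcloud CLI and an authenticated project.

```shell
# List NAT IP address details for the NAT gateways on a Cloud Router.
# $1: Cloud Router name, $2: region (both are placeholders that you supply)
show_nat_ip_info() {
  gcloud compute routers get-nat-ip-info "$1" --region="$2"
}

# Example call with hypothetical names:
# show_nat_ip_info my-router us-central1
```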

If you need to connect to a VM that doesn't have an external IP address, see Choose a connection option for internal-only VMs. For example, as part of the Cloud NAT example Compute Engine setup, you connect to a VM without an external IP address by using Identity-Aware Proxy.

Cloud NAT and ports

Why does a VM have a fixed number of ports (64 by default)?

When a Cloud NAT gateway provides NAT for a VM, it reserves source address and source port tuples according to the port reservation procedure.

For more information, see port reservation examples.

Can I change the minimum number of ports reserved for a VM?

Yes. You can increase or decrease the minimum number of ports per VM when you create a new Cloud NAT gateway or by editing it later. Each Cloud NAT gateway reserves source address and source port tuples according to the port reservation procedure.

For additional information about decreasing the minimum number of ports, see the next question.

Can I decrease the minimum number of ports per VM after creating the Cloud NAT gateway?

Yes; however, decreasing the minimum number of ports could result in the port reservation procedure reserving a smaller number of ports per VM. When this happens, existing TCP connections might be reset and, if so, must be re-established.

When switching NAT mapping from Primary and Secondary ranges to Primary range only, are additional ports allocated to each instance immediately released?

No. Any additional ports used by secondary ranges are retained by instances until the minimum ports per VM setting is reduced. When Cloud NAT is configured to map Secondary (alias) ranges for subnets, Cloud NAT assigns a minimum of 1,024 ports per instance, based on the port reservation procedure.

When you switch to Primary ranges only, Cloud NAT keeps the additional ports that are already allocated to instances. After changing the ranges for which Cloud NAT is applied to Primary only, the actual number of ports assigned to those instances is not changed until the minimum ports per VM setting is also reduced.

To reduce the number of ports allocated to those instances, reduce the minimum ports per VM setting after you switch to primary ranges. After that value is reduced, Cloud NAT automatically lowers the number of ports allocated per instance, which reduces port consumption.

Cloud NAT and other Google services

Does Cloud NAT enable access to Google APIs and services?

When you enable Cloud NAT for a subnet's primary IP range, Google Cloud automatically enables Private Google Access. For more information, see Private Google Access interaction.

Investigate Cloud NAT issues with Gemini Cloud Assist

You can use Gemini Cloud Assist investigations to troubleshoot Cloud NAT issues.

To create an investigation, do the following:

  1. In the Google Cloud console, go to the Cloud NAT page.
  2. Click your Cloud NAT gateway.
  3. On the Cloud NAT gateway details page, click Investigate.
  4. In the investigation creation panel, describe the issue that you want to troubleshoot, select the affected resources, and then click Create to start the investigation.
    For more information, see Create an investigation.

For warnings and errors in your Cloud NAT configuration, the Investigate button is displayed with the alert. When you create an investigation for a warning or error, an issue description and the relevant resources are automatically pre-populated in the investigation creation panel.
