Runtime State Sharing in a Cluster

If several F5 NGINX Plus instances are organized in a cluster, they can share some state data between them, including sticky learn session persistence, request rate limits, and key-value storage.

All NGINX Plus instances can exchange state data with all other members in a cluster, provided that the shared memory zone has the same name on all cluster members.

State sharing across a cluster is eventually consistent by nature. We strongly recommend using data-center grade networks for clustering traffic, because latency, low bandwidth, and packet loss significantly degrade state consistency. We do not recommend stretching clusters over the Internet, regions, or availability zones.

Configuring zone synchronization

For each NGINX instance in a cluster, open the NGINX configuration file and perform the following steps:

  1. Enable synchronization between cluster nodes: in the top-level stream block, create a server with the zone_sync directive:
    nginx
 stream {  
    #...  
     server {  
         zone_sync;  
         #...  
     }  
 }  
  2. Specify all other NGINX instances in the cluster with the zone_sync_server directive. Cluster nodes can be added dynamically through DNS if a resolver is configured and the resolve parameter is set:
    nginx
 stream {  
     resolver 10.0.0.53 valid=20s;  
     server {  
         zone_sync;  
         zone_sync_server nginx-cluster.example.com:9000 resolve;  
     }  
 }  

Otherwise, add each cluster node statically with a separate zone_sync_server directive:
nginx

 stream {  
     server {  
         zone_sync;  
         zone_sync_server nginx-node1.example.com:9000;  
         zone_sync_server nginx-node2.example.com:9000;  
         zone_sync_server nginx-node3.example.com:9000;  
     }  
 }  
  3. Enable SSL by specifying the ssl parameter of the listen directive for the TCP server:
    nginx
stream {  
    resolver 10.0.0.53 valid=20s;  
    server {  
        listen 10.0.0.1:9000 ssl;  
        #...  
        zone_sync;  
        zone_sync_server nginx-cluster.example.com:9000 resolve;  
        #...  
    }  
}  
  4. Specify the path to the certificate with the ssl_certificate directive, and the path to the private key with the ssl_certificate_key directive. Both the certificate and the key must be in PEM format:
    nginx
stream {  
    resolver 10.0.0.53 valid=20s;  
    server {  
        listen 10.0.0.1:9000 ssl;  
        ssl_certificate     /etc/ssl/nginx-1.example.com.server_cert.pem;  
        ssl_certificate_key /etc/ssl/nginx-1.example.com.key.pem;  
        zone_sync;  
        zone_sync_server    nginx-cluster.example.com:9000 resolve;  
        #...  
    }  
}  
  5. Enable SSL for connections between cluster servers with the zone_sync_ssl directive, and enable verification of the other cluster server's certificate with the zone_sync_ssl_verify and zone_sync_ssl_trusted_certificate directives:
    nginx
 stream {  
     resolver 10.0.0.53 valid=20s;  
     server {  
         listen 10.0.0.1:9000 ssl;  
         ssl_certificate        /etc/ssl/nginx-1.example.com.server_cert.pem;  
         ssl_certificate_key    /etc/ssl/nginx-1.example.com.key.pem;  
         zone_sync;  
         zone_sync_server    nginx-cluster.example.com:9000 resolve;  
         zone_sync_ssl                     on;  
         zone_sync_ssl_verify              on;  
         zone_sync_ssl_trusted_certificate /etc/ssl/server_ca.pem;  
         #...  
     }  
 }  
  6. Set up certificate-based authentication across cluster nodes.
    Add the zone_sync_ssl_certificate and zone_sync_ssl_certificate_key directives to send a client certificate on outgoing connections.
    Then configure NGINX to require client certificates by setting the ssl_verify_client directive to on and specifying the CA that issued the client certificates with the ssl_trusted_certificate directive:
    nginx
 stream {  
     resolver 10.0.0.53 valid=20s;  
     server {  
         listen 10.0.0.1:9000 ssl;  
         ssl_certificate        /etc/ssl/nginx-1.example.com.server_cert.pem;  
         ssl_certificate_key    /etc/ssl/nginx-1.example.com.key.pem;  
         zone_sync;  
         zone_sync_server    nginx-cluster.example.com:9000 resolve;  
         zone_sync_ssl                     on;  
         zone_sync_ssl_verify              on;  
         zone_sync_ssl_trusted_certificate /etc/ssl/server_ca.pem;  
         zone_sync_ssl_certificate     localhost.crt;  
         zone_sync_ssl_certificate_key localhost.key;  
         ssl_verify_client       on;  
         ssl_trusted_certificate /etc/ssl/client_ca.pem;  
         #...  
    }  
}  

Fine-tuning Synchronization

Generally you do not have to tune sync options, but in some cases it can be useful to adjust some of these values:

nginx

#...
zone_sync;
zone_sync_server nginx-cluster.example.com:9000 resolve;

zone_sync_buffers                256 4k;
zone_sync_connect_retry_interval 1s;
zone_sync_connect_timeout        5s;
zone_sync_interval               1s;
zone_sync_timeout                5s;
#...

zone_sync_buffers - controls the number of buffers and their size. Increasing the number of buffers increases the amount of information that can be stored in them.

zone_sync_connect_retry_interval - sets the interval between connection attempts to a cluster node.

zone_sync_connect_timeout - sets the timeout for establishing a connection to a cluster node.

zone_sync_interval - sets the interval for polling updates in a shared memory zone. Increasing this value may result in data inconsistency between cluster nodes; decreasing it may lead to high consumption of CPU and memory resources.

zone_sync_timeout - sets the idle timeout of a TCP connection between cluster nodes. If the connection is idle for longer than this value, it is closed.
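
As a rough worked example for zone_sync_buffers, the memory reserved for buffers is the buffer count multiplied by the buffer size. The Python sketch below does that arithmetic for the 256 4k value shown in the example above. It assumes that buffers are allocated per synchronized zone; the helper names and the zone count of 3 are hypothetical and used only for illustration.

```python
# Sketch only: estimate the memory reserved by zone_sync_buffers.
# Assumption: buffers are allocated per synchronized zone.

UNITS = {"k": 1024, "m": 1024 * 1024}


def parse_size(size: str) -> int:
    """Convert an NGINX-style size such as '4k' or '1m' to bytes."""
    suffix = size[-1].lower()
    if suffix in UNITS:
        return int(size[:-1]) * UNITS[suffix]
    return int(size)


def sync_buffer_bytes(number: int, size: str, zones: int = 1) -> int:
    """Total bytes for `number` buffers of `size` across `zones` zones."""
    return number * parse_size(size) * zones


if __name__ == "__main__":
    # "zone_sync_buffers 256 4k;" for a single zone
    print(sync_buffer_bytes(256, "4k"))
    # The same setting with three synchronized zones (hypothetical count)
    print(sync_buffer_bytes(256, "4k", zones=3))
```

Under these assumptions, 256 buffers of 4k come to 1 MiB per zone, which gives a feel for how quickly larger values add up when several zones are synchronized.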

To start a new node, add its address to the cluster's DNS record and start NGINX Plus on the node.

When the node is started, it discovers other nodes from DNS or static configuration and starts sending updates. Other nodes eventually discover the new node using DNS and start pushing updates to it.

To stop a node, send it the QUIT signal (for example, with nginx -s quit).

As soon as the node receives the signal, it finishes zone synchronization and gracefully closes open connections.

To remove a node, delete its address from the cluster's DNS record.

When the node is removed, other nodes close connections to the removed node and will no longer try to connect to it. After the node is removed, it can be stopped.

Using synchronization in a cluster

Sticky learn zone synchronization

If your existing configuration already uses the sticky learn feature, its state can be shared across a cluster by adding the sync parameter to the existing sticky directive in the configuration file of each NGINX instance in the cluster. Note that the zone name must be the same on all NGINX nodes in the cluster:

nginx

upstream my_backend {
    zone my_backend 64k;

    server backends.example.com resolve;
    sticky learn zone=sessions:1m
                 create=$upstream_cookie_session
                 lookup=$cookie_session
                 sync;
}

server {
    listen 80;
    location / {
        proxy_pass http://my_backend;
    }
}

See Enabling Session Persistence for information on how to configure the “sticky learn” session persistence method.

Request limits zone synchronization

If your existing configuration already uses rate limiting, these limits can be applied across a cluster by simply adding the sync parameter to the limit_req_zone directive in the configuration file of each NGINX instance in a cluster:

nginx

limit_req_zone $remote_addr zone=req:1M rate=100r/s sync;

server {
    listen 80;
    location / {
        limit_req zone=req;

        proxy_pass http://my_backend;
    }
}

As with sticky learn, the zone name must be the same on all NGINX nodes in the cluster.

See Limiting the Request Rate for more information.

Key-value storage zone synchronization

Similar to rate limiting and sticky learn, the contents of the key-value shared memory zone can be shared across NGINX machines in a cluster with the sync parameter of the keyval_zone directive:

nginx

keyval_zone zone=one:32k state=/var/lib/nginx/state/one.keyval sync;
keyval $arg_text $text zone=one;
#...
server {
    #...
    location / {
        return 200 $text;
    }

    location /api {
        api write=on;
    }
}

See Dynamic Denylisting of IP Addresses for information on how to configure and manage the key-value storage.
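
Because the zone is declared with sync, an entry written through the API on one node is expected to appear on the other nodes as well. The following Python sketch builds the POST request that adds one pair to the zone named one through the /api/9/http/keyvals/ endpoint. The base URL, the key greeting, the value hello, and the helper name are assumptions for illustration; the request is only built here, not sent.

```python
import json
import urllib.request


def build_keyval_request(base_url: str, zone: str, key: str, value: str) -> urllib.request.Request:
    """Build (but do not send) a POST that adds one key-value pair to a zone."""
    body = json.dumps({key: value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/9/http/keyvals/{zone}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical node address, key, and value.
    req = build_keyval_request("http://127.0.0.1", "one", "greeting", "hello")
    print(req.get_method(), req.full_url)
    # To send it against a running node with the API enabled in write mode:
    # urllib.request.urlopen(req)
```

With the configuration above, a request to /?text=greeting on any cluster node would then be expected to return hello once the entry has been synchronized.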

Cluster state data can be monitored with NGINX Plus API metrics.

To access the API metrics, first configure the API:

  1. Enable the NGINX Plus API in read‑write mode with the api directive:
    nginx
# ...  
 server {  
     listen 80;  
     server_name www.example.com;  
     location /api {  
         api write=on;  
     }  
 }  
  2. It is highly recommended to restrict access to this location, for example by allowing access only from localhost (127.0.0.1) and by limiting the PATCH, POST, and DELETE methods to users authenticated with HTTP basic authentication:
    nginx
# ...  
 server {  
     listen 80;  
     server_name www.example.com;  
     location /api {  
         limit_except GET {  
             auth_basic "NGINX Plus API";  
             auth_basic_user_file /path/to/passwd/file;  
         }  
         api   write=on;  
         allow 127.0.0.1;  
         deny  all;  
     }  
 }  

See Using the API for Dynamic Configuration for instructions on how to configure and use the NGINX Plus API.

Polling Sync Status with the API

To get the synchronization status of the shared memory zones, send an API request, for example with curl:

curl -s '127.0.0.1/api/9/stream/zone_sync' | jq

The output will be:

json

{
  "zones" : {
    "zone1" : {
      "records_pending" : 2061,
      "records_total" : 260575
    },
    "zone2" : {
      "records_pending" : 0,
      "records_total" : 14749
    }
  },
  "status" : {
    "bytes_in" : 1364923761,
    "msgs_in" : 337236,
    "msgs_out" : 346717,
    "bytes_out" : 1402765472,
    "nodes_online" : 15
  }
}

If all nodes have approximately the same number of records (records_total) and an almost empty outgoing queue (records_pending), the cluster can be considered healthy.
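
The health rule above can be sketched as a small check over the zone_sync payloads collected from each node. This is a minimal Python sketch, not part of NGINX Plus: the function name and both thresholds (max_pending and max_spread) are assumptions chosen for illustration, since the rule itself only says "approximately the same" and "almost empty".

```python
def cluster_is_healthy(statuses, max_pending=100, max_spread=0.05):
    """Apply the health rule to parsed zone_sync payloads, one per node.

    Healthy means: every zone on every node has at most `max_pending`
    pending records, and for each zone the records_total values differ
    across nodes by at most `max_spread` (as a fraction of the largest).
    """
    totals = {}
    for status in statuses:
        for name, zone in status["zones"].items():
            if zone["records_pending"] > max_pending:
                return False  # outgoing queue is not "almost empty"
            totals.setdefault(name, []).append(zone["records_total"])
    for counts in totals.values():
        largest = max(counts)
        if largest and (largest - min(counts)) / largest > max_spread:
            return False  # nodes disagree too much on the record count
    return True


if __name__ == "__main__":
    # Hypothetical payloads from two nodes, trimmed to the "zones" part.
    node_a = {"zones": {"zone2": {"records_pending": 0, "records_total": 14749}}}
    node_b = {"zones": {"zone2": {"records_pending": 3, "records_total": 14702}}}
    print(cluster_is_healthy([node_a, node_b]))
```

Each payload would come from the same /api/9/stream/zone_sync request shown above, issued against every node in turn. Reasonable threshold values depend on traffic volume and on zone_sync_interval, so treat the defaults here as placeholders.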