Cryptographic error when adding controller to cluster

November 21, 2025, 3:27pm 1

Hello!

I have an issue when adding a controller to a cluster from another controller. Here is what I did:

My entire infrastructure is running inside a Docker Compose setup, which allows me to test different types of infrastructures.

I currently have three controllers (I’m trying to add them to a cluster).

First, I created the PKI with a root CA and three intermediate CAs, as specified in the documentation:

#!/bin/bash

function wait_for_internet {
    while ! echo > /dev/tcp/1.1.1.1/80 ; do
        echo "Internet unavailable, retrying..."
        sleep 1
    done
    echo "Internet is available ✅"
}

function install_openziti_binary {
    wait_for_internet
    curl -sS https://get.openziti.io/install.bash | bash -s openziti
    if command -v ziti &> /dev/null; then
        echo "OpenZiti binary installed successfully ✅"
    else
        echo "Failed to install OpenZiti binary ❌"
        exit 1
    fi
}

function create_root_ca {
    # Create the trust root, a self-signed CA
    ziti pki create ca \
        --pki-root /pki --ca-file root --ca-name 'Cluster Root CA' \
        --trust-domain ha.test 
}

function create_controllers_certs {
    # Create the controller 1 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl1 \
        --intermediate-name 'Controller One Signing Cert'

    # Create the controller 1 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl1 \
        --dns "localhost,ctrl1.ziti.example.com,controller1" \
        --ip "127.0.0.1,::1,controller1" \
        --server-name ctrl1 \
        --spiffe-id 'controller/ctrl1'

    # Create the controller 1 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl1 \
        --client-name ctrl1 \
        --spiffe-id 'controller/ctrl1'

    # Create the controller 2 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl2 \
        --intermediate-name 'Controller Two Signing Cert'

    # Create the controller 2 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl2 \
        --dns "localhost,ctrl2.ziti.example.com,controller2" \
        --ip "127.0.0.1,::1,controller2" \
        --server-name ctrl2 \
        --spiffe-id 'controller/ctrl2'

    # Create the controller 2 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl2 \
        --client-name ctrl2 \
        --spiffe-id 'controller/ctrl2'

    # Create the controller 3 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl3 \
        --intermediate-name 'Controller Three Signing Cert'

    # Create the controller 3 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl3 \
        --dns "localhost,ctrl3.ziti.example.com,controller3" \
        --ip "127.0.0.1,::1,controller3" \
        --server-name ctrl3 \
        --spiffe-id 'controller/ctrl3'

    # Create the controller 3 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl3 \
        --client-name ctrl3 \
        --spiffe-id 'controller/ctrl3'
}

function main {
    wait_for_internet
    apt update
    apt install -y iproute2 jq tcpdump iptables curl iputils-ping wget iproute2 net-tools gnupg dnsutils 
    if ! install_openziti_binary ; then
        echo "Installation of OpenZiti binary failed. Exiting."
        exit 1
    fi

    rm -rf /pki/*
    create_root_ca
    create_controllers_certs
    rm -rf /shared_pki/*
    cp -Rv /pki /shared_pki
}

main "$@"

So it will create a shared volume with those certs:

root@controller1:/# tree /controller_certs/pki/
/controller_certs/pki/
|-- ctrl1
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl1.cert
|   |   |-- ctrl1.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl1.key
|   |   `-- server.key
|   `-- serial
|-- ctrl2
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl2.cert
|   |   |-- ctrl2.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl2.key
|   |   `-- server.key
|   `-- serial
|-- ctrl3
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl3.cert
|   |   |-- ctrl3.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl3.key
|   |   `-- server.key
|   `-- serial
`-- root
    |-- certs
    |   |-- ctrl1.cert
    |   |-- ctrl2.cert
    |   |-- ctrl3.cert
    |   `-- root.cert
    |-- crlnumber
    |-- crls
    |-- index.txt
    |-- index.txt.attr
    |-- keys
    |   |-- ctrl1.key
    |   |-- ctrl2.key
    |   |-- ctrl3.key
    |   `-- root.key
    `-- serial

17 directories, 51 files

On each controller container, I then copy those certificates into the controller’s actual PKI directory. I install openziti-controller first, so this step wipes the PKI it generated by default (its own root CA and intermediate) and replaces it with the shared one:

function load_files_for_ha {
    while [ "$(find /controller_certs/pki/"$CONTROLLER_NAME" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l)" -lt 6 ]; do
        echo "Certificates for $CONTROLLER_NAME not found, retrying..."
        sleep 1
    done
    echo "Certificates for $CONTROLLER_NAME found ✅"
    rm -rf ./pki/*
    mkdir -p ./pki/"$CONTROLLER_NAME"
    mkdir -p ./pki/root
    if ! cp -Rv /controller_certs/pki/"$CONTROLLER_NAME"/* ./pki/"$CONTROLLER_NAME"; then
        echo "Failed to copy certificates for $CONTROLLER_NAME ❌"
        exit 1
    fi
    if ! cp -Rv /controller_certs/pki/root/* ./pki/root/; then
        echo "Failed to copy certificates for ROOTCA ❌"
        exit 1
    fi
    #mv ./pki/intermediate/certs/"$CONTROLLER_NAME".cert ./pki/intermediate/certs/intermediate.cert
    #mv ./pki/intermediate/certs/"$CONTROLLER_NAME".chain.pem ./pki/intermediate/certs/intermediate.chain.pem
    #mv ./pki/intermediate/keys/"$CONTROLLER_NAME".key ./pki/intermediate/keys/intermediate.key

    # Edit the controller config.yml
    mkdir -p /var/lib/private/ziti-controller/cluster
    # enable clustering
    echo -e "cluster:\n  dataDir: /var/lib/private/ziti-controller/cluster" >> config.yml
    # Replacing paths to use the copied certs
    sed -i 's|pki/root/certs/root.cert|pki/root/certs/root.cert|g' config.yml
    sed -i 's|pki/intermediate/certs/client.chain.pem|pki/'"$CONTROLLER_NAME"'/certs/client.chain.pem|g' config.yml
    sed -i 's|pki/intermediate/certs/server.chain.pem|pki/'"$CONTROLLER_NAME"'/certs/server.chain.pem|g' config.yml
    sed -i 's|pki/intermediate/keys/server.key|pki/'"$CONTROLLER_NAME"'/keys/server.key|g' config.yml

    sed -i 's|pki/intermediate/certs/intermediate.cert|pki/'"$CONTROLLER_NAME"'/certs/'"$CONTROLLER_NAME"'.cert|g' config.yml
    sed -i 's|pki/intermediate/keys/intermediate.key|pki/'"$CONTROLLER_NAME"'/keys/'"$CONTROLLER_NAME"'.key|g' config.yml
}

After updating the paths to point to the new certificates, the configuration looks like this:

[...]
db:                     "/var/lib/private/ziti-controller/bbolt.db"
identity:
  cert:        "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key:         "pki/ctrl1/keys/server.key"
  ca:          "pki/root/certs/root.cert"
  #alt_server_certs:
  #  - server_cert:  ""
  #    server_key:   ""
[...]
web:
  - name: client-management
    bindPoints:
      - interface: 0.0.0.0:6262
        address: controller1:6262
    identity:
      ca:          "pki/root/certs/root.cert"
      key:         "pki/ctrl1/keys/server.key"
      server_cert: "pki/ctrl1/certs/server.chain.pem"
      cert:        "pki/ctrl1/certs/client.chain.pem"

    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-management
        options: { }
      - binding: edge-client
        options: { }
      - binding: fabric
        options: { }
      - binding: edge-oidc
        options: { }
      - binding: zac
        options:
          location: /opt/openziti/share/console
          indexFile: index.html
cluster:
  dataDir: /var/lib/private/ziti-controller/cluster
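Since the sed rewrites above silently do nothing when a pattern does not match, it can help to check that every file named in an identity block actually exists. The following is only a rough sketch: the function name missing_identity_files is made up for illustration, it only looks at the cert, server_cert, key, server_key and ca fields, and it assumes relative paths are resolved against a base directory passed as the second argument.

```shell
# Hypothetical helper: list identity files named in a controller config
# that do not exist on disk. Returns 0 when nothing is missing.
missing_identity_files() {
    local config="$1" base_dir="${2:-.}"
    local paths path full status=0

    # Pull the values of the identity fields, skipping commented-out lines
    # and empty values, and de-duplicate them.
    paths="$(sed -nE 's/^[[:space:]]*(cert|server_cert|key|server_key|ca):[[:space:]]*"?([^"#[:space:]]+)"?.*$/\2/p' "$config" | sort -u)"

    while IFS= read -r path; do
        [ -n "$path" ] || continue
        case "$path" in
            /*) full="$path" ;;            # absolute path, use as-is
            *)  full="$base_dir/$path" ;;  # relative path, resolve against base_dir
        esac
        if [ ! -f "$full" ]; then
            echo "missing: $path"
            status=1
        fi
    done <<EOF
$paths
EOF
    return "$status"
}

# Example (run from the directory that holds config.yml and ./pki):
#   missing_identity_files config.yml .
```

This only confirms that the files are there. It does not tell you whether a key actually belongs to the certificate it is paired with.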

After loading the certificates and editing the configuration file, I tried to add controller2 to the cluster and got the following error, which I have not been able to explain:

root@controller1:/# ziti agent cluster add tls:controller2:6262
cluster add failed: unable to dial tls:controller2:6262: remote error: tls: error decrypting message

Logs of controller1:

{"_context":"tls:controller2:6262","error":"remote error: tls: error decrypting message","file":"github.com/openziti/channel/v4@v4.2.35/classic_dialer.go:96","func":"github.com/openziti/channel/v4.(*classicDialer).CreateWithHeaders","level":"warning","msg":"error initiating channel with hello","time":"2025-11-21T15:02:23.359Z"}
{"file":"github.com/openziti/channel/v4@v4.2.35/message.go:732","func":"github.com/openziti/channel/v4.getRetryVersionFor","level":"info","msg":"defaulting to version 2","time":"2025-11-21T15:02:23.359Z"}
{"_context":"tls:controller2:6262","file":"github.com/openziti/channel/v4@v4.2.35/classic_dialer.go:100","func":"github.com/openziti/channel/v4.(*classicDialer).CreateWithHeaders","level":"warning","msg":"Retrying dial with protocol version 2","time":"2025-11-21T15:02:23.359Z"}
{"_context":"ch{agent}-\u003eu{existing}-\u003ei{ABMD}","file":"github.com/openziti/ziti/common/handler_common/common.go:34","func":"github.com/openziti/ziti/common/handler_common.SendOpResult","level":"error","msg":"agent error performing cluster.add-peer: (unable to dial tls:controller2:6262: remote error: tls: error decrypting message)","operation":"cluster.add-peer","time":"2025-11-21T15:02:23.387Z"}

Logs of controller2:

{"_context":"tls:0.0.0.0:6262","error":"tls: invalid signature by the client certificate: crypto/rsa: verification error","file":"github.com/openziti/transport/v2@v2.0.193/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.30.2.5:41816","time":"2025-11-21T15:02:23.359Z"}
{"_context":"tls:0.0.0.0:6262","error":"tls: invalid signature by the client certificate: crypto/rsa: verification error","file":"github.com/openziti/transport/v2@v2.0.193/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.30.2.5:41820","time":"2025-11-21T15:02:23.387Z"}

And to help with debugging, here is the complete tree of the shared volume (middle) and the certificates automatically loaded in each controller’s root PKI (left: controller1, right: controller2):

[image: directory trees of the shared volume and of each controller’s PKI]

Do you have any idea what the issue could be? I know the HA controller is still in beta, and I didn’t see any error like this on the Discourse…

Thanks

plorenz November 24, 2025, 3:49pm 2

Hi @Damien, looking for differences in config, I see that the example HA setup uses the following for identity.ca:

identity:
  ca: ./pki/ctrl1/certs/ctrl1.chain.pem

where you have

identity:
  ca:          "pki/root/certs/root.cert"

Do you want to try switching that and see if it resolves the issue?

Paul

Damien November 28, 2025, 9:23am 3

Hi, I already tested the configuration you recommended. I even brute-forced almost all possibilities of certificate paths, including those in the web section, but this resulted in the same error.

Damien December 3, 2025, 10:21am 4

Hi, any idea of what the root cause could be?

andrew.martinez 5

The error indicates that the wrong private keys are being used for the controller’s client certs.

Looking at your PKI generation, I do not see the --key-file flag to re-use the same server private key, so the client cert for each controller creates its own private key.

Seen here from your tree output, this is for ctrl1

|   |   |-- client.key
|   |   |-- ctrl1.key
|   |   `-- server.key

Then in your controller’s configuration you specify the identity as

identity:
  cert:        "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key:         "pki/ctrl1/keys/server.key"
  ca:          "pki/root/certs/root.cert"

You only have one key file described (key), which the controller will pair with the server_cert and cert. This is a convenience feature that allows you to use 1 private key for both the client and server certs.

Either add the --key-file flag when generating the client cert so that it uses the same private key as the server cert, or alter your config and ensure you copy over the client.key and update your identity configuration to include two keys:

identity:
  cert:        "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key:         "pki/ctrl1/keys/client.key"
  server_key:  "pki/ctrl1/keys/server.key"
  ca:          "pki/root/certs/root.cert"

In situations where the private keys live next to each other on a file system (rather than in hardware like an HSM), having two keys doesn’t make anything more secure or performant. We generally deploy with one key used for both the cert and server_cert.
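One way to confirm this kind of mismatch before starting the controllers is to compare the public key embedded in a certificate with the public key derived from the private key. This is a minimal sketch using the openssl CLI, not anything provided by ziti. The function name cert_matches_key is made up, and it assumes that the leaf certificate comes first in a chain file, because openssl x509 only reads the first certificate.

```shell
# Hypothetical helper: succeed (0) when the certificate and the private key
# share the same public key, fail (1) when they differ, and return 2 when
# either file cannot be read by openssl.
cert_matches_key() {
    local cert_file="$1" key_file="$2"
    local cert_pub key_pub

    # Public key of the first certificate in the file, in PEM form
    cert_pub="$(openssl x509 -in "$cert_file" -noout -pubkey 2>/dev/null)" || return 2
    # Public key derived from the private key, in the same PEM form
    key_pub="$(openssl pkey -in "$key_file" -pubout 2>/dev/null)" || return 2

    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example with the paths from the identity block discussed above:
#   cert_matches_key pki/ctrl1/certs/client.chain.pem pki/ctrl1/keys/server.key \
#       || echo "client cert does not match the configured key"
```

With the original PKI from this thread, the check would be expected to fail for client.chain.pem against server.key and pass against client.key.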

Additionally, if you use openssl to generate your own certificates, it is possible to create a single certificate that is suitable for both client and server usage. Doing so leaks some client details in server connections and some server details in client connections, but identity blocks support it: you specify only the cert and key fields. That is the reason for the built-in flexibility of this configuration block.
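For illustration, here is a rough sketch of what such a dual-use certificate could look like with the openssl CLI. It creates a throwaway self-signed certificate purely to show the extendedKeyUsage extension; in a real cluster the certificate would be signed by the controller's intermediate CA instead. The function name, file names, and one-day validity are all assumptions, and -addext requires OpenSSL 1.1.1 or newer.

```shell
# Hypothetical helper: create a self-signed certificate whose extended key
# usage allows both TLS server and TLS client authentication.
make_dual_use_cert() {
    local out_dir="$1" common_name="$2"

    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$out_dir/dual.key" \
        -out "$out_dir/dual.cert" \
        -subj "/CN=$common_name" \
        -days 1 \
        -addext "extendedKeyUsage=serverAuth,clientAuth" 2>/dev/null
}

# Example:
#   make_dual_use_cert /tmp ctrl1
#   openssl x509 -in /tmp/dual.cert -noout -text | grep -A1 "Extended Key Usage"
```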

Damien December 15, 2025, 8:33am 6

Hello!

Sorry for the late reply, I was not available.

I just tried the --key-file flag that @andrew.martinez suggested and it worked perfectly!

Here is the full rootca configuration I wrote in bash:

#!/bin/bash

function wait_for_internet {
    while ! echo > /dev/tcp/1.1.1.1/80 ; do
        echo "Internet unavailable, retrying..."
        sleep 1
    done
    echo "Internet is available ✅"
}

function install_openziti_binary {
    wait_for_internet
    curl -sS https://get.openziti.io/install.bash | bash -s openziti
    if command -v ziti &> /dev/null; then
        echo "OpenZiti binary installed successfully ✅"
    else
        echo "Failed to install OpenZiti binary ❌"
        exit 1
    fi
}

function create_root_ca {
    # Create the trust root, a self-signed CA
    ziti pki create ca \
        --pki-root /pki --ca-file root --ca-name 'Cluster Root CA' \
        --trust-domain ha.test 
}

function create_controllers_certs {
    # Create the controller 1 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl1 \
        --intermediate-name 'Controller One Signing Cert'

    # Create the controller 1 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl1 \
        --dns "localhost,controller1" \
        --ip "127.0.0.1,::1,controller1" \
        --server-name ctrl1 \
        --spiffe-id 'controller/ctrl1'

    # Create the controller 1 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl1 \
        --client-name ctrl1 \
        --key-file server \
        --spiffe-id 'controller/ctrl1'

    # Create the controller 2 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl2 \
        --intermediate-name 'Controller Two Signing Cert'

    # Create the controller 2 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl2 \
        --dns "localhost,controller2" \
        --ip "127.0.0.1,::1,controller2" \
        --server-name ctrl2 \
        --spiffe-id 'controller/ctrl2'

    # Create the controller 2 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl2 \
        --client-name ctrl2 \
        --key-file server \
        --spiffe-id 'controller/ctrl2'

    # Create the controller 3 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl3 \
        --intermediate-name 'Controller Three Signing Cert'

    # Create the controller 3 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl3 \
        --dns "localhost,controller3" \
        --ip "127.0.0.1,::1,controller3" \
        --server-name ctrl3 \
        --spiffe-id 'controller/ctrl3'

    # Create the controller 3 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl3 \
        --client-name ctrl3 \
        --key-file server \
        --spiffe-id 'controller/ctrl3'
}

function main {
    wait_for_internet
    apt update
    apt install -y iproute2 jq tcpdump iptables curl iputils-ping wget iproute2 net-tools gnupg dnsutils 
    if ! install_openziti_binary ; then
        echo "Installation of OpenZiti binary failed. Exiting."
        exit 1
    fi

    rm -rf /pki/*
    create_root_ca
    create_controllers_certs
    rm -rf /shared_pki/*
    cp -Rv /pki /shared_pki
}

main "$@"

For each ziti pki create client call, I added --key-file server. The ziti CLI automatically resolves this to the server.key file under the CA given by --ca-name in the same command (e.g. ctrl2).

Now the server.key is the private key used for both client and server cryptographic communications.
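Because the three controllers differ only by name, the repeated blocks in the script above could be folded into one function. This is just a sketch of that idea, not a tested replacement. It uses only the ziti pki flags that already appear in the script, PKI_ROOT is an assumed variable, and the hostname is left out of the --ip list here because it is not an IP address (it is still covered by --dns).

```shell
# Assumed variable: where the PKI is written (the script above uses /pki)
PKI_ROOT="${PKI_ROOT:-/pki}"

# Create the intermediate, server, and client certs for one controller.
# $1 = CA/cert name (e.g. ctrl1), $2 = hostname (e.g. controller1),
# $3 = display name of the intermediate CA
create_controller_certs() {
    local name="$1" host="$2" label="$3"

    ziti pki create intermediate \
        --pki-root "$PKI_ROOT" \
        --ca-name root \
        --intermediate-file "$name" \
        --intermediate-name "$label" || return 1

    ziti pki create server \
        --pki-root "$PKI_ROOT" \
        --ca-name "$name" \
        --dns "localhost,$host" \
        --ip "127.0.0.1,::1" \
        --server-name "$name" \
        --spiffe-id "controller/$name" || return 1

    # --key-file server makes the client cert reuse the server's private key
    ziti pki create client \
        --pki-root "$PKI_ROOT" \
        --ca-name "$name" \
        --client-name "$name" \
        --key-file server \
        --spiffe-id "controller/$name" || return 1
}

create_all_controllers_certs() {
    create_controller_certs ctrl1 controller1 'Controller One Signing Cert' || return 1
    create_controller_certs ctrl2 controller2 'Controller Two Signing Cert' || return 1
    create_controller_certs ctrl3 controller3 'Controller Three Signing Cert' || return 1
}
```

Adding a fourth controller would then be one more create_controller_certs line.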

After that, I just executed the command to add the ctrl2 to the ctrl1’s cluster:

root@controller1:/# ziti agent cluster add tls:controller2:6262
success, added peer at tls:controller2:6262 to cluster

Logs:

[  25.386]    INFO github.com/hashicorp/raft.(*Raft).appendConfigurationEntry: {server-addr=[tls:controller2:6262] command=[AddVoter] servers=[[[{Suffrage:Voter ID:ctrl1 Address:tls:controller1:6262} {Suffrage:Voter ID:ctrl2 Address:tls:controller2:6262}]]] server-id=[ctrl2]} updating configuration
[  25.387]    INFO github.com/hashicorp/raft.(*Raft).startStopReplication: {peer=[ctrl2]} added peer, starting replication
[  25.387]    INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262]} dialing raft peer channel
[  25.387]    INFO ziti/controller/raft/mesh.(*impl).GetOrConnectPeer: {address=[tls:controller2:6262]} establishing new raft peer channel
[  25.388]    INFO ziti/controller.(*Controller).routerDispatchCallback: {addresses=[[tls:controller1:6262 tls:controller2:6262]] index=[7]} syncing updated ctrl addresses to connected routers
[  25.400]    INFO ziti/common/metrics.ConfigureGoroutinesPoolMetrics.ConfigureGoroutinesPoolMetrics.GoroutinesPoolMetricsConfigF.func1.func2: {idleTime=[1s] minWorkers=[0] maxWorkers=[1] poolType=[command_handler] maxQueueSize=[250]} starting goroutine pool
[  25.400]    INFO ziti/controller/raft/mesh.(*impl).PeerConnected: {peerId=[ctrl2] peerAddr=[tls:controller2:6262]} peer connected
[  25.400]    INFO ziti/controller/raft/mesh.(*impl).GetOrConnectPeer: {peerId=[ctrl2] address=[tls:controller2:6262]} established new raft peer channel
[  25.400]    INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262] peerId=[ctrl2]} invoking raft connect on established peer channel
[  25.400]    INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} sending connect msg to raft peer
[  25.401]    INFO ziti/controller/model.(*ControllerManager).UpdateControllerState: acting as leader, updating controllers from peers, connectEvt? true, self: 9221f0963e32beb0040e99461efa77e36d8397da, peer count: 1, peers: 594378559ce5521355a2d36622da7008e8f1eddb
[  25.401]    INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} raft peer connected
[  25.403]    INFO github.com/hashicorp/raft.(*Raft).pipelineReplicate: {peer=[{Voter ctrl2 tls:controller2:6262}]} pipelining replication
[  25.403]    INFO ziti/controller/raft.(*BoltDbFsm).Apply: {index=[8]} apply log with type *command.CreateEntityCommand[*github.com/openziti/ziti/controller/model.Controller]
[  25.406]    INFO ziti/controller/raft.(*BoltDbFsm).Apply: {index=[9]} apply log with type *command.UpdateEntityCommand[*github.com/openziti/ziti/controller/model.Controller]
[  25.931]    INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262]} dialing raft peer channel
[  25.931]    INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262] peerId=[ctrl2]} invoking raft connect on established peer channel
[  25.931]    INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} sending connect msg to raft peer
[  25.931]    INFO ziti/controller/raft/mesh.(*Peer).Connect: {address=[tls:controller2:6262] peerId=[ctrl2]} raft peer connected

Worked perfectly, thanks y’all.

Have a nice day 🙂