Cryptographic error when adding controller to cluster
Damien November 21, 2025, 3:27pm 1
Hello!
I have an issue when adding a controller to a cluster from another controller. Here is what I did:
My entire infrastructure is running inside a Docker Compose setup, which allows me to test different types of infrastructures.
I currently have three controllers (I’m trying to add them to a cluster).
First, I created the PKI with a root CA and three intermediate CAs, as specified in the documentation:
#!/bin/bash
function wait_for_internet {
    while ! echo > /dev/tcp/1.1.1.1/80 ; do
        echo "Internet unavailable, retrying..."
        sleep 1
    done
    echo "Internet is available ✅"
}
function install_openziti_binary {
    wait_for_internet
    curl -sS https://get.openziti.io/install.bash | bash -s openziti
    if command -v ziti &> /dev/null; then
        echo "OpenZiti binary installed successfully ✅"
    else
        echo "Failed to install OpenZiti binary ❌"
        exit 1
    fi
}
function create_root_ca {
    # Create the trust root, a self-signed CA
    ziti pki create ca \
        --pki-root /pki --ca-file root --ca-name 'Cluster Root CA' \
        --trust-domain ha.test
}
function create_controllers_certs {
    # Create the controller 1 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl1 \
        --intermediate-name 'Controller One Signing Cert'
    # Create the controller 1 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl1 \
        --dns "localhost,ctrl1.ziti.example.com,controller1" \
        --ip "127.0.0.1,::1,controller1" \
        --server-name ctrl1 \
        --spiffe-id 'controller/ctrl1'
    # Create the controller 1 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl1 \
        --client-name ctrl1 \
        --spiffe-id 'controller/ctrl1'
    # Create the controller 2 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl2 \
        --intermediate-name 'Controller Two Signing Cert'
    # Create the controller 2 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl2 \
        --dns "localhost,ctrl2.ziti.example.com,controller2" \
        --ip "127.0.0.1,::1,controller2" \
        --server-name ctrl2 \
        --spiffe-id 'controller/ctrl2'
    # Create the controller 2 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl2 \
        --client-name ctrl2 \
        --spiffe-id 'controller/ctrl2'
    # Create the controller 3 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl3 \
        --intermediate-name 'Controller Three Signing Cert'
    # Create the controller 3 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl3 \
        --dns "localhost,ctrl3.ziti.example.com,controller3" \
        --ip "127.0.0.1,::1,controller3" \
        --server-name ctrl3 \
        --spiffe-id 'controller/ctrl3'
    # Create the controller 3 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl3 \
        --client-name ctrl3 \
        --spiffe-id 'controller/ctrl3'
}
function main {
    wait_for_internet
    apt update
    apt install -y iproute2 jq tcpdump iptables curl iputils-ping wget iproute2 net-tools gnupg dnsutils
    if ! install_openziti_binary ; then
        echo "Installation of OpenZiti binary failed. Exiting."
        exit 1
    fi
    rm -rf /pki/*
    create_root_ca
    create_controllers_certs
    rm -rf /shared_pki/*
    cp -Rv /pki /shared_pki
}
main "$@"
This populates a shared volume with the following certificates:
root@controller1:/# tree /controller_certs/pki/
/controller_certs/pki/
|-- ctrl1
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl1.cert
|   |   |-- ctrl1.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl1.key
|   |   `-- server.key
|   `-- serial
|-- ctrl2
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl2.cert
|   |   |-- ctrl2.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl2.key
|   |   `-- server.key
|   `-- serial
|-- ctrl3
|   |-- certs
|   |   |-- client.cert
|   |   |-- client.chain.pem
|   |   |-- ctrl3.cert
|   |   |-- ctrl3.chain.pem
|   |   |-- server.cert
|   |   `-- server.chain.pem
|   |-- crlnumber
|   |-- crls
|   |-- index.txt
|   |-- index.txt.attr
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl3.key
|   |   `-- server.key
|   `-- serial
`-- root
    |-- certs
    |   |-- ctrl1.cert
    |   |-- ctrl2.cert
    |   |-- ctrl3.cert
    |   `-- root.cert
    |-- crlnumber
    |-- crls
    |-- index.txt
    |-- index.txt.attr
    |-- keys
    |   |-- ctrl1.key
    |   |-- ctrl2.key
    |   |-- ctrl3.key
    |   `-- root.key
    `-- serial
17 directories, 51 files
On each controller container, I then copy those certificates into the controller’s actual PKI root directory. I install openziti-controller beforehand, so this step overwrites the PKI it created by default (its own root CA and controller intermediate):
function load_files_for_ha {
    while [ "$(find /controller_certs/pki/"$CONTROLLER_NAME" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l)" -lt 6 ]; do
        echo "Certificates for $CONTROLLER_NAME not found, retrying..."
        sleep 1
    done
    echo "Certificates for $CONTROLLER_NAME found ✅"
    rm -rf ./pki/*
    mkdir -p ./pki/"$CONTROLLER_NAME"
    mkdir -p ./pki/root
    if ! cp -Rv /controller_certs/pki/"$CONTROLLER_NAME"/* ./pki/"$CONTROLLER_NAME"; then
        echo "Failed to copy certificates for $CONTROLLER_NAME ❌"
        exit 1
    fi
    if ! cp -Rv /controller_certs/pki/root/* ./pki/root/; then
        echo "Failed to copy certificates for ROOTCA ❌"
        exit 1
    fi
    #mv ./pki/intermediate/certs/"$CONTROLLER_NAME".cert ./pki/intermediate/certs/intermediate.cert
    #mv ./pki/intermediate/certs/"$CONTROLLER_NAME".chain.pem ./pki/intermediate/certs/intermediate.chain.pem
    #mv ./pki/intermediate/keys/"$CONTROLLER_NAME".key ./pki/intermediate/keys/intermediate.key
    # Edit the controller config.yml
    mkdir -p /var/lib/private/ziti-controller/cluster
    # enable clustering
    echo -e "cluster:\n dataDir: /var/lib/private/ziti-controller/cluster" >> config.yml
    # Replacing paths to use the copied certs
    sed -i 's|pki/root/certs/root.cert|pki/root/certs/root.cert|g' config.yml
    sed -i 's|pki/intermediate/certs/client.chain.pem|pki/'"$CONTROLLER_NAME"'/certs/client.chain.pem|g' config.yml
    sed -i 's|pki/intermediate/certs/server.chain.pem|pki/'"$CONTROLLER_NAME"'/certs/server.chain.pem|g' config.yml
    sed -i 's|pki/intermediate/keys/server.key|pki/'"$CONTROLLER_NAME"'/keys/server.key|g' config.yml
    sed -i 's|pki/intermediate/certs/intermediate.cert|pki/'"$CONTROLLER_NAME"'/certs/'"$CONTROLLER_NAME"'.cert|g' config.yml
    sed -i 's|pki/intermediate/keys/intermediate.key|pki/'"$CONTROLLER_NAME"'/keys/'"$CONTROLLER_NAME"'.key|g' config.yml
}
This is the resulting configuration, with the paths updated to point to the new certificates:
[...]
db: "/var/lib/private/ziti-controller/bbolt.db"
identity:
  cert: "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key: "pki/ctrl1/keys/server.key"
  ca: "pki/root/certs/root.cert"
  #alt_server_certs:
  #  - server_cert: ""
  #    server_key: ""
[...]
web:
  - name: client-management
    bindPoints:
      - interface: 0.0.0.0:6262
        address: controller1:6262
    identity:
      ca: "pki/root/certs/root.cert"
      key: "pki/ctrl1/keys/server.key"
      server_cert: "pki/ctrl1/certs/server.chain.pem"
      cert: "pki/ctrl1/certs/client.chain.pem"
    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-management
        options: { }
      - binding: edge-client
        options: { }
      - binding: fabric
        options: { }
      - binding: edge-oidc
        options: { }
      - binding: zac
        options:
          location: /opt/openziti/share/console
          indexFile: index.html
cluster:
  dataDir: /var/lib/private/ziti-controller/cluster
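Since every identity block in this configuration points at files under pki/, one quick sanity check is to confirm that each referenced path actually exists. The following is only a minimal sketch and not part of the OpenZiti tooling: the function name missing_pki_paths and its base_dir parameter are hypothetical, and it assumes the relative paths resolve against the directory the controller is started from.

```shell
#!/usr/bin/env bash
# Print every pki/... path referenced in a config file that does not exist on disk.
# Both the function name and the base_dir parameter are hypothetical, for illustration only.
missing_pki_paths() {
    local config="$1" base_dir="${2:-.}" path
    # Skip commented-out lines, then extract each token that starts with pki/
    # and runs up to the next double quote or whitespace.
    grep -v '^[[:space:]]*#' "$config" \
        | grep -o 'pki/[^"[:space:]]*' \
        | sort -u \
        | while IFS= read -r path; do
            # Report only the paths that are not regular files under base_dir.
            [ -f "${base_dir}/${path}" ] || printf '%s\n' "$path"
        done
}

# Example call (prints nothing when every referenced file is present):
#   missing_pki_paths config.yml .
```

An empty result only shows that the files exist. It says nothing about whether a given key belongs to a given certificate.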
After loading the certificates and editing the configuration file, adding controller2 to the cluster fails with the following error, and I have been trying to figure out why:
root@controller1:/# ziti agent cluster add tls:controller2:6262
cluster add failed: unable to dial tls:controller2:6262: remote error: tls: error decrypting message
Logs of controller1:
{"_context":"tls:controller2:6262","error":"remote error: tls: error decrypting message","file":"github.com/openziti/channel/v4@v4.2.35/classic_dialer.go:96","func":"github.com/openziti/channel/v4.(*classicDialer).CreateWithHeaders","level":"warning","msg":"error initiating channel with hello","time":"2025-11-21T15:02:23.359Z"}
{"file":"github.com/openziti/channel/v4@v4.2.35/message.go:732","func":"github.com/openziti/channel/v4.getRetryVersionFor","level":"info","msg":"defaulting to version 2","time":"2025-11-21T15:02:23.359Z"}
{"_context":"tls:controller2:6262","file":"github.com/openziti/channel/v4@v4.2.35/classic_dialer.go:100","func":"github.com/openziti/channel/v4.(*classicDialer).CreateWithHeaders","level":"warning","msg":"Retrying dial with protocol version 2","time":"2025-11-21T15:02:23.359Z"}
{"_context":"ch{agent}-\u003eu{existing}-\u003ei{ABMD}","file":"github.com/openziti/ziti/common/handler_common/common.go:34","func":"github.com/openziti/ziti/common/handler_common.SendOpResult","level":"error","msg":"agent error performing cluster.add-peer: (unable to dial tls:controller2:6262: remote error: tls: error decrypting message)","operation":"cluster.add-peer","time":"2025-11-21T15:02:23.387Z"}
Logs of controller2:
{"_context":"tls:0.0.0.0:6262","error":"tls: invalid signature by the client certificate: crypto/rsa: verification error","file":"github.com/openziti/transport/v2@v2.0.193/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.30.2.5:41816","time":"2025-11-21T15:02:23.359Z"}
{"_context":"tls:0.0.0.0:6262","error":"tls: invalid signature by the client certificate: crypto/rsa: verification error","file":"github.com/openziti/transport/v2@v2.0.193/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.30.2.5:41820","time":"2025-11-21T15:02:23.387Z"}
And to help with debugging, here is the complete tree of the shared volume (middle) and the certificates automatically loaded in each controller’s root PKI (left: controller1, right: controller2):
Do you have an idea of what the issue could be? I know HA controllers are still in beta, and I didn’t see any error like this on the Discourse forum.
Thanks
plorenz November 24, 2025, 3:49pm 2
Hi @Damien, looking for differences in config, I see that the example HA setup uses the following for identity.ca:
identity:
  ca: ./pki/ctrl1/certs/ctrl1.chain.pem
where you have
identity:
  ca: "pki/root/certs/root.cert"
Do you want to try switching that and see if it resolves the issue?
Paul
Damien November 28, 2025, 9:23am 3
Hi, I already tested the configuration you recommended. I even brute-forced almost all possibilities of certificate paths, including those in the web section, but this resulted in the same error.
Damien December 3, 2025, 10:21am 4
Hi, any idea of what the root cause could be?
The error indicates that the wrong private keys are being used for the controller’s client certs.
Looking at your PKI generation, I do not see the --key-file flag to re-use the same server private key, so the client cert for each controller creates its own private key.
You can see this in your tree output; for ctrl1:
|   |-- keys
|   |   |-- client.key
|   |   |-- ctrl1.key
|   |   `-- server.key
Then in your controller’s configuration you specify the identity as
identity:
  cert: "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key: "pki/ctrl1/keys/server.key"
  ca: "pki/root/certs/root.cert"
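You only have one key file described (key), which the controller will pair with both the server_cert and the cert. This is a convenience feature that allows you to use one private key for both the client and server certs. In your case, the client cert was issued for client.key, so pairing it with server.key fails signature verification.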
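One way to check this pairing by hand is to compare the public key embedded in a certificate with the public key derived from a private key. The following is a minimal sketch using openssl. The function name cert_matches_key is hypothetical, and it assumes the leaf certificate is the first entry in the chain file, since openssl x509 only reads the first certificate it finds.

```shell
#!/usr/bin/env bash
# Return 0 if the certificate was issued for the given private key,
# 1 if the keys differ, 2 if either file could not be read.
# The function name is hypothetical, chosen for this illustration.
cert_matches_key() {
    local cert="$1" key="$2"
    local cert_pub key_pub
    # Public key embedded in the (first) certificate of the file, in PEM form.
    cert_pub="$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null)" || return 2
    # Public key derived from the private key, in the same PEM form.
    key_pub="$(openssl pkey -in "$key" -pubout 2>/dev/null)" || return 2
    [ "$cert_pub" = "$key_pub" ]
}

# Example call against the layout from this thread:
#   cert_matches_key pki/ctrl1/certs/client.chain.pem pki/ctrl1/keys/server.key \
#       || echo "client cert was not issued for server.key"
```

With a single key field in the identity block, this check would need to succeed for both the cert and the server_cert files.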
Either add the --key-file flag when generating the client cert so that it uses the same private key as the server cert, or alter your config and ensure you copy over the client.key and update your identity configuration to include two keys:
identity:
  cert: "pki/ctrl1/certs/client.chain.pem"
  server_cert: "pki/ctrl1/certs/server.chain.pem"
  key: "pki/ctrl1/keys/client.key"
  server_key: "pki/ctrl1/keys/server.key"
  ca: "pki/root/certs/root.cert"
In situations where the private keys live next to each other on a file system (rather than in hardware like an HSM), having two keys doesn’t make anything more secure or performant. We generally deploy with one key used for both the cert and server_cert.
Additionally, if you use openssl to generate your own certificates, it is possible to create a single certificate that is suitable for both client and server usage. It does leak some client information in server connections and some server information in client connections, but if someone does that, identity blocks support it: just specify the cert and key fields. Hence the built-in flexibility of this configuration block.
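As an illustration of such a dual-use certificate, the sketch below creates one carrying both the serverAuth and clientAuth extended key usages. It is self-signed only to keep the example short, whereas a real controller certificate would be signed by that controller's intermediate CA. The function name, output file names, RSA key size and one-day validity are all assumptions chosen for the example, and -addext requires OpenSSL 1.1.1 or later.

```shell
#!/usr/bin/env bash
# Create a self-signed certificate usable as both TLS server and TLS client.
# Hypothetical helper: writes dual.key and dual.cert into out_dir.
make_dual_use_cert() {
    local out_dir="$1" common_name="$2"
    mkdir -p "$out_dir" || return 1
    # -nodes leaves the private key unencrypted, as a controller reads it unattended.
    # extendedKeyUsage lists both purposes so one certificate covers both roles.
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "${out_dir}/dual.key" \
        -out "${out_dir}/dual.cert" \
        -subj "/CN=${common_name}" \
        -days 1 \
        -addext "extendedKeyUsage=serverAuth,clientAuth"
}

# Example call:
#   make_dual_use_cert ./dual-demo ctrl1
#   openssl x509 -in ./dual-demo/dual.cert -noout -text   # inspect the extensions
```

In an identity block, files like these would go in the cert and key fields, with no separate server_cert or server_key.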
Damien December 15, 2025, 8:33am 6
Hello!
Sorry for the late reply, I was not available.
I just tried the --key-file flag that @andrew.martinez suggested, and it worked perfectly!
Here is the full PKI generation script (root CA included) that I wrote in bash:
#!/bin/bash
function wait_for_internet {
    while ! echo > /dev/tcp/1.1.1.1/80 ; do
        echo "Internet unavailable, retrying..."
        sleep 1
    done
    echo "Internet is available ✅"
}
function install_openziti_binary {
    wait_for_internet
    curl -sS https://get.openziti.io/install.bash | bash -s openziti
    if command -v ziti &> /dev/null; then
        echo "OpenZiti binary installed successfully ✅"
    else
        echo "Failed to install OpenZiti binary ❌"
        exit 1
    fi
}
function create_root_ca {
    # Create the trust root, a self-signed CA
    ziti pki create ca \
        --pki-root /pki --ca-file root --ca-name 'Cluster Root CA' \
        --trust-domain ha.test
}
function create_controllers_certs {
    # Create the controller 1 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl1 \
        --intermediate-name 'Controller One Signing Cert'
    # Create the controller 1 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl1 \
        --dns "localhost,controller1" \
        --ip "127.0.0.1,::1,controller1" \
        --server-name ctrl1 \
        --spiffe-id 'controller/ctrl1'
    # Create the controller 1 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl1 \
        --client-name ctrl1 \
        --key-file server \
        --spiffe-id 'controller/ctrl1'
    # Create the controller 2 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl2 \
        --intermediate-name 'Controller Two Signing Cert'
    # Create the controller 2 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl2 \
        --dns "localhost,controller2" \
        --ip "127.0.0.1,::1,controller2" \
        --server-name ctrl2 \
        --spiffe-id 'controller/ctrl2'
    # Create the controller 2 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl2 \
        --client-name ctrl2 \
        --key-file server \
        --spiffe-id 'controller/ctrl2'
    # Create the controller 3 intermediate/signing cert
    ziti pki create intermediate \
        --pki-root /pki \
        --ca-name root \
        --intermediate-file ctrl3 \
        --intermediate-name 'Controller Three Signing Cert'
    # Create the controller 3 server cert
    ziti pki create server \
        --pki-root /pki \
        --ca-name ctrl3 \
        --dns "localhost,controller3" \
        --ip "127.0.0.1,::1,controller3" \
        --server-name ctrl3 \
        --spiffe-id 'controller/ctrl3'
    # Create the controller 3 client cert
    ziti pki create client \
        --pki-root /pki \
        --ca-name ctrl3 \
        --client-name ctrl3 \
        --key-file server \
        --spiffe-id 'controller/ctrl3'
}
function main {
    wait_for_internet
    apt update
    apt install -y iproute2 jq tcpdump iptables curl iputils-ping wget iproute2 net-tools gnupg dnsutils
    if ! install_openziti_binary ; then
        echo "Installation of OpenZiti binary failed. Exiting."
        exit 1
    fi
    rm -rf /pki/*
    create_root_ca
    create_controllers_certs
    rm -rf /shared_pki/*
    cp -Rv /pki /shared_pki
}
main "$@"
For each create client command, I added --key-file server, and the ziti CLI automatically resolves the correct path to the server.key file based on the --ca-name given in the same command (e.g. ctrl2).
Now server.key is the private key behind both the client and the server certificate of each controller.
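Since the three controller blocks in the script differ only by their number, the same commands could also be produced by a loop. This is a sketch under assumptions rather than the script that was actually run: the run helper and its DRY_RUN variable are hypothetical additions that print each command instead of executing it, and the intermediate names use the number ("Controller 1 Signing Cert") instead of the spelled-out names. All flags and values are otherwise taken from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, execute it otherwise.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Generate intermediate, server and client certs for controllers 1 to 3.
create_controllers_certs() {
    local n ctrl host
    for n in 1 2 3; do
        ctrl="ctrl${n}"
        host="controller${n}"
        # Per-controller intermediate/signing cert, issued by the root CA
        run ziti pki create intermediate --pki-root /pki --ca-name root \
            --intermediate-file "$ctrl" \
            --intermediate-name "Controller ${n} Signing Cert" || return 1
        # Server cert; this step also creates keys/server.key
        run ziti pki create server --pki-root /pki --ca-name "$ctrl" \
            --dns "localhost,${host}" --ip "127.0.0.1,::1,${host}" \
            --server-name "$ctrl" --spiffe-id "controller/${ctrl}" || return 1
        # Client cert; --key-file server reuses the server's private key
        run ziti pki create client --pki-root /pki --ca-name "$ctrl" \
            --client-name "$ctrl" --key-file server \
            --spiffe-id "controller/${ctrl}" || return 1
    done
}

# Preview the commands without touching /pki:
#   DRY_RUN=1 create_controllers_certs
```

The server cert has to be created before the client cert for each controller, because --key-file server refers to a key that only exists after that step.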
After that, I just ran the command to add ctrl2 to ctrl1’s cluster:
root@controller1:/# ziti agent cluster add tls:controller2:6262
success, added peer at tls:controller2:6262 to cluster
Logs:
[ 25.386] INFO github.com/hashicorp/raft.(*Raft).appendConfigurationEntry: {server-addr=[tls:controller2:6262] command=[AddVoter] servers=[[[{Suffrage:Voter ID:ctrl1 Address:tls:controller1:6262} {Suffrage:Voter ID:ctrl2 Address:tls:controller2:6262}]]] server-id=[ctrl2]} updating configuration
[ 25.387] INFO github.com/hashicorp/raft.(*Raft).startStopReplication: {peer=[ctrl2]} added peer, starting replication
[ 25.387] INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262]} dialing raft peer channel
[ 25.387] INFO ziti/controller/raft/mesh.(*impl).GetOrConnectPeer: {address=[tls:controller2:6262]} establishing new raft peer channel
[ 25.388] INFO ziti/controller.(*Controller).routerDispatchCallback: {addresses=[[tls:controller1:6262 tls:controller2:6262]] index=[7]} syncing updated ctrl addresses to connected routers
[ 25.400] INFO ziti/common/metrics.ConfigureGoroutinesPoolMetrics.ConfigureGoroutinesPoolMetrics.GoroutinesPoolMetricsConfigF.func1.func2: {idleTime=[1s] minWorkers=[0] maxWorkers=[1] poolType=[command_handler] maxQueueSize=[250]} starting goroutine pool
[ 25.400] INFO ziti/controller/raft/mesh.(*impl).PeerConnected: {peerId=[ctrl2] peerAddr=[tls:controller2:6262]} peer connected
[ 25.400] INFO ziti/controller/raft/mesh.(*impl).GetOrConnectPeer: {peerId=[ctrl2] address=[tls:controller2:6262]} established new raft peer channel
[ 25.400] INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262] peerId=[ctrl2]} invoking raft connect on established peer channel
[ 25.400] INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} sending connect msg to raft peer
[ 25.401] INFO ziti/controller/model.(*ControllerManager).UpdateControllerState: acting as leader, updating controllers from peers, connectEvt? true, self: 9221f0963e32beb0040e99461efa77e36d8397da, peer count: 1, peers: 594378559ce5521355a2d36622da7008e8f1eddb
[ 25.401] INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} raft peer connected
[ 25.403] INFO github.com/hashicorp/raft.(*Raft).pipelineReplicate: {peer=[{Voter ctrl2 tls:controller2:6262}]} pipelining replication
[ 25.403] INFO ziti/controller/raft.(*BoltDbFsm).Apply: {index=[8]} apply log with type *command.CreateEntityCommand[*github.com/openziti/ziti/controller/model.Controller]
[ 25.406] INFO ziti/controller/raft.(*BoltDbFsm).Apply: {index=[9]} apply log with type *command.UpdateEntityCommand[*github.com/openziti/ziti/controller/model.Controller]
[ 25.931] INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262]} dialing raft peer channel
[ 25.931] INFO ziti/controller/raft/mesh.(*impl).Dial: {address=[tls:controller2:6262] peerId=[ctrl2]} invoking raft connect on established peer channel
[ 25.931] INFO ziti/controller/raft/mesh.(*Peer).Connect: {peerId=[ctrl2] address=[tls:controller2:6262]} sending connect msg to raft peer
[ 25.931] INFO ziti/controller/raft/mesh.(*Peer).Connect: {address=[tls:controller2:6262] peerId=[ctrl2]} raft peer connected
Worked perfectly, thanks y’all.
Have a nice day!
