Readiness probe missing SSL flags when TLS is required (v1.22.0) #2294
Bug Report
Description
In operator v1.22.0, the MongodReadinessCheck was changed from a simple TCP dial to a full MongoDB client connection with RSStatus check (readiness.go diff). However, the default readiness probe command in psmdb_defaults.go was not updated to include SSL flags, unlike the liveness probe which already has them.
This causes the readiness probe to fail when tls.mode: requireTLS is set, because the healthcheck binary tries to connect without TLS to a server that only accepts TLS connections.
Steps to Reproduce
- Deploy a PSMDB cluster with tls.mode: requireTLS and crVersion: 1.22.0
- Wait for or trigger a pod restart / rolling update
- Observe the new pod stays in 1/2 Ready state indefinitely
Expected Behavior
The readiness probe should include --ssl --sslInsecure --sslCAFile --sslPEMKeyFile flags when TLS is enabled, similar to how the liveness probe is configured.
Actual Behavior
The readiness probe command is:
/opt/percona/mongodb-healthcheck k8s readiness --component mongod
This command is missing the SSL flags. The liveness probe correctly has them:
/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200
MongoDB logs show a flood of SSLHandshakeFailed errors from 127.0.0.1 (the readiness probe):
"msg":"Error receiving request from client. Ending connection from remote",
"attr":{"error":{"code":141,"codeName":"SSLHandshakeFailed","errmsg":"The server is configured to only allow SSL connections"}}
The readiness probe times out after 2s and the pod never becomes ready.
Impact
This creates a deadlock during rolling updates:
- Pod N is updated with the new StatefulSet template (missing SSL flags on readiness probe)
- Pod N never becomes ready
- The operator's SmartUpdate waits for all pods to be ready before continuing
- The cluster is stuck in initializing state
Root Cause
In v1.21.2, MongodReadinessCheck only performed a raw TCP dial, so TLS was irrelevant. In v1.22.0, it was changed to use db.Dial() + RSStatus, which requires TLS when the server mandates it. But the probe command in psmdb_defaults.go was not updated to pass SSL flags for the readiness probe.
Suggested Fix
In pkg/apis/psmdb/v1/psmdb_defaults.go, add SSL flags to the readiness probe when TLS is enabled, similar to how the liveness probe handles it:
if replset.ReadinessProbe.TCPSocket == nil && replset.ReadinessProbe.Exec == nil {
	replset.ReadinessProbe.Exec = &corev1.ExecAction{
		Command: []string{
			"/opt/percona/mongodb-healthcheck",
			"k8s", "readiness",
			"--component", "mongod",
		},
	}
	// Add SSL flags when TLS is enabled
	if cr.TLSEnabled() {
		replset.ReadinessProbe.Exec.Command = append(replset.ReadinessProbe.Exec.Command,
			"--ssl", "--sslInsecure",
			"--sslCAFile", "/etc/mongodb-ssl/ca.crt",
			"--sslPEMKeyFile", "/tmp/tls.pem")
	}
}
Environment
- Operator version: 1.22.0
- MongoDB version: 8.0.19-7
- Kubernetes: Talos Linux
- PSMDB CR: tls.mode: requireTLS, crVersion: 1.22.0