CNTRLPLANE-3363: Add KMS plugin health reporter design by ibihim · Pull Request #2005 · openshift/enhancements

ibihim · 2026-05-08T17:56:26Z

What

A health reporter sidecar runs alongside every API server pod replica when KMS is enabled. It probes the colocated KMS plugin(s) and writes a single advisory KMSHealthReporter_<nodeName> condition per node on the apiserver operator CR.

Why

Exposes plugin health state through the operator CRs and onward into the ClusterOperator's Degraded condition, so a misbehaving KMS plugin is visible in oc get co rather than silently waiting until KAS encryption fails.

Supports future key rotation: per-plugin keyID in the reporter's Message lets a rotation controller verify all nodes agree on the active key before initiating rotation.

openshift-ci-robot · 2026-05-08T17:56:50Z

@ibihim: This pull request references CNTRLPLANE-3363 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

What

A health reporter sidecar runs alongside every API server pod replica when KMS is enabled. It probes the colocated KMS plugin(s) and writes a single advisory KMSHealthReporter_<nodeName> condition per node on the apiserver operator CR. A separate aggregator controller reads those conditions and emits a single KMSPluginsDegraded rollup; library-go's StatusSyncer propagates the _Degraded suffix into the ClusterOperator's Degraded condition. The Message field on each per-node condition is structured input for the aggregator: one key=value line per probed plugin, carrying keyID, status, lastChecked, and an optional trailing detail.


- Per-node health reporter sidecar publishes one advisory KMSHealthReporter_<nodeName> condition on the apiserver operator CR.
- Aggregator controller reads those conditions and emits a single KMSPluginsDegraded rollup; library-go's StatusSyncer routes the _Degraded suffix into the ClusterOperator's Degraded condition.
- Message format: one key=value line per probed plugin (keyID, status, lastChecked, optional trailing detail).
- Risks: stale reporter conditions, orphaned conditions on KMS disable, cold-start window.

ardaguclu · 2026-05-11T09:54:05Z


+#### Health Reporter Sidecar
+
+When KMS encryption is enabled, a health reporter sidecar runs alongside every API server pod replica. The sidecar probes the colocated KMS plugin(s) and publishes the outcome to the owning operator's CR as a per-node condition. A separate aggregator controller picks up these conditions and emits a single `KMSPluginsDegraded` rollup, which propagates to the `ClusterOperator`'s `Degraded` condition.


I was thinking there would be one health reporter sidecar per KMS plugin (i.e. health reporter A for KMS plugin A, health reporter B for KMS plugin B, etc.), not a single health reporter for all KMS plugins. Is that correct?

The plugin lifecycle is supposed to inject the KMS plugins. Does that mean it will additionally inject one health monitor sidecar?

Yeah, it's a bit contradictory with the sections below. I was also under the assumption that we had a 1:1 monitor:plugin relation.

What

One reporter sidecar per API server pod, not one per plugin. It probes every colocated KMS socket and emits a single per-node condition whose Message carries one line per plugin (see Message format below).

Why

One monitor per node keeps the condition cardinality constant. One monitor per plugin would not (because of SSA / field ownership).

I'm assuming the injector can run one reporter per pod with all plugin sockets exposed to it, by passing socket paths as flags at startup.

If the above turns out to be harder than 1:1 injection in practice, I will change it.

ardaguclu · 2026-05-11T09:55:15Z

+- One per `openshift-oauth-apiserver` Deployment replica
+- One per `openshift-apiserver` Deployment replica
+
+During KMS-to-KMS migration, the same sidecar probes every active KMS plugin in its pod (see [Multiple Concurrent Sidecars](#multiple-concurrent-sidecars)) and reports their combined state in the Message field of its single per-node condition.


by checking the current EncryptionConfiguration?

The health reporter is given the UDS paths as a flag argument. Technically it could also scan for mounted UDS sockets.
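A minimal sketch of how the reporter could accept its socket paths at startup. The flag name `--kms-socket` and the helper `parseSockets` are hypothetical; the thread only says that the UDS paths are passed as flag arguments. The example paths follow the `kms-{keyID}.sock` scheme quoted later in this review.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// socketList collects every occurrence of a repeatable flag.
type socketList []string

func (s *socketList) String() string { return strings.Join(*s, ",") }

func (s *socketList) Set(v string) error {
	*s = append(*s, v)
	return nil
}

// parseSockets returns the socket paths in the order they were given.
// The flag name "kms-socket" is an assumption made for this sketch.
func parseSockets(args []string) ([]string, error) {
	fs := flag.NewFlagSet("kms-health-reporter", flag.ContinueOnError)
	var sockets socketList
	fs.Var(&sockets, "kms-socket", "UDS path of a colocated KMS plugin (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if len(sockets) == 0 {
		return nil, fmt.Errorf("at least one --kms-socket is required")
	}
	return sockets, nil
}

func main() {
	// Example arguments as the injector might pass them during a KMS-to-KMS migration.
	sockets, err := parseSockets([]string{
		"--kms-socket=/var/run/kmsplugin/kms-1.sock",
		"--kms-socket=/var/run/kmsplugin/kms-2.sock",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range sockets {
		fmt.Println("would probe", s)
	}
}
```

A repeatable flag keeps the one-reporter-per-pod model independent of how many plugins are active: the injector only changes the argument list.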

ardaguclu · 2026-05-11T09:57:09Z

+                                    │  SSA (per-node fieldManager)
+                                    ▼
+operator CR (kubeapiservers.operator.openshift.io/cluster):
+  ├─ KMSHealthReporter_<nodeName>    ◄─ written by each per-pod reporter


Can you give an example about how kms-2.sock and kms-1.sock will be represented in operator CR?

I saw example below.

ardaguclu · 2026-05-11T09:58:06Z

+|---|---|---|
+| All plugins healthy | True | `AsExpected` |
+| Any plugin returns not-ok | False | `Unhealthy` |
+| Any plugin RPC error, no explicit not-ok | Unknown | `Unreachable` |


Where will the details of the RPC error be visible?

Every call to the KMS plugin can create an error as per interface:

Status(ctx context.Context) (*StatusResponse, error) // error = RPC error.

How can a cluster admin see the RPC error details from the operator conditions reported by the health monitor? A cluster admin cannot send a request to the Status endpoint of the KMS plugin.

The cluster admin will see the aggregated status created by the condition controller.

Is this a CTA, to add this notion to the EP?

Added a paragraph for this: 771b367
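The status mapping table quoted at the top of this thread can be sketched as a small function. The type and function names are hypothetical. The reason strings are taken from the table as quoted; a later commit in this PR renames the unreachable probe status to error. Treating an empty plugin list as healthy is an assumption of this sketch, not something the thread decides.

```go
package main

import "fmt"

// probeState is the outcome of one Status RPC against one plugin.
type probeState int

const (
	stateHealthy   probeState = iota // plugin answered and reported ok
	stateUnhealthy                   // plugin answered and reported not-ok
	stateRPCError                    // the Status call itself returned an error
)

// nodeCondition maps the per-plugin states of one pod to the (status, reason)
// of its single per-node condition. An explicit not-ok wins over an RPC error.
func nodeCondition(states []probeState) (status, reason string) {
	sawError := false
	for _, s := range states {
		switch s {
		case stateUnhealthy:
			return "False", "Unhealthy"
		case stateRPCError:
			sawError = true
		}
	}
	if sawError {
		return "Unknown", "Unreachable"
	}
	// Assumption: no plugins probed is reported the same as all healthy.
	return "True", "AsExpected"
}

func main() {
	status, reason := nodeCondition([]probeState{stateHealthy, stateRPCError})
	fmt.Println(status, reason)
}
```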

ardaguclu · 2026-05-11T09:58:53Z

+The Message is the structured input the aggregator controller consumes. One line per probed plugin:
+
+```
+keyID=<id> status=healthy lastChecked=<RFC 3339 timestamp>


How can this format be parsed by other controllers (e.g. the rotation controller)?

I tried to replicate

    message: |-
      Available: The registry is ready
      NodeCADaemonAvailable: The daemon set node-ca has available replicas
      ImagePrunerAvailable: Pruner CronJob has been created

If you prefer, I can dump json encoded data into it.

Agreed to use minified JSON
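For illustration, a sketch of what a minified JSON Message could look like and how another controller could decode it. The JSON field names mirror the key=value fields from the proposal (`keyID`, `status`, `lastChecked`, `detail`); the struct and function names are hypothetical, and the exact schema is an assumption, since the thread only agrees on "minified JSON".

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// pluginReport is one entry per probed plugin (hypothetical schema).
type pluginReport struct {
	KeyID       string    `json:"keyID"`
	Status      string    `json:"status"`
	LastChecked time.Time `json:"lastChecked"` // encoded as RFC 3339
	Detail      string    `json:"detail,omitempty"`
}

// encodeMessage renders all reports of one node as a single minified JSON array.
func encodeMessage(reports []pluginReport) (string, error) {
	b, err := json.Marshal(reports) // json.Marshal emits no extra whitespace
	return string(b), err
}

// decodeMessage is what a consumer such as the aggregator would call.
func decodeMessage(msg string) ([]pluginReport, error) {
	var reports []pluginReport
	err := json.Unmarshal([]byte(msg), &reports)
	return reports, err
}

func main() {
	checked := time.Date(2026, 5, 8, 12, 34, 56, 0, time.UTC)
	msg, err := encodeMessage([]pluginReport{
		{KeyID: "2", Status: "healthy", LastChecked: checked},
		{KeyID: "1", Status: "unhealthy", LastChecked: checked, Detail: "connection refused"},
	})
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(msg)

	reports, err := decodeMessage(msg)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	for _, r := range reports {
		fmt.Println(r.KeyID, r.Status, r.Detail)
	}
}
```

Compared with the key=value lines, JSON removes the need for the "detail is always last" rule, because spaces and long error strings are escaped by the encoder.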

ardaguclu · 2026-05-11T09:59:50Z

+      status: "True"
+      reason: AsExpected
+      message: |
+        keyID=2 status=healthy lastChecked=2026-05-08T12:34:56Z


This data should probably be in a decodable format instead of a one-liner, so that it is usable.

Agreed to use minified JSON

ardaguclu · 2026-05-11T10:00:44Z

+
+##### SSA mechanism
+
+Two classes of writers share the conditions array: the operator (writing its own conditions like `NodeControllerDegraded` plus the aggregator's `KMSPluginsDegraded` rollup) and N reporter sidecars (each writing its `KMSHealthReporter_<nodeName>`). Per-entry ownership is enabled by `+listType=map +listMapKey=type` on `OperatorStatus.Conditions`. Each reporter uses a per-node fieldManager (`kms-health-reporter-<nodeName>`); all writers apply with `Force: true`.


TBH, I didn't understand anything from this paragraph :). Can we update it to clarify it better?

I tried to explain the SSA dynamic. I can drop the section completely.

operator.openshift.io/v1.KubeAPIServer has a status to which the health reporter would write its conditions. So there is a potential conflict with the operator's own writes. It can be resolved by SSA with +listType=map, but this is something we should be cautious about.
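To make the SSA dynamic concrete, here is a sketch of the request one reporter might send. It builds only the apply body and the request path; it does not talk to a cluster. The per-node fieldManager name and `Force: true` come from the quoted paragraph. Writing through the `status` subresource and the helper names are assumptions of this sketch. Server-Side Apply requests are sent as a PATCH with content type `application/apply-patch+yaml` (JSON is valid YAML).

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// applyConfig lists only the fields this writer wants to own.
type applyConfig struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Status struct {
		Conditions []condition `json:"conditions"`
	} `json:"status"`
}

// buildApply returns the PATCH path and body for one node's reporter.
// Because conditions is a map-list keyed by type, the body names exactly one
// entry and leaves every other writer's entries untouched.
func buildApply(nodeName, status, reason, message string) (path string, body []byte, err error) {
	var cfg applyConfig
	cfg.APIVersion = "operator.openshift.io/v1"
	cfg.Kind = "KubeAPIServer"
	cfg.Metadata.Name = "cluster"
	cfg.Status.Conditions = []condition{{
		Type:    "KMSHealthReporter_" + nodeName,
		Status:  status,
		Reason:  reason,
		Message: message,
	}}
	body, err = json.Marshal(cfg)
	if err != nil {
		return "", nil, err
	}
	q := url.Values{}
	q.Set("fieldManager", "kms-health-reporter-"+nodeName) // per-node manager
	q.Set("force", "true")
	// Assumption: the reporter writes through the status subresource.
	path = "/apis/operator.openshift.io/v1/kubeapiservers/cluster/status?" + q.Encode()
	return path, body, nil
}

func main() {
	path, body, err := buildApply("master-0", "True", "AsExpected", "[]")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("PATCH", path)
	fmt.Println(string(body))
}
```

Each reporter's body contains a different `type` key, so the managed-fields entries of two reporters never overlap; a conflict can only arise if another writer applies the same `KMSHealthReporter_<nodeName>` entry.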

ardaguclu · 2026-05-11T10:02:35Z

+
+##### Naming convention
+
+Each reporter sidecar writes one condition per pod replica to the owning operator's CR (`kubeapiservers.operator.openshift.io/cluster`, etc.), keyed by the node:


Could you please mention who is going to inject this sidecar?

https://github.com/ibihim/enhancements/blob/2bb10c537d05893622c0340a2b2322d6d0765abd/enhancements/kube-apiserver/kms-encryption-foundations.md#health-reporter-sidecar now mentions the pluginlifecycle package.

ardaguclu · 2026-05-11T10:04:39Z

+
+##### Destination
+
+The sidecar writes one advisory condition per pod replica to the owning operator's `*.operator.openshift.io/cluster` CR via Server-Side Apply. The aggregator controller reads these advisory conditions and emits the `KMSPluginsDegraded` rollup. See [KMS Plugin Health Conditions](#kms-plugin-health-conditions) for the exact naming, status mapping, and rollup behavior.


Is this "aggregator controller" referring to existing controller or will be implemented?

Honestly, I was not sure. I had hoped to let the conditionController handle it.

Will there be a change in the condition controller? That would mean a change in the encryption controllers. I think we should describe which changes are expected.

As we discussed, we will see if it works to put it into the condition controller and if not, we will create a new one.

p0lyn0mial · 2026-05-11T10:02:40Z

+
+- `State`: healthy / unhealthy / RPC error
+- `KeyID`: keyID currently active in the probed plugin
+- `Timestamp`: time of this check (last-check time; not the condition's `LastTransitionTime`, which records state flips only)


LastTransitionTime, which records state flips only

What does it mean? Does changing the msg count as changing the state?

Timestamp is the time at which the response from the KMS plugin arrived at the health reporter.
It is explicitly not an LTT, based on the suggestion you made earlier.

This means you can derive staleness from now > timestamp + probe interval + skew
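The inequality above can be written down directly. In this sketch the 30s probe interval is the value settled on later in this thread, while the 5s skew allowance and the function names are hypothetical. The final design derives the aggregator threshold as 4 × probe interval instead; swapping the constant expression is the only change needed.

```go
package main

import (
	"fmt"
	"time"
)

const (
	probeInterval = 30 * time.Second // default from this review
	clockSkew     = 5 * time.Second  // hypothetical allowance between nodes
)

// isStale implements: now > timestamp + probe interval + skew.
func isStale(lastChecked, now time.Time) bool {
	return now.After(lastChecked.Add(probeInterval + clockSkew))
}

// staleAt is a convenience wrapper for RFC 3339 strings as they appear in the Message.
func staleAt(lastCheckedRFC3339, nowRFC3339 string) (bool, error) {
	lastChecked, err := time.Parse(time.RFC3339, lastCheckedRFC3339)
	if err != nil {
		return false, fmt.Errorf("parsing lastChecked: %w", err)
	}
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return false, fmt.Errorf("parsing now: %w", err)
	}
	return isStale(lastChecked, now), nil
}

func main() {
	for _, now := range []string{"2026-05-08T12:35:10Z", "2026-05-08T12:35:32Z"} {
		stale, err := staleAt("2026-05-08T12:34:56Z", now)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("now=%s stale=%v\n", now, stale)
	}
}
```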

p0lyn0mial · 2026-05-11T10:17:49Z


 During KMS-to-KMS migration, the encryption-configuration secret contains provider configs for all active keys. The operator creates a separate sidecar for each key, listening on its own unix domain socket (e.g., `kms-1.sock`, `kms-2.sock`).

+#### Health Reporter Sidecar


I think that in general we miss:

1. Specifying the probe interval
2. Specifying the creds used to publish the condition
3. Explaining what happens with orphaned conditions
4. A plugin identity, when multiple plugins are specified - otherwise we cannot map status to a specific plugin.
5. Listing the alternatives we discussed
6. Specifying how the reporter connects to the apiserver. Perhaps for KAS we could go via localhost

Yes, this is worthy to elaborate to balance detection and hammering KAS with SSA request.

I assumed we would have a SA with RBAC.

https://github.com/openshift/enhancements/pull/2005/changes#diff-c0e8034f61b2841cee9b2b1d07403961148c3a30f096b0d35720a57ead8d3697R660

In the status of the condition each health report has its own line. I assumed that the keyID would make it unique. Does it not?

Ok.

See 2.

Set to 30s.

legacy SA token with RBAC for now.

Node + keyID

oh damn. I will add that.

service is slightly better, but localhost is worth mentioning.

Added a section in alternatives to cover the options discussed: b411a09

p0lyn0mial · 2026-05-11T10:20:51Z

+
+Each probe yields a HealthStatus carrying:
+
+- `State`: healthy / unhealthy / RPC error


State or Status?

p0lyn0mial · 2026-05-11T10:21:39Z

+Each probe yields a HealthStatus carrying:
+
+- `State`: healthy / unhealthy / RPC error
+- `KeyID`: keyID currently active in the probed plugin


I would drop it for now. We don't know how rotation will work yet.

/cc @tjungblu

no we should include this :) at least that makes my life easier to reference this mechanism in the other enhancement.

@tjungblu work on the key rotation was the motivation to include it.

We talked about the possibility of using the mechanism produced by the reporter to publish information about the keys. But we still don’t know the detail, so for now I’d prefer to remove the information about the keyID until then.

I also spoke with @benluddy yesterday, and he would like to add an API instead of using conditions.

We will probably also change the health reporting later, but he agreed that initially we can use conditions for health reporting.

I think we agreed, to show keyID and kekID for now as minified JSON in the message. Correct me if I am wrong.

p0lyn0mial · 2026-05-11T10:25:04Z

+        keyID=1 status=healthy lastChecked=2026-05-08T12:34:56Z
+    - type: KMSHealthReporter_master-1
+      status: "False"
+      reason: Unhealthy


Not sure if we should change this field. The msg could be interpreted by the controller.

OK, this is a good point that would need investigation.

p0lyn0mial · 2026-05-11T10:25:12Z

+        keyID=2 status=healthy lastChecked=2026-05-08T12:34:56Z
+        keyID=1 status=healthy lastChecked=2026-05-08T12:34:56Z
+    - type: KMSHealthReporter_master-1
+      status: "False"


Not sure if we should change this field.

p0lyn0mial · 2026-05-11T10:27:02Z

+keyID=<id> status=unhealthy lastChecked=<RFC 3339 timestamp> detail=<healthz.detail>
+```
+
+Each line is a sequence of `key=value` pairs separated by spaces. `detail=...` is the only field whose value may contain spaces; it is always last on its line. `lastChecked` is per-plugin so partial probe failures (one plugin stuck while others probe normally) are visible.


is the only field whose value may contain spaces; it is always last on its line

Does it mean we need to escape spaces in the detail field?

not directly, but as I mentioned to @ardaguclu, I tried to replicate the existing format. I could use JSON if this is preferred.

ardaguclu · 2026-05-11T10:31:37Z

+
+##### Probe contract
+
+Each sidecar probes its colocated KMS plugin(s) over the local UDS at `unix:///var/run/kmsplugin/kms-{keyID}.sock` (the same socket path scheme described in [Sidecar Injection](#sidecar-injection)).


However, this states each sidecar probes its colocated KMS Plugins

... and is it the keyID though? I would assume we would allocate a unique socket path on each node like:

/var/run/kmsplugin/kms-kube-apiserver/{encryptionconfig.Name (?) }/plugin.sock

not sure whether we need to separate via folders here though

there's also no sidecar-injection section either 😄

@ardaguclu, is this a question? Each sidecar probes its colocated KMS Plugins. Yes, this is correct.

I added a "Naming caveat" section.

nodeName + keyID should be unique.

There is a sidecar injection section: https://github.com/ibihim/enhancements/blob/2bb10c537d05893622c0340a2b2322d6d0765abd/enhancements/kube-apiserver/kms-encryption-foundations.md#sidecar-injection

ardaguclu · 2026-05-11T10:32:43Z

+KMSHealthReporter_<nodeName>
+```
+
+The Type has no `_Available` or `_Degraded` suffix, so it stays advisory on the operator CR (library-go's `StatusSyncer` ignores it). The aggregator controller consumes these conditions and emits the `KMSPluginsDegraded` rollup separately (see [Aggregator behavior](#aggregator-behavior)).


Why are we trying to skip StatusSyncer?

Based on a discussion with @p0lyn0mial, a dedicated controller would create a condition that will be read by the StatusSyncer. This will reduce the amount of conditions.

So the "aggregator controller" will create a condition that will be read by the StatusSyncer.

tjungblu · 2026-05-11T11:05:08Z

+- One per `openshift-oauth-apiserver` Deployment replica
+- One per `openshift-apiserver` Deployment replica
+
+During KMS-to-KMS migration, the same sidecar probes every active KMS plugin in its pod (see [Multiple Concurrent Sidecars](#multiple-concurrent-sidecars)) and reports their combined state in the Message field of its single per-node condition.


heh there's no multiple-concurrent-sidecars section :)

There is?

https://github.com/ibihim/enhancements/blob/2bb10c537d05893622c0340a2b2322d6d0765abd/enhancements/kube-apiserver/kms-encryption-foundations.md#multiple-concurrent-sidecars

hah, that was two weeks ago. I think you just forgot to push a few commits back then.

lol. It isn't part of my PR. It is outside of my PR (part of master already). Maybe this is why you missed it, it isn't visible if you look solely on the changes that I did.

Interesting but if this is not part of the PR (not addition), shouldn't we see a removal section in the left hand side? :)

Note to myself, never ever reference anything outside of a PR.

tjungblu · 2026-05-11T11:10:48Z

+
+Each probe yields a HealthStatus carrying:
+
+- `State`: healthy / unhealthy / RPC error


what's the difference between RPC error and unhealthy?

The KMS Plugin can report healthy / unhealthy. An error would indicate that something is wrong with the connection itself.

ardaguclu · 2026-05-12T11:03:09Z

+      reason: Unreachable
+      message: |
+        keyID=2 status=unreachable lastChecked=2026-05-08T12:34:56Z detail=connection refused
+        keyID=1 status=unreachable lastChecked=2026-05-08T12:34:56Z detail=connection refused


Errors are usually very long (not short like connection refused). Would this formatting still work for very long error traces?

Reconcile terminology and arguments across the Health Reporter Sidecar and KMS Plugin Health Conditions sections:

- Add a naming caveat distinguishing the socket-path keyID (encryption key secret id) from the plugin-reported kekID.
- Collapse the duplicate per-tick struct into a single PluginHealthCondition definition; rename KMSPluginID to KeyID.
- Drop the stale LastTransitionTime contrast (Status is now hardcoded).
- Strengthen the connection rationale with the HA-routing argument and a Single-Node OpenShift caveat.
- Acknowledge the legacy SA token lifetime tradeoff.

Address review nits on the health reporter design:

- Drop the deep link to AddKMSPluginSidecarToPodSpec; keep the pluginlifecycle package reference (function names and master URLs rot).
- Describe the socket-path keyID as a monotonically incrementing sequence number instead of a "revision number", which collides with the kas-o RevisionController concept.
- Remove the "advisory" label from the per-node condition: it is misleading since the aggregator does act on it. Describe the StatusSyncer mechanic directly instead.
- Move Auth and connection out of the Proposal into a new KMS Health Reporter Connectivity section under Implementation Details.
- Rename the unreachable probe status to error, since a failed Status RPC can fail for reasons beyond unreachability.
- Replace the GCP-style example kekIDs with short opaque values, decorrelated from keyID to reinforce the naming caveat.

Resolve the open probe-interval question raised in review:

- Add a Probe interval subsection under KMS Plugin Health Conditions: fixed 30s default passed as a sidecar flag, n UDS probes per cycle followed by a single SSA write of all results.
- Define emission as unconditional and best-effort: the reporter writes every tick so lastChecked keeps advancing, and discards a result it cannot deliver rather than queuing it, since the next interval produces a fresher one.
- Derive the aggregator staleness threshold as 4 x probe interval and reference it from the Stale Reporter Conditions risk.
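The probe cycle described in this commit message can be sketched as follows: probe every socket, attempt exactly one write with all results, and drop the results if the write fails. All names are hypothetical, and the `probe` and `deliver` callbacks stand in for the Status RPC and the SSA write. The demo uses a 10ms ticker only to finish quickly; the design's default is 30s.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errAPIServerUnavailable simulates a failed write in the demo below.
var errAPIServerUnavailable = errors.New("apiserver unavailable")

type result struct {
	Socket string
	Status string
}

// runCycle performs n UDS probes followed by a single delivery attempt.
// It reports whether the write succeeded; on failure the results are simply
// discarded, because the next tick produces fresher ones.
func runCycle(sockets []string, probe func(socket string) string, deliver func([]result) error) bool {
	results := make([]result, 0, len(sockets))
	for _, s := range sockets {
		results = append(results, result{Socket: s, Status: probe(s)})
	}
	return deliver(results) == nil
}

func main() {
	sockets := []string{"/var/run/kmsplugin/kms-1.sock", "/var/run/kmsplugin/kms-2.sock"}
	probe := func(string) string { return "healthy" } // stand-in for the Status RPC

	tick := 0
	deliver := func(r []result) error {
		tick++
		if tick == 2 {
			return errAPIServerUnavailable // simulate one failed SSA write
		}
		fmt.Printf("tick %d: wrote %d results\n", tick, len(r))
		return nil
	}

	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		// Emission is unconditional: write every tick, even if nothing changed.
		if !runCycle(sockets, probe, deliver) {
			fmt.Println("write failed; result dropped, not queued")
		}
	}
}
```

Not queuing failed writes keeps the reporter stateless between ticks, and the aggregator's staleness threshold covers the case where several consecutive writes are lost.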

ardaguclu · 2026-05-21T06:27:49Z


+#### KMS Health Reporter Connectivity
+
+**Auth**: the reporter uses a **legacy ServiceAccount token** (mounted from a Secret) bound to a minimal Role that only permits applying its single per-node condition entry on the operator CR. The projected SA tokens available in API-server-adjacent namespaces are admin-grade, as are the auth client certificates on disk; both would over-privilege a sidecar whose only job is one SSA apply. The legacy SA token keeps the blast radius minimal if the sidecar is compromised. The tradeoff is lifetime: a legacy token does not expire, whereas a projected token rotates. We accept this, since a token scoped to one verb on one resource is a far smaller prize than an admin-grade token, expiring or not.


I agree with the justification here.

yep, same here

Having more long-lived, non-rotating tokens seems like a not great direction.

Does a bound SA token add a new dependency on the aggregated APIs like openshift-apiserver?

For KAS, maybe it would be better to use a client cert, which would be rotated (and can be recovered)?

yolo admin tokens.

I put a long term solution into the backlog.

ardaguclu · 2026-05-21T06:32:27Z

+
+**Risk: Orphaned Conditions on Mode Switch**
+- **Impact:** When KMS is disabled (e.g., switching to `aescbc`), reporter sidecars are removed. Without explicit cleanup, `KMSHealthReporter_<nodeName>` and `KMSPluginsDegraded` entries remain stale on the operator CR.
+- **Mitigation:** The aggregator controller owns cleanup. It removes orphaned `KMSHealthReporter_<nodeName>` entries (when their owning sidecar is no longer present) and removes its own `KMSPluginsDegraded` entry on KMS disable.


In SSA, will there be a fight between the health reporter that tries to update the condition and the aggregator controller that tries to clean it up?

I don't think it's going to be a big problem when they stomp on each other, we might have some flapping during non-ready/deleted states though.

I wonder if we should rather have the top-level KMSHealthReporter_<nodeName> condition owned by the aggregator controller.

Kinda similar to how the node controller works:
https://github.com/openshift/library-go/blob/3a6f949c22c3ffc0a2b28ee50e78173086e7adae/pkg/operator/staticpod/controller/node/node_controller.go

the sidecar would only ever write to it when the condition for said node exists. Sounds like too much coordination for a few conflicts between node state transitions...

for the later API, we could just use a map and key it off the nodename directly https://github.com/openshift/api/pull/2850/changes#diff-616d67895c3421c2d091662d30ed47b4b6f0b57db9411e29618d22b964ddb9efR362

You know those conditions better than I do. In SSA, it is always better for a single controller to fully own a field.

The *Degraded, *Available and so on are owned by the aggregator controller.
The aggregator controller itself would remove the KMSHealthReporter_ conditions that are owned by the KMS health reporters, not write into them. So there is some flapping, but it should be short-lived and not a problem, because the removal happens during garbage collection.

We could tear down the health reporter before we tear down the KMS plugin, so we don't get wrong error conditions.

We can't remove it via SSA. We would need to patch it away.

@p0lyn0mial

ardaguclu · 2026-05-21T06:34:33Z

I have just one question. Other than that changes look good to me.

tjungblu · 2026-05-21T06:44:26Z

+- **Impact:** A reporter that hangs leaves its last `KMSHealthReporter_<nodeName>` condition in etcd unchanged.
+- **Mitigation:** Per-plugin `lastChecked` timestamps in Message expose staleness. The aggregator controller treats a condition whose `lastChecked` exceeds the freshness threshold (`4 × probe interval`; see [Probe interval](#probe-interval)) as effectively `Unknown`.
+
+**Risk: Orphaned Conditions on Mode Switch**


Suggested change

-**Risk: Orphaned Conditions on Mode Switch**
+**Risk: Orphaned Conditions on encryption type change and node replacement**

Just wanted to ask who is cleaning up the old orphaned nodes when a node goes away, seems it is just buried in the Mitigation here :) Let's make the title clearer.

Encryption type change is more accurate, but "node replacement", is this something that is a major concern? I mean we can add it, it is correct.

we can scale the control plane up and down anytime, so yes, this is kinda important

I will change it. Can't apply your commit as it contains the previous line as well.


ardaguclu · 2026-06-01T08:04:58Z

Proposed mechanism here makes sense to me. We can update it later on if there are changes in the actual implementation
LGTM

tjungblu · 2026-06-01T14:10:14Z

/approve

openshift-ci · 2026-06-01T14:10:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tjungblu


p0lyn0mial · 2026-06-02T12:31:49Z


+#### KMS Health Reporter Connectivity
+
+**Auth**: the reporter uses a **legacy ServiceAccount token** (mounted from a Secret) bound to a minimal Role that only permits applying its single per-node condition entry on the operator CR. The projected SA tokens available in API-server-adjacent namespaces are admin-grade, as are the auth client certificates on disk; both would over-privilege a sidecar whose only job is one SSA apply. The legacy SA token keeps the blast radius minimal if the sidecar is compromised. The tradeoff is lifetime: a legacy token does not expire, whereas a projected token rotates. We accept this, since a token scoped to one verb on one resource is a far smaller prize than an admin-grade token, expiring or not.


we also need to describe how the reporter will validate the server's certificate.

p0lyn0mial · 2026-06-02T12:31:52Z

+
+**Auth**: the reporter uses a **legacy ServiceAccount token** (mounted from a Secret) bound to a minimal Role that only permits applying its single per-node condition entry on the operator CR. The projected SA tokens available in API-server-adjacent namespaces are admin-grade, as are the auth client certificates on disk; both would over-privilege a sidecar whose only job is one SSA apply. The legacy SA token keeps the blast radius minimal if the sidecar is compromised. The tradeoff is lifetime: a legacy token does not expire, whereas a projected token rotates. We accept this, since a token scoped to one verb on one resource is a far smaller prize than an admin-grade token, expiring or not.
+
+**Connection**: all reporters reach the kube-apiserver through the in-cluster Service `kubernetes.default.svc`.


kubernetes.default.svc simply won't resolve for kas. The default dnsPolicy is ClusterFirst

p0lyn0mial · 2026-06-02T12:32:17Z

+
+In an HA control plane this survives KMS failure: if one node's KMS plugin breaks, that node's KAS degrades, but the Service still has healthy endpoints on the other nodes, so the affected node's reporter can still deliver its condition. The Service approach only fully breaks when every KMS plugin is down, and that is a cluster-down event already surfaced by far louder signals (`ClusterOperator`, etcd, kubelet probes) than a missing reporter condition. On Single-Node OpenShift the Service has a single endpoint, so a broken local KMS plugin does leave the reporter unable to write; this is acceptable for the same reason: the cluster is already hard-down and the condition would be redundant.
+
+Dialing `127.0.0.1:6443` directly (the kube-apiserver static pod uses `hostNetwork: true`) was considered and rejected. It would bridge the post-start window where the local KAS accepts TLS connections but is still absent from `kubernetes.default` `Endpoints` (the Service reconciler self-gates on `/readyz`; see [`kubernetesservice/controller.go`](https://github.com/kubernetes/kubernetes/blob/master/pkg/controlplane/controller/kubernetesservice/controller.go)). But reporting KMS plugin health is not on KAS's critical startup path, and a not-ready KAS is already surfaced with higher signal-to-noise by `ClusterOperator`, kubelet probes, and KAS's own readiness machinery.


see my previous comment.

Rework the KMS Health Reporter Connectivity section to resolve two review points.

Connection: the kube-apiserver reporter no longer targets kubernetes.default.svc, which does not resolve from the hostNetwork static pod (ClusterFirst dnsPolicy inherits the node resolv.conf, not cluster DNS). It now dials the localhost:6443 loopback the static-pod kubeconfigs already use. Document why a KMS outage does not block the write: KMS health gates /readyz and /healthz but not /livez, so the local kube-apiserver is dropped from the Service load balancer but never restarted, and the operator CR is not in the encrypted set, so the write uses the identity transformer and never calls KMS.

Credentials: split connectivity by pod type instead of proposing a new token. Short term the reporter reuses the admin-grade identity each pod already mounts (cert-syncer's localhost-recovery token on the static pod, the ServiceAccount token on the aggregated servers). Long term it moves to a dedicated rotated identity: a client cert minted by the operator's certrotation on the static pod (as check-endpoints already gets), and a scoped ServiceAccount token via TokenRequest on the aggregated servers.

openshift-ci · 2026-06-04T10:34:24Z

@ibihim: all tests passed!


ardaguclu · 2026-06-08T07:26:44Z

/lgtm
/hold
in case @p0lyn0mial has any more comments

p0lyn0mial · 2026-06-08T07:40:15Z

This is a nicely written proposal.

Thanks for putting it together, and thanks for all the reviews!

/lgtm
/hold cancel

This PR adds a structured API for the kms health reports, which are currently stored as json in the operator conditions. Corresponding enhancement in openshift/enhancements#2005 Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 8, 2026

openshift-ci Bot requested review from JoelSpeed and derekwaynecarr May 8, 2026 17:57

ibihim force-pushed the 2026-05-07_kmsv2tpv2-health-monitor branch from bb85f9a to b719627 Compare May 8, 2026 18:22


openshift-ci Bot requested a review from tjungblu May 11, 2026 10:27


ibihim added 3 commits May 20, 2026 14:16


ibihim added 3 commits May 26, 2026 11:12

kms: mark rollup conditions as the admin-facing KMS health signal

771b367

kms: broaden orphaned-conditions risk to cover node replacement

022627c

kms: record rejected alternatives for exposing plugin Status outside the pod

b411a09

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2026


openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 8, 2026

openshift-ci Bot assigned ardaguclu Jun 8, 2026

openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 8, 2026

openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 8, 2026

openshift-merge-bot Bot merged commit 0e8f0db into openshift:master Jun 8, 2026
2 checks passed

openshift-ci Bot assigned p0lyn0mial Jun 8, 2026

ibihim deleted the 2026-05-07_kmsv2tpv2-health-monitor branch June 8, 2026 08:37

tjungblu mentioned this pull request Jun 9, 2026

CNTRLPLANE-3513: add kms health reports openshift/api#2881

Merged


		#### Health Reporter Sidecar

		When KMS encryption is enabled, a health reporter sidecar runs alongside every API server pod replica. The sidecar probes the colocated KMS plugin(s) and publishes the outcome to the owning operator's CR as a per-node condition. A separate aggregator controller picks up these conditions and emits a single `KMSPluginsDegraded` rollup, which propagates to the `ClusterOperator`'s `Degraded` condition.


		##### SSA mechanism

		Two classes of writers share the conditions array: the operator (writing its own conditions like `NodeControllerDegraded` plus the aggregator's `KMSPluginsDegraded` rollup) and N reporter sidecars (each writing its `KMSHealthReporter_<nodeName>`). Per-entry ownership is enabled by `+listType=map +listMapKey=type` on `OperatorStatus.Conditions`. Each reporter uses a per-node fieldManager (`kms-health-reporter-<nodeName>`); all writers apply with `Force: true`.


		##### Naming convention

		Each reporter sidecar writes one condition per pod replica to the owning operator's CR (`kubeapiservers.operator.openshift.io/cluster`, etc.), keyed by the node name: `KMSHealthReporter_<nodeName>`.


		##### Destination

		The sidecar writes one advisory condition per pod replica to the owning operator's `*.operator.openshift.io/cluster` CR via Server-Side Apply. The aggregator controller reads these advisory conditions and emits the `KMSPluginsDegraded` rollup. See [KMS Plugin Health Conditions](#kms-plugin-health-conditions) for the exact naming, status mapping, and rollup behavior.
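A rough sketch of the rollup step may help make the data flow concrete. The exact status mapping lives in the KMS Plugin Health Conditions section, which is not reproduced here, so the sketch assumes that a per-node condition with `Status` equal to `"True"` means healthy and anything else counts as degraded. The `condition` type, the `rollup` function, and the message wording are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reporterPrefix is the per-node condition prefix from the design.
const reporterPrefix = "KMSHealthReporter_"

// condition is a simplified stand-in for an operator status condition.
type condition struct {
	Type    string
	Status  string
	Message string
}

// rollup folds all per-node reporter conditions into one KMSPluginsDegraded
// condition. ASSUMPTION: Status "True" on a per-node condition means healthy.
func rollup(conds []condition) condition {
	var degraded []string
	for _, c := range conds {
		if !strings.HasPrefix(c.Type, reporterPrefix) {
			continue // not a reporter condition, e.g. NodeControllerDegraded
		}
		if c.Status != "True" {
			degraded = append(degraded, strings.TrimPrefix(c.Type, reporterPrefix))
		}
	}
	sort.Strings(degraded) // stable message regardless of input order

	out := condition{Type: "KMSPluginsDegraded", Status: "False"}
	if len(degraded) > 0 {
		out.Status = "True"
		out.Message = "unhealthy KMS plugins reported on: " + strings.Join(degraded, ", ")
	}
	return out
}

func main() {
	conds := []condition{
		{Type: "NodeControllerDegraded", Status: "False"},
		{Type: reporterPrefix + "master-0", Status: "True"},
		{Type: reporterPrefix + "master-1", Status: "False"},
	}
	r := rollup(conds)
	fmt.Println(r.Type, r.Status, r.Message)
}
```

Sorting the node names keeps the rollup message stable across reconciles, which avoids rewriting the condition when nothing has changed.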


		During KMS-to-KMS migration, the encryption-configuration secret contains provider configs for all active keys. The operator creates a separate sidecar for each key, listening on its own unix domain socket (e.g., `kms-1.sock`, `kms-2.sock`).

		#### Health Reporter Sidecar


		Each probe yields a `HealthStatus` carrying:

		- `State`: healthy / unhealthy / RPC error


		##### Probe contract

		Each sidecar probes its colocated KMS plugin(s) over the local UDS at `unix:///var/run/kmsplugin/kms-{keyID}.sock` (the same socket path scheme described in [Sidecar Injection](#sidecar-injection)).


		#### KMS Health Reporter Connectivity

		Auth: the reporter uses a legacy ServiceAccount token (mounted from a Secret) bound to a minimal Role that only permits applying its single per-node condition entry on the operator CR. The projected SA tokens available in API-server-adjacent namespaces are admin-grade, as are the auth client certificates on disk; both would over-privilege a sidecar whose only job is one SSA apply. The legacy SA token keeps the blast radius minimal if the sidecar is compromised. The tradeoff is lifetime: a legacy token does not expire, whereas a projected token rotates. We accept this, since a token scoped to one verb on one resource is a far smaller prize than an admin-grade token, expiring or not.

	Risk: Orphaned Conditions on Mode Switch
	Risk: Orphaned Conditions on encryption type change and node replacement


		Connection: all reporters reach the kube-apiserver through the in-cluster Service `kubernetes.default.svc`.


		In an HA control plane this survives KMS failure: if one node's KMS plugin breaks, that node's KAS degrades, but the Service still has healthy endpoints on the other nodes, so the affected node's reporter can still deliver its condition. The Service approach only fully breaks when every KMS plugin is down, and that is a cluster-down event already surfaced by far louder signals (`ClusterOperator`, etcd, kubelet probes) than a missing reporter condition. On Single-Node OpenShift the Service has a single endpoint, so a broken local KMS plugin does leave the reporter unable to write; this is acceptable for the same reason: the cluster is already hard-down and the condition would be redundant.

		Dialing `127.0.0.1:6443` directly (the kube-apiserver static pod uses `hostNetwork: true`) was considered and rejected. It would bridge the post-start window where the local KAS accepts TLS connections but is still absent from `kubernetes.default` `Endpoints` (the Service reconciler self-gates on `/readyz`; see [`kubernetesservice/controller.go`](https://github.com/kubernetes/kubernetes/blob/master/pkg/controlplane/controller/kubernetesservice/controller.go)). But reporting KMS plugin health is not on KAS's critical startup path, and a not-ready KAS is already surfaced with higher signal-to-noise by `ClusterOperator`, kubelet probes, and KAS's own readiness machinery.
