[WIP] KMS: add a post check function to revision controller#2121
[WIP] KMS: add a post check function to revision controller#2121bertinatto wants to merge 2 commits into
Conversation
WalkthroughAdds a KMS revision post-check that verifies revision-suffixed encryption Secret and apiserver pod ConfigMap for KMS presence, and wires this post-check into the static pod controller. Also adds a go.mod replace rule for Changes
Sequence Diagram(s)sequenceDiagram
participant Controller as StaticPodController
participant FG as FeatureGateAccessor
participant K8s as Kubernetes API
participant Checker as KMSRevisionPostCheck
Controller->>FG: Request feature gates
FG-->>Controller: Feature gates (observed)
Controller->>Checker: Invoke post-check with observed gates
Checker->>FG: Verify KMSEncryption enabled?
FG-->>Checker: KMSEncryption = true/false
alt KMSEncryption enabled
Checker->>K8s: GET Secret `encryption-config-<rev>`
K8s-->>Checker: Secret (or NotFound)
Checker->>K8s: GET ConfigMap `kube-apiserver-pod-<rev>`
K8s-->>Checker: ConfigMap (or NotFound)
Checker->>Checker: Decode EncryptionConfiguration
Checker->>Checker: Parse pod.yaml for KMS sidecar
Checker-->>Controller: Return pass or descriptive error if mismatch
else KMSEncryption disabled
Checker-->>Controller: Return pass (no KMS checks)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (10 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bertinatto The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
60cdeb6 to
6f09bec
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
pkg/operator/kms/postcheck.go (1)
70-71: Replace hardcoded sidecar name with a shared constant/config source.The FIXME is valid: hardcoding
"kms-plugin"can drift from actual rendered sidecar naming and weaken this check over time.I can draft a follow-up patch that sources the sidecar name from an existing shared constant (or operator config) and adds a focused unit test—want me to prepare it?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 70 - 71, The check hardcodes the sidecar name ("kms-plugin") when calling podHasKMSSidecar, which risks drift; change the call to use a shared constant or operator config value (e.g., KMSPluginName constant or OperatorConfig.KMSSidecarName) instead of the string literal, update the variable kmsInPod assignment to reference that constant/config, and add/adjust a unit test for podHasKMSSidecar to validate behavior when the configured sidecar name is present and when it changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@go.mod`:
- Line 140: The go.mod contains a replace directive rerouting
github.com/openshift/library-go to a personal fork (replace
github.com/openshift/library-go => github.com/bertinatto/library-go
v0.0.0-20260423204751-1ae67025ddad); remove this replace entry (or replace it
with an official upstream version tag) before merging, and if the forked change
is needed, upstream the patch and reference the official module version instead
of the personal fork.
In `@pkg/operator/kms/postcheck.go`:
- Around line 43-45: The code is silently swallowing errors from
featureGateAccessor.CurrentFeatureGates(); instead of returning nil on err,
propagate or return the error (or a wrapped context error) from the enclosing
post-check function so callers know retrieval failed; update the error handling
around featureGates, err := featureGateAccessor.CurrentFeatureGates() to return
the err (or fmt.Errorf("failed to get feature gates: %w", err)) and ensure any
callers of the post-check function handle that error appropriately.
- Around line 53-64: The current NotFound handling in the post-check returns
early and skips revision validation; instead, when
kubeClient.CoreV1().Secrets(...).Get for "encryption-config"+revSuffix returns
apierrors.IsNotFound(err) set kmsConfigured = false and continue processing (do
not return), and when kubeClient.CoreV1().ConfigMaps(...).Get for
"kube-apiserver-pod"+revSuffix returns apierrors.IsNotFound(err) treat it as a
failure of the post-check (return a descriptive error) so the revision mismatch
is detected; update the logic around encryptionSecret, podCM, the
apierrors.IsNotFound checks and the post-check return paths to implement these
behaviors.
---
Nitpick comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 70-71: The check hardcodes the sidecar name ("kms-plugin") when
calling podHasKMSSidecar, which risks drift; change the call to use a shared
constant or operator config value (e.g., KMSPluginName constant or
OperatorConfig.KMSSidecarName) instead of the string literal, update the
variable kmsInPod assignment to reference that constant/config, and add/adjust a
unit test for podHasKMSSidecar to validate behavior when the configured sidecar
name is present and when it changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7502e673-bffa-45a9-a34b-e1d0079271b8
⛔ Files ignored due to path filters (4)
go.sumis excluded by!**/*.sumvendor/github.com/openshift/library-go/pkg/operator/revisioncontroller/revision_controller.gois excluded by!**/vendor/**,!vendor/**vendor/github.com/openshift/library-go/pkg/operator/staticpod/controllers.gois excluded by!**/vendor/**,!vendor/**vendor/modules.txtis excluded by!**/vendor/**,!vendor/**
📒 Files selected for processing (3)
go.modpkg/operator/kms/postcheck.gopkg/operator/starter.go
|
|
||
| replace github.com/onsi/ginkgo/v2 => github.com/openshift/onsi-ginkgo/v2 v2.6.1-0.20251001123353-fd5b1fb35db1 | ||
|
|
||
| replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423204751-1ae67025ddad |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Confirm active replace directive:"
rg -n '^replace github.com/openshift/library-go' go.mod
echo
echo "2) Check whether pseudo-version suffix exists in fork vs upstream:"
SHORT_SHA='1ae67025ddad'
echo "Fork:"
git ls-remote https://github.com/bertinatto/library-go.git | rg "$SHORT_SHA" || true
echo "Upstream:"
git ls-remote https://github.com/openshift/library-go.git | rg "$SHORT_SHA" || trueRepository: openshift/cluster-kube-apiserver-operator
Length of output: 407
Remove the personal-fork replace directive before merge.
Line 140 reroutes all github.com/openshift/library-go imports to a personal fork on a non-upstream feature branch (kms-plugin-lifecycle-postcondition), creating a supply-chain and reproducibility risk for CI and releases. Either remove this directive or resolve it by contributing the change upstream and using the official version.
🔧 Proposed fix
-replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423204751-1ae67025ddad📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423204751-1ae67025ddad |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` at line 140, The go.mod contains a replace directive rerouting
github.com/openshift/library-go to a personal fork (replace
github.com/openshift/library-go => github.com/bertinatto/library-go
v0.0.0-20260423204751-1ae67025ddad); remove this replace entry (or replace it
with an official upstream version tag) before merging, and if the forked change
is needed, upstream the patch and reference the official module version instead
of the personal fork.
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | ||
| if err != nil { | ||
| return nil |
There was a problem hiding this comment.
Do not swallow feature-gate retrieval errors.
At Line 45, err != nil returns nil, so post-check silently fails open.
🔧 Proposed fix
featureGates, err := featureGateAccessor.CurrentFeatureGates()
if err != nil {
- return nil
+ return fmt.Errorf("kms revision post-check: failed to read current feature gates: %w", err)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | |
| if err != nil { | |
| return nil | |
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | |
| if err != nil { | |
| return fmt.Errorf("kms revision post-check: failed to read current feature gates: %w", err) | |
| } |
🧰 Tools
🪛 golangci-lint (2.11.4)
[error] 45-45: error is not nil (line 43) but it returns nil
(nilerr)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/postcheck.go` around lines 43 - 45, The code is silently
swallowing errors from featureGateAccessor.CurrentFeatureGates(); instead of
returning nil on err, propagate or return the error (or a wrapped context error)
from the enclosing post-check function so callers know retrieval failed; update
the error handling around featureGates, err :=
featureGateAccessor.CurrentFeatureGates() to return the err (or
fmt.Errorf("failed to get feature gates: %w", err)) and ensure any callers of
the post-check function handle that error appropriately.
| encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{}) | ||
| if apierrors.IsNotFound(err) { | ||
| return nil | ||
| } | ||
| if err != nil { | ||
| return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err) | ||
| } | ||
|
|
||
| podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{}) | ||
| if apierrors.IsNotFound(err) { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Current NotFound handling bypasses revision validation.
Lines 55 and 63 return early, which skips mismatch detection entirely. For encryption-config NotFound, continue with kmsConfigured=false; for kube-apiserver-pod NotFound, fail the post-check.
🔧 Proposed fix
- encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{})
- if apierrors.IsNotFound(err) {
- return nil
- }
- if err != nil {
- return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err)
- }
+ encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{})
+ kmsConfigured := false
+ if apierrors.IsNotFound(err) {
+ kmsConfigured = false
+ } else if err != nil {
+ return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err)
+ } else {
+ kmsConfigured = hasKMSEncryptionProvider(encryptionSecret)
+ }
podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{})
- if apierrors.IsNotFound(err) {
- return nil
- }
if err != nil {
return fmt.Errorf("kms revision post-check: failed to get kube-apiserver-pod for revision %d: %w", revision, err)
}
-
- kmsConfigured := hasKMSEncryptionProvider(encryptionSecret)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/postcheck.go` around lines 53 - 64, The current NotFound
handling in the post-check returns early and skips revision validation; instead,
when kubeClient.CoreV1().Secrets(...).Get for "encryption-config"+revSuffix
returns apierrors.IsNotFound(err) set kmsConfigured = false and continue
processing (do not return), and when kubeClient.CoreV1().ConfigMaps(...).Get for
"kube-apiserver-pod"+revSuffix returns apierrors.IsNotFound(err) treat it as a
failure of the post-check (return a descriptive error) so the revision mismatch
is detected; update the logic around encryptionSecret, podCM, the
apierrors.IsNotFound checks and the post-check return paths to implement these
behaviors.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
pkg/operator/kms/postcheck.go (2)
53-64:⚠️ Potential issue | 🟠 MajorCurrent
NotFoundbranches bypass revision validation.Returning early on
encryption-config/kube-apiserver-podNotFound skips mismatch detection. Treat missing secret askmsConfigured=falseand missing pod ConfigMap as post-check failure.🔧 Proposed fix
encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{}) - if apierrors.IsNotFound(err) { - return nil - } - if err != nil { + kmsConfigured := false + if apierrors.IsNotFound(err) { + kmsConfigured = false + } else if err != nil { return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err) + } else { + kmsConfigured = hasKMSEncryptionProvider(encryptionSecret) } podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{}) if apierrors.IsNotFound(err) { - return nil + return fmt.Errorf("kms revision post-check: kube-apiserver-pod for revision %d not found", revision) } if err != nil { return fmt.Errorf("kms revision post-check: failed to get kube-apiserver-pod for revision %d: %w", revision, err) } - kmsConfigured := hasKMSEncryptionProvider(encryptionSecret)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 53 - 64, The NotFound branches in the post-check currently return early and skip revision validation; instead, when CoreV1().Secrets(...).Get(...) for "encryption-config"+revSuffix returns IsNotFound, set kmsConfigured=false (don’t return nil) so the flow continues to detect mismatches, and when CoreV1().ConfigMaps(...).Get(...) for "kube-apiserver-pod"+revSuffix returns IsNotFound treat that as a post-check failure (return a descriptive error including revision) rather than returning nil; update the logic around encryptionSecret, podCM, revSuffix and revision to reflect these behaviors and preserve existing error handling for non-NotFound errors.
43-46:⚠️ Potential issue | 🟠 MajorDo not fail open when feature-gate lookup fails.
At Line 45, returning
nilonCurrentFeatureGates()error silently skips validation and can accept bad revisions.🔧 Proposed fix
featureGates, err := featureGateAccessor.CurrentFeatureGates() if err != nil { - return nil + return fmt.Errorf("kms revision post-check: failed to read current feature gates: %w", err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 43 - 46, The code currently swallows errors from featureGateAccessor.CurrentFeatureGates() by returning nil, which fails open; instead propagate the failure to the caller so validation does not silently succeed—replace the `if err { return nil }` with returning the error (e.g., `return nil, fmt.Errorf("getting feature gates: %w", err)` or simply `return nil, err` depending on the surrounding signature) so the caller of the function that calls featureGateAccessor.CurrentFeatureGates() receives and handles the error rather than accepting bad revisions.
🧹 Nitpick comments (1)
pkg/operator/kms/postcheck.go (1)
70-71: Avoid hardcoding the KMS sidecar name.Using
"kms-plugin"inline risks drift from the actual rendered container name. Prefer a shared constant/source of truth or make it configurable where the manifest is generated.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 70 - 71, The code currently hardcodes the sidecar name when calling podHasKMSSidecar(podCM, "kms-plugin"); replace the inline string with a single source-of-truth identifier (e.g., a package-level constant or a getter) so the rendered manifest and runtime check stay in sync. Update the call to use the new constant/func (for example KMSPluginContainerName or GetKMSPluginName()) and ensure that constant is documented and set from the manifest generation/configuration codepath so podHasKMSSidecar and any other checks reference the same name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 53-64: The NotFound branches in the post-check currently return
early and skip revision validation; instead, when CoreV1().Secrets(...).Get(...)
for "encryption-config"+revSuffix returns IsNotFound, set kmsConfigured=false
(don’t return nil) so the flow continues to detect mismatches, and when
CoreV1().ConfigMaps(...).Get(...) for "kube-apiserver-pod"+revSuffix returns
IsNotFound treat that as a post-check failure (return a descriptive error
including revision) rather than returning nil; update the logic around
encryptionSecret, podCM, revSuffix and revision to reflect these behaviors and
preserve existing error handling for non-NotFound errors.
- Around line 43-46: The code currently swallows errors from
featureGateAccessor.CurrentFeatureGates() by returning nil, which fails open;
instead propagate the failure to the caller so validation does not silently
succeed—replace the `if err { return nil }` with returning the error (e.g.,
`return nil, fmt.Errorf("getting feature gates: %w", err)` or simply `return
nil, err` depending on the surrounding signature) so the caller of the function
that calls featureGateAccessor.CurrentFeatureGates() receives and handles the
error rather than accepting bad revisions.
---
Nitpick comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 70-71: The code currently hardcodes the sidecar name when calling
podHasKMSSidecar(podCM, "kms-plugin"); replace the inline string with a single
source-of-truth identifier (e.g., a package-level constant or a getter) so the
rendered manifest and runtime check stay in sync. Update the call to use the new
constant/func (for example KMSPluginContainerName or GetKMSPluginName()) and
ensure that constant is documented and set from the manifest
generation/configuration codepath so podHasKMSSidecar and any other checks
reference the same name.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c657835c-0831-4ab6-af4b-99aa729bb7cc
📒 Files selected for processing (2)
pkg/operator/kms/postcheck.gopkg/operator/starter.go
|
@bertinatto: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{}) | ||
| if apierrors.IsNotFound(err) { | ||
| return nil |
There was a problem hiding this comment.
I assume that notfound revisioned Secret is totally benign?
There was a problem hiding this comment.
I assume so, that's why I return nil here. Should I handle it differently?
The idea is to validate whether the revision is valid or not.
Summary by CodeRabbit
New Features
Chores