[WIP] kms: block revisions until encryption config and kms sidecar are in sync by bertinatto · Pull Request #2102 · openshift/cluster-kube-apiserver-operator

bertinatto · 2026-04-13T20:08:36Z

Add a revision controller precondition that prevents rolling out a new revision when there is a mismatch between the encryption configuration and the pod spec.

CC @ardaguclu @p0lyn0mial

Summary by CodeRabbit

New Features
- Added a KMS revision consistency check that validates KMS encryption configuration and the presence of the KMS plugin in API server pods when KMS encryption is enabled.
Chores
- Updated dependency resolution mapping for an internal library to a specific module version.

coderabbitai · 2026-04-13T20:09:35Z

Walkthrough

Adds a go.mod replace directive and a KMS-specific revision post-check wired into the static pod operator’s revision controller startup to validate KMS configuration vs. pod presence after feature gate observation.

Changes

Cohort / File(s)	Summary
Dependency Resolution `go.mod`	Added a `replace` directive redirecting `github.com/openshift/library-go` to `github.com/bertinatto/library-go` at a specific pseudo-version.
KMS Revision Post-check `pkg/operator/kms/postcheck.go`	New exported function `KMSRevisionPostCheck(...)` implemented. Retrieves `encryption-config-<rev>` Secret and `kube-apiserver-pod-<rev>` ConfigMap, decodes/unmarshals to detect KMS providers and `kms-plugin` container, and returns an error on mismatch.
Operator Startup Wiring `pkg/operator/starter.go`	Imported `kms` and added `kms.KMSRevisionPostCheck(kubeClient, featureGateAccessor)` to the revision-controller post-check chain in RunOperator.

Sequence Diagram(s)

sequenceDiagram
    participant Controller as RevisionController
    participant Feature as FeatureGateAccessor
    participant KubeAPI as Kubernetes API (Secrets/ConfigMaps)
    participant Logger as Logger

    Controller->>Feature: Check KMSEncryption feature gate
    alt KMSEncryption enabled
        Controller->>KubeAPI: GET Secret `encryption-config-<rev>`
        KubeAPI-->>Controller: Secret (or NotFound / Error)
        Controller->>KubeAPI: GET ConfigMap `kube-apiserver-pod-<rev>`
        KubeAPI-->>Controller: ConfigMap (or NotFound / Error)
        Controller->>Controller: Decode Secret -> EncryptionConfiguration
        Controller->>Controller: Parse pod.yaml -> corev1.Pod
        alt KMS configured != KMS present
            Controller->>Logger: Log mismatch error
            Controller-->>Controller: Return error
        else
            Controller->>Logger: Log success (v=4)
            Controller-->>Controller: Return nil
        end
    else
        Controller->>Logger: Skip KMS post-check
        Controller-->>Controller: Return nil
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality	❓ Inconclusive	PR adds implementation code (postcheck.go with 119 lines) but does not include any Ginkgo test files for the new KMSRevisionPostCheck functionality.	Clarify whether test files exist for the KMSRevisionPostCheck function. If tests were added, provide the test file paths for review against Ginkgo quality requirements.

✅ Passed checks (10 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding a KMS revision post-check to ensure encryption config and KMS sidecar synchronization before rolling out revisions.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	This pull request does not add or modify any Ginkgo test files. The PR introduces changes to `go.mod`, a new production function in `pkg/operator/kms/postcheck.go`, and modifications to `pkg/operator/starter.go`. None of these files contain Ginkgo test declarations (Describe, Context, It, When, etc.), so the check for stable and deterministic test names is not applicable to this PR.
Microshift Test Compatibility	✅ Passed	The MicroShift Test Compatibility check is not applicable to this PR. The PR adds a go.mod directive, KMS validation functions, and KMS post-check wiring—none of which include new Ginkgo e2e test definitions.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	The pull request does not add any new Ginkgo e2e tests. Changes consist only of operator implementation code: a dependency replace directive in go.mod, a new KMS post-check function in pkg/operator/kms/postcheck.go, and integration into the operator startup in pkg/operator/starter.go.
Topology-Aware Scheduling Compatibility	✅ Passed	PR introduces KMS validation post-check hook with no topology-specific constraints, scheduling rules, or cluster topology assumptions; validation operates uniformly across all OpenShift topologies.
Ote Binary Stdout Contract	✅ Passed	Pull request introduces no process-level stdout writes violating the OTE Binary Stdout Contract.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	No Ginkgo e2e tests were added. The PR modifies operator implementation files (KMS post-check logic) and updates dependencies, containing no test code.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

bertinatto · 2026-04-13T20:09:49Z

+
+		kmsConfigured := hasKMSEncryptionProvider(encryptionConfig)
+		// FIXME(bertinatto): plugin name is hardcoded here
+		kmsInPod := podHasKMSSidecar(podCM, "kms-plugin")


The validation that these 2 functions do is still premature. More work needs to be done here.

openshift-ci · 2026-04-13T20:10:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bertinatto
Once this PR has been reviewed and has the lgtm label, please assign sanchezl for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

pkg/operator/kms/precondition.go (1)
104-104: Consider using comma-ok pattern for type assertion.

The direct type assertion obj.(*apiserverv1.EncryptionConfiguration) will panic if Decode returns an unexpected type. While the GVK hint should ensure the correct type is returned on success, using the comma-ok pattern would be more defensive.
Defensive type assertion
-	config := obj.(*apiserverv1.EncryptionConfiguration)
+	config, ok := obj.(*apiserverv1.EncryptionConfiguration)
+	if !ok {
+		return false
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/precondition.go` at line 104, Replace the direct type
assertion on obj with the comma-ok pattern to avoid a panic: attempt config, ok
:= obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected
type by returning an error (or logging and returning false) from the enclosing
function/condition; update any downstream uses of config to only run when ok is
true so the code is defensive against Decode returning a different type.
pkg/operator/kms/precondition_test.go (1)
35-37: Test case "feature gate accessor returns error" may not exercise the error path.

featureGateAccessorWithError() passes a non-closed channel, which means AreInitialFeatureGatesObserved() will return false before CurrentFeatureGates() is called. The precondition returns early, so the error from CurrentFeatureGates() is never reached.

Consider closing the channel to properly test the error-handling path:
Fix to test the actual error path
 func featureGateAccessorWithError() featuregates.FeatureGateAccess {
-	return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, make(chan struct{}), fmt.Errorf("test error"))
+	observed := make(chan struct{})
+	close(observed)
+	return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, observed, fmt.Errorf("test error"))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/precondition_test.go` around lines 35 - 37, The test helper
featureGateAccessorWithError currently passes an open channel so
AreInitialFeatureGatesObserved() returns false and the error from
CurrentFeatureGates() is never hit; modify featureGateAccessorWithError to
create the channel, close it (e.g., ch := make(chan struct{}); close(ch)), and
pass that closed channel into
featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch,
fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and
the CurrentFeatureGates() error path is exercised.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/kms/precondition.go`:
- Around line 65-71: The precondition silently treats parse/lookup failures in
podHasKMSSidecar as "false", causing permanent blocking when kmsConfigured is
true; update podHasKMSSidecar to return (bool, error) (or a distinct error value
on parse/missing-key), propagate that change into the precondition here so you
capture (kmsInPod, err) from podHasKMSSidecar, and handle err explicitly: log a
warning with the parsing error and decide to either return false, err to block
with an actionable error, or return false, nil only for transient conditions;
ensure callers of hasKMSEncryptionProvider/kmsConfigured logic remain unchanged
except for proper error handling and use the function names podHasKMSSidecar,
hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and
update the code paths.

---

Nitpick comments:
In `@pkg/operator/kms/precondition_test.go`:
- Around line 35-37: The test helper featureGateAccessorWithError currently
passes an open channel so AreInitialFeatureGatesObserved() returns false and the
error from CurrentFeatureGates() is never hit; modify
featureGateAccessorWithError to create the channel, close it (e.g., ch :=
make(chan struct{}); close(ch)), and pass that closed channel into
featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch,
fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and
the CurrentFeatureGates() error path is exercised.

In `@pkg/operator/kms/precondition.go`:
- Line 104: Replace the direct type assertion on obj with the comma-ok pattern
to avoid a panic: attempt config, ok :=
obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected type
by returning an error (or logging and returning false) from the enclosing
function/condition; update any downstream uses of config to only run when ok is
true so the code is defensive against Decode returning a different type.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 41e8cb51-44ee-4991-b3c7-4fb5b8636a1c

📥 Commits

Reviewing files that changed from the base of the PR and between 8cb1bc8 and d5f6194.

📒 Files selected for processing (3)

pkg/operator/kms/precondition.go
pkg/operator/kms/precondition_test.go
pkg/operator/starter.go

coderabbitai · 2026-04-13T20:15:33Z

+		kmsConfigured := hasKMSEncryptionProvider(encryptionConfig)
+		// FIXME(bertinatto): plugin name is hardcoded here
+		kmsInPod := podHasKMSSidecar(podCM, "kms-plugin")
+		if kmsConfigured != kmsInPod {
+			klog.Infof("kms revision precondition not met: kmsConfigured=%v kmsInPod=%v, waiting for reconciliation", kmsConfigured, kmsInPod)
+			return false, nil
+		}


⚠️ Potential issue | 🟡 Minor

Silent parse failures in helpers can cause permanent blocking.

When podHasKMSSidecar fails to parse pod.yaml (malformed JSON or missing key), it returns false. If KMS is configured in the encryption config (kmsConfigured=true), the precondition will block indefinitely waiting for a "reconciliation" that cannot happen because the underlying issue is a parse error, not a timing mismatch.

Consider returning an error from the helper functions when parsing fails, and either logging a warning or returning an error from the precondition to surface the issue.

Proposed approach

-func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) bool { +func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) (bool, error) { podYAML, ok := podCM.Data["pod.yaml"] if !ok { - return false + return false, fmt.Errorf("pod.yaml key not found in configmap") } var pod corev1.Pod if err := json.Unmarshal([]byte(podYAML), &pod); err != nil { - return false + return false, fmt.Errorf("failed to unmarshal pod.yaml: %w", err) } - return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool { + return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool { return c.Name == sidecarName - }) + }), nil }

Then in the main function, handle the error appropriately (log warning and proceed, or block with error).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/precondition.go` around lines 65 - 71, The precondition silently treats parse/lookup failures in podHasKMSSidecar as "false", causing permanent blocking when kmsConfigured is true; update podHasKMSSidecar to return (bool, error) (or a distinct error value on parse/missing-key), propagate that change into the precondition here so you capture (kmsInPod, err) from podHasKMSSidecar, and handle err explicitly: log a warning with the parsing error and decide to either return false, err to block with an actionable error, or return false, nil only for transient conditions; ensure callers of hasKMSEncryptionProvider/kmsConfigured logic remain unchanged except for proper error handling and use the function names podHasKMSSidecar, hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and update the code paths.

ardaguclu · 2026-04-14T07:17:03Z

+			return true, nil
+		}
+
+		encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{})


(My questions may not be accurate)

shouldn't we query from revisioned Secret in openshift-kube-apiserver-operator ns rather than openshift-config-managed (which is unrevisioned)

p0lyn0mial · 2026-04-14T07:54:02Z

+		if err != nil {
+			return true, nil
+		}
+		if !featureGates.Enabled(features.FeatureGateKMSEncryption) {


Is the FG check required if the presence of a KMS configuration is also behind the same FG?

p0lyn0mial · 2026-04-14T07:54:56Z

+			return true, nil
+		}
+
+		encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{})


I don’t think this is what we want. We need to compare revisioned resources. This will likely require changes in library-go. We need to validate revisioned resources to ensure they won’t change and we need to do that before the system switches to that revision.

I think, there might be an issue. This function is called from targetconfigcontroller which does not know revision number. So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver. Is there a missing part in our design.

So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver

it cannot, that's why we decided to add an additional function to the revision controller (library-go).

https://docs.google.com/document/d/1htjUQmCk60AVkcOe1QzLSYNrBhO7T8hVsFDiTbR4AJM/edit?tab=t.0#heading=h.q9rx93jt2t5r

from that doc

The third solution is to add a validation function to the RevisionController. This function validated the revisioned resources. In our case the kas manifest and the encryption-configuration. If the configuration is invalid we return an error. As a result a new revision is not generated until the system converges to a good state.

We choose option 3. This solution seems the simplest. It doesn't change the existing flow and doesn't introduce new states.

Thanks for explanation, it makes sense. It seems everything is under control and there is no missing pieces for this part :)

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 140: The go.mod replace directive replacing
github.com/openshift/library-go with github.com/bertinatto/library-go is
pointing to an inaccessible/personal fork and must be removed or replaced with
an org-owned/upstream module; update the replace in go.mod to either use the
upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid
version, remove the transient personal fork entry, and add a short comment above
the replace directive describing the temporary nature and explicit criteria for
removal (e.g., upstream PR merged or version published) so reviewers know when
it can be deleted.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 81f62f60-3ab8-46a5-88a0-588813068f96

📥 Commits

Reviewing files that changed from the base of the PR and between d5f6194 and 8888acf.

⛔ Files ignored due to path filters (4)

go.sum is excluded by !**/*.sum
vendor/github.com/openshift/library-go/pkg/operator/revisioncontroller/revision_controller.go is excluded by !**/vendor/**, !vendor/**
vendor/github.com/openshift/library-go/pkg/operator/staticpod/controllers.go is excluded by !**/vendor/**, !vendor/**
vendor/modules.txt is excluded by !**/vendor/**, !vendor/**

📒 Files selected for processing (1)

go.mod

coderabbitai · 2026-04-23T11:32:17Z


 replace github.com/onsi/ginkgo/v2 => github.com/openshift/onsi-ginkgo/v2 v2.6.1-0.20251001123353-fd5b1fb35db1
+
+replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Verify replacement target and version from current go.mod replace_line="$(awk '/^replace github.com\/openshift\/library-go =>/{print $0}' go.mod)" echo "replace line: ${replace_line}" target_mod="$(awk '/^replace github.com\/openshift\/library-go =>/{print $4}' go.mod)" target_ver="$(awk '/^replace github.com\/openshift\/library-go =>/{print $5}' go.mod)" if [[ -z "${target_mod}" || -z "${target_ver}" ]]; then echo "ERROR: could not parse replacement module/version" >&2 exit 1 fi echo "target module: ${target_mod}" echo "target version/ref: ${target_ver}" # Check that pinned ref exists and inspect declared module path url="https://raw.githubusercontent.com/${target_mod}/${target_ver}/go.mod" echo "fetching: ${url}" curl -fsSL "${url}" | sed -n '1,10p' declared_module="$(curl -fsSL "${url}" | awk '/^module /{print $2; exit}')" echo "declared module in replacement go.mod: ${declared_module}" # Expected outcome: # 1) URL resolves (ref is retrievable/stable) # 2) declared module path is compatible with replace target usage # 3) team confirms provenance policy for using a personal fork in CI

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 498

🏁 Script executed:

#!/bin/bash set -euo pipefail # Extract replacement details from go.mod replace_line="$(grep '^replace github.com/openshift/library-go' go.mod)" echo "Replace directive: ${replace_line}" # Parse out the replacement module and version # Format: replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5 target_mod=$(echo "${replace_line}" | awk '{print $4}') target_ver=$(echo "${replace_line}" | awk '{print $5}') echo "Target module: ${target_mod}" echo "Target version: ${target_ver}" # Convert module path to GitHub user/repo # github.com/bertinatto/library-go => bertinatto/library-go github_path=$(echo "${target_mod}" | sed 's|github.com/||') echo "GitHub path: ${github_path}" # Check if repository exists using GitHub API echo "" echo "Checking if repository exists..." api_response=$(curl -fsSL "https://api.github.com/repos/${github_path}" 2>&1 || echo "FAILED") if echo "${api_response}" | grep -q '"id"'; then echo "✓ Repository exists" echo "${api_response}" | grep -E '"name"|"private"|"fork"' | head -5 else echo "✗ Repository not found or API error" fi # Try to check if the specific commit/tag exists echo "" echo "Checking commit/tag: ${target_ver}" commit_check=$(curl -fsSL "https://api.github.com/repos/${github_path}/commits/${target_ver}" 2>&1 || echo "FAILED") if echo "${commit_check}" | grep -q '"sha"'; then echo "✓ Commit/tag exists" else echo "✗ Commit/tag not found" fi

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 171

🏁 Script executed:

#!/bin/bash set -euo pipefail # First, verify the replace directive is in go.mod as shown echo "=== Verifying go.mod content ===" grep -n "^replace github.com/openshift/library-go" go.mod # Extract the exact replacement target replace_line=$(grep "^replace github.com/openshift/library-go" go.mod) echo "" echo "Full replace line: ${replace_line}" # Parse module and version target_mod=$(echo "${replace_line}" | awk -F' ' '{print $4}') target_ver=$(echo "${replace_line}" | awk -F' ' '{print $5}') echo "" echo "Target module: ${target_mod}" echo "Target version: ${target_ver}" # Try a simple HTTP HEAD request to the GitHub repo (simpler than API) github_path=$(echo "${target_mod}" | sed 's|github.com/||') echo "GitHub repo: ${github_path}" echo "" echo "Attempting to verify repository is publicly accessible..." # Try to fetch the main branch README as a simple existence check if timeout 10 curl -fsSLI "https://github.com/${github_path}" > /dev/null 2>&1; then echo "✓ GitHub repository appears to exist" else echo "✗ Cannot reach repository or repository does not exist" fi # Also check: is this a personal fork (bertinatto) or org-owned? echo "" echo "Assessment:" echo "- Module is from personal account: ${github_path} (bertinatto)" echo "- Original: github.com/openshift/library-go" echo "- Redirects to personal fork instead of org-owned repository"

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 800

Replace directive references an inaccessible or non-existent repository.

The replace directive on line 140 redirects github.com/openshift/library-go to github.com/bertinatto/library-go, which cannot be reached or does not exist. This will cause build failures. Additionally, redirecting to a personal fork creates supply-chain and reproducibility risk for the operator runtime and test utilities.

Switch to an org-owned (or upstream) replacement before merge, and document removal criteria for this replace directive.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@go.mod` at line 140, The go.mod replace directive replacing github.com/openshift/library-go with github.com/bertinatto/library-go is pointing to an inaccessible/personal fork and must be removed or replaced with an org-owned/upstream module; update the replace in go.mod to either use the upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid version, remove the transient personal fork entry, and add a short comment above the replace directive describing the temporary nature and explicit criteria for removal (e.g., upstream PR merged or version published) so reviewers know when it can be deleted.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 58-69: The post-check incorrectly treats a missing
encryption-config-<rev> as a successful check; update the logic in the kms
revision post-check (around the encryptionSecret variable and the podCM
retrieval) so that when kubeClient.CoreV1().Secrets(...).Get returns IsNotFound
you do NOT return nil but instead mark kmsConfigured=false and continue to
compare with the kube-apiserver-pod config; however when
kubeClient.CoreV1().ConfigMaps(...).Get for "kube-apiserver-pod"+revSuffix
returns IsNotFound you should still treat that as an error (returning the error)
because the revision is incomplete. Ensure the code paths reference
encryptionSecret, podCM, and revSuffix and preserve existing error wrapping for
non-NotFound errors.
- Around line 48-50: The code calls featureGateAccessor.CurrentFeatureGates()
and currently returns nil on error, which disables the guard; change this to
fail closed by returning an error (wrap or return the received err) from the
surrounding function (the post-check handler that uses featureGates) instead of
nil so the revision is rejected when feature gate state cannot be read; update
the error path where featureGateAccessor.CurrentFeatureGates() is invoked to
propagate a descriptive error rather than returning nil.
- Around line 91-94: podHasKMSSidecar is unmarshalling YAML manifest using
json.Unmarshal which fails; replace the json unmarshal with a YAML unmarshal
(e.g., use sigs.k8s.io/yaml.Unmarshal) when decoding podYAML into the
corev1.Pod, and update the imports to remove "encoding/json" and add the YAML
package import so the ConfigMap key "pod.yaml" is parsed correctly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c879a035-2c29-4cbd-a8cf-e7b617379062

📥 Commits

Reviewing files that changed from the base of the PR and between 8888acf and 0e3c048.

📒 Files selected for processing (2)

pkg/operator/kms/postcheck.go
pkg/operator/starter.go

coderabbitai · 2026-04-23T14:43:34Z

+	featureGates, err := featureGateAccessor.CurrentFeatureGates()
+	if err != nil {
+		return nil


⚠️ Potential issue | 🟠 Major

Fail closed when feature-gate state can't be read.

Line 50 returns nil on CurrentFeatureGates() error, which disables this guard exactly when the gate state is unknown and allows the revision to proceed unvalidated.

Suggested fix

featureGates, err := featureGateAccessor.CurrentFeatureGates() if err != nil { - return nil + return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

featureGates, err := featureGateAccessor.CurrentFeatureGates()

if err != nil {

return nil

featureGates, err := featureGateAccessor.CurrentFeatureGates()

if err != nil {

return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err)

}

🧰 Tools

🪛 golangci-lint (2.11.4)

[error] 50-50: error is not nil (line 48) but it returns nil

(nilerr)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 48 - 50, The code calls featureGateAccessor.CurrentFeatureGates() and currently returns nil on error, which disables the guard; change this to fail closed by returning an error (wrap or return the received err) from the surrounding function (the post-check handler that uses featureGates) instead of nil so the revision is rejected when feature gate state cannot be read; update the error path where featureGateAccessor.CurrentFeatureGates() is invoked to propagate a descriptive error rather than returning nil.

coderabbitai · 2026-04-23T14:43:35Z

+	encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{})
+	if apierrors.IsNotFound(err) {
+		return nil
+	}
+	if err != nil {
+		return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err)
+	}
+
+	podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{})
+	if apierrors.IsNotFound(err) {
+		return nil
+	}


⚠️ Potential issue | 🟠 Major

Don't treat missing revision resources as a successful check.

Because encryption-config is optional in pkg/operator/starter.go Lines 651-653, the IsNotFound path on Line 59 is a real state here. Returning nil means a revision can still pass when kube-apiserver-pod-<rev> already has kms-plugin but encryption-config-<rev> is absent—the exact mismatch this hook is supposed to block.

A safer split is: treat a missing encryption-config-<rev> as kmsConfigured=false and continue the comparison, but treat a missing kube-apiserver-pod-<rev> as an error because the revision is incomplete.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 58 - 69, The post-check incorrectly treats a missing encryption-config-<rev> as a successful check; update the logic in the kms revision post-check (around the encryptionSecret variable and the podCM retrieval) so that when kubeClient.CoreV1().Secrets(...).Get returns IsNotFound you do NOT return nil but instead mark kmsConfigured=false and continue to compare with the kube-apiserver-pod config; however when kubeClient.CoreV1().ConfigMaps(...).Get for "kube-apiserver-pod"+revSuffix returns IsNotFound you should still treat that as an error (returning the error) because the revision is incomplete. Ensure the code paths reference encryptionSecret, podCM, and revSuffix and preserve existing error wrapping for non-NotFound errors.

coderabbitai · 2026-04-23T14:43:35Z

+	var pod corev1.Pod
+	if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
+		return false
+	}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does Go's encoding/json.Unmarshal accept YAML input, or only JSON?

💡 Result:

No, Go's encoding/json.Unmarshal accepts only JSON input. It explicitly parses "JSON-encoded data" and returns a SyntaxError for invalid JSON syntax, such as YAML-specific features like colons without quotes or indentation-based structures.

Citations:

1: https://pkg.go.dev/encoding/json

2: https://pkg.go.dev/encoding/json@go1.25.5

3: https://pkg.go.dev/encoding/json/

🏁 Script executed:

cat -n pkg/operator/kms/postcheck.go | head -120 | tail -50

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 2114

🏁 Script executed:

cat -n pkg/operator/kms/postcheck.go | head -100

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 3983

🏁 Script executed:

head -30 pkg/operator/kms/postcheck.go

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 964

Use yaml.Unmarshal to parse the pod manifest from ConfigMap.

Line 92 uses json.Unmarshal on podYAML retrieved from a ConfigMap with key "pod.yaml" (line 86). The encoding/json package only accepts JSON-formatted data and will fail when parsing YAML. This causes podHasKMSSidecar() to always return false, misclassifying KMS-enabled revisions and blocking all rollouts when KMS encryption is configured.

Suggested fix

- if err := json.Unmarshal([]byte(podYAML), &pod); err != nil { + if err := yaml.Unmarshal([]byte(podYAML), &pod); err != nil { return false }

Also update the imports from "encoding/json" to include a YAML unmarshaller such as sigs.k8s.io/yaml.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/postcheck.go` around lines 91 - 94, podHasKMSSidecar is unmarshalling YAML manifest using json.Unmarshal which fails; replace the json unmarshal with a YAML unmarshal (e.g., use sigs.k8s.io/yaml.Unmarshal) when decoding podYAML into the corev1.Pod, and update the imports to remove "encoding/json" and add the YAML package import so the ConfigMap key "pod.yaml" is parsed correctly.

openshift-ci · 2026-04-23T18:28:14Z

@bertinatto: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

bertinatto · 2026-04-23T21:21:37Z

/close
superseded by #2121

openshift-ci · 2026-04-23T21:21:47Z

@bertinatto: Closed this PR.

Details

In response to this:

/close
superseded by #2121

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 13, 2026

bertinatto commented Apr 13, 2026

View reviewed changes

openshift-ci Bot requested review from p0lyn0mial and sanchezl April 13, 2026 20:09

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

ardaguclu reviewed Apr 14, 2026

View reviewed changes

p0lyn0mial reviewed Apr 14, 2026

View reviewed changes

wip

8888acf

bertinatto force-pushed the kms-plugin-lifecycle-precondition branch from d5f6194 to 8888acf Compare April 23, 2026 11:21

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

wip

0e3c048

bertinatto force-pushed the kms-plugin-lifecycle-precondition branch from 0d56138 to 0e3c048 Compare April 23, 2026 14:31

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

openshift-ci Bot closed this Apr 23, 2026


		replace github.com/onsi/ginkgo/v2 => github.com/openshift/onsi-ginkgo/v2 v2.6.1-0.20251001123353-fd5b1fb35db1

		replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5

Conversation

bertinatto commented Apr 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented Apr 13, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented Apr 23, 2026

Uh oh!

bertinatto commented Apr 23, 2026

Uh oh!

openshift-ci Bot commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bertinatto commented Apr 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 13, 2026 •

edited

Loading