Skip to content

[WIP] kms: block revisions until encryption config and kms sidecar are in sync#2102

Closed
bertinatto wants to merge 2 commits into
openshift:mainfrom
bertinatto:kms-plugin-lifecycle-precondition
Closed

[WIP] kms: block revisions until encryption config and kms sidecar are in sync#2102
bertinatto wants to merge 2 commits into
openshift:mainfrom
bertinatto:kms-plugin-lifecycle-precondition

Conversation

@bertinatto
Copy link
Copy Markdown
Member

@bertinatto bertinatto commented Apr 13, 2026

Add a revision controller precondition that prevents rolling out a new revision when there is a mismatch between the encryption configuration and the pod spec.

CC @ardaguclu @p0lyn0mial

Summary by CodeRabbit

  • New Features

    • Added a KMS revision consistency check that validates KMS encryption configuration and the presence of the KMS plugin in API server pods when KMS encryption is enabled.
  • Chores

    • Updated dependency resolution mapping for an internal library to a specific module version.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 13, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 13, 2026

Walkthrough

Adds a go.mod replace directive and a KMS-specific revision post-check wired into the static pod operator’s revision controller startup to validate KMS configuration vs. pod presence after feature gate observation.

Changes

Cohort / File(s) Summary
Dependency Resolution
go.mod
Added a replace directive redirecting github.com/openshift/library-go to github.com/bertinatto/library-go at a specific pseudo-version.
KMS Revision Post-check
pkg/operator/kms/postcheck.go
New exported function KMSRevisionPostCheck(...) implemented. Retrieves encryption-config-<rev> Secret and kube-apiserver-pod-<rev> ConfigMap, decodes/unmarshals to detect KMS providers and kms-plugin container, and returns an error on mismatch.
Operator Startup Wiring
pkg/operator/starter.go
Imported kms and added kms.KMSRevisionPostCheck(kubeClient, featureGateAccessor) to the revision-controller post-check chain in RunOperator.

Sequence Diagram(s)

sequenceDiagram
    participant Controller as RevisionController
    participant Feature as FeatureGateAccessor
    participant KubeAPI as Kubernetes API (Secrets/ConfigMaps)
    participant Logger as Logger

    Controller->>Feature: Check KMSEncryption feature gate
    alt KMSEncryption enabled
        Controller->>KubeAPI: GET Secret `encryption-config-<rev>`
        KubeAPI-->>Controller: Secret (or NotFound / Error)
        Controller->>KubeAPI: GET ConfigMap `kube-apiserver-pod-<rev>`
        KubeAPI-->>Controller: ConfigMap (or NotFound / Error)
        Controller->>Controller: Decode Secret -> EncryptionConfiguration
        Controller->>Controller: Parse pod.yaml -> corev1.Pod
        alt KMS configured != KMS present
            Controller->>Logger: Log mismatch error
            Controller-->>Controller: Return error
        else
            Controller->>Logger: Log success (v=4)
            Controller-->>Controller: Return nil
        end
    else
        Controller->>Logger: Skip KMS post-check
        Controller-->>Controller: Return nil
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ❓ Inconclusive PR adds implementation code (postcheck.go with 119 lines) but does not include any Ginkgo test files for the new KMSRevisionPostCheck functionality. Clarify whether test files exist for the KMSRevisionPostCheck function. If tests were added, provide the test file paths for review against Ginkgo quality requirements.
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding a KMS revision post-check to ensure encryption config and KMS sidecar synchronization before rolling out revisions.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed This pull request does not add or modify any Ginkgo test files. The PR introduces changes to go.mod, a new production function in pkg/operator/kms/postcheck.go, and modifications to pkg/operator/starter.go. None of these files contain Ginkgo test declarations (Describe, Context, It, When, etc.), so the check for stable and deterministic test names is not applicable to this PR.
Microshift Test Compatibility ✅ Passed The MicroShift Test Compatibility check is not applicable to this PR. The PR adds a go.mod directive, KMS validation functions, and KMS post-check wiring—none of which include new Ginkgo e2e test definitions.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The pull request does not add any new Ginkgo e2e tests. Changes consist only of operator implementation code: a dependency replace directive in go.mod, a new KMS post-check function in pkg/operator/kms/postcheck.go, and integration into the operator startup in pkg/operator/starter.go.
Topology-Aware Scheduling Compatibility ✅ Passed PR introduces KMS validation post-check hook with no topology-specific constraints, scheduling rules, or cluster topology assumptions; validation operates uniformly across all OpenShift topologies.
Ote Binary Stdout Contract ✅ Passed Pull request introduces no process-level stdout writes violating the OTE Binary Stdout Contract.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No Ginkgo e2e tests were added. The PR modifies operator implementation files (KMS post-check logic) and updates dependencies, containing no test code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread pkg/operator/kms/precondition.go Outdated

kmsConfigured := hasKMSEncryptionProvider(encryptionConfig)
// FIXME(bertinatto): plugin name is hardcoded here
kmsInPod := podHasKMSSidecar(podCM, "kms-plugin")
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation that these 2 functions do is still premature. More work needs to be done here.

@openshift-ci openshift-ci Bot requested review from p0lyn0mial and sanchezl April 13, 2026 20:09
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bertinatto
Once this PR has been reviewed and has the lgtm label, please assign sanchezl for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/operator/kms/precondition.go (1)

104-104: Consider using comma-ok pattern for type assertion.

The direct type assertion obj.(*apiserverv1.EncryptionConfiguration) will panic if Decode returns an unexpected type. While the GVK hint should ensure the correct type is returned on success, using the comma-ok pattern would be more defensive.

Defensive type assertion
-	config := obj.(*apiserverv1.EncryptionConfiguration)
+	config, ok := obj.(*apiserverv1.EncryptionConfiguration)
+	if !ok {
+		return false
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/precondition.go` at line 104, Replace the direct type
assertion on obj with the comma-ok pattern to avoid a panic: attempt config, ok
:= obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected
type by returning an error (or logging and returning false) from the enclosing
function/condition; update any downstream uses of config to only run when ok is
true so the code is defensive against Decode returning a different type.
pkg/operator/kms/precondition_test.go (1)

35-37: Test case "feature gate accessor returns error" may not exercise the error path.

featureGateAccessorWithError() passes a non-closed channel, which means AreInitialFeatureGatesObserved() will return false before CurrentFeatureGates() is called. The precondition returns early, so the error from CurrentFeatureGates() is never reached.

Consider closing the channel to properly test the error-handling path:

Fix to test the actual error path
 func featureGateAccessorWithError() featuregates.FeatureGateAccess {
-	return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, make(chan struct{}), fmt.Errorf("test error"))
+	observed := make(chan struct{})
+	close(observed)
+	return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, observed, fmt.Errorf("test error"))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/precondition_test.go` around lines 35 - 37, The test helper
featureGateAccessorWithError currently passes an open channel so
AreInitialFeatureGatesObserved() returns false and the error from
CurrentFeatureGates() is never hit; modify featureGateAccessorWithError to
create the channel, close it (e.g., ch := make(chan struct{}); close(ch)), and
pass that closed channel into
featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch,
fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and
the CurrentFeatureGates() error path is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/kms/precondition.go`:
- Around line 65-71: The precondition silently treats parse/lookup failures in
podHasKMSSidecar as "false", causing permanent blocking when kmsConfigured is
true; update podHasKMSSidecar to return (bool, error) (or a distinct error value
on parse/missing-key), propagate that change into the precondition here so you
capture (kmsInPod, err) from podHasKMSSidecar, and handle err explicitly: log a
warning with the parsing error and decide to either return false, err to block
with an actionable error, or return false, nil only for transient conditions;
ensure callers of hasKMSEncryptionProvider/kmsConfigured logic remain unchanged
except for proper error handling and use the function names podHasKMSSidecar,
hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and
update the code paths.

---

Nitpick comments:
In `@pkg/operator/kms/precondition_test.go`:
- Around line 35-37: The test helper featureGateAccessorWithError currently
passes an open channel so AreInitialFeatureGatesObserved() returns false and the
error from CurrentFeatureGates() is never hit; modify
featureGateAccessorWithError to create the channel, close it (e.g., ch :=
make(chan struct{}); close(ch)), and pass that closed channel into
featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch,
fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and
the CurrentFeatureGates() error path is exercised.

In `@pkg/operator/kms/precondition.go`:
- Line 104: Replace the direct type assertion on obj with the comma-ok pattern
to avoid a panic: attempt config, ok :=
obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected type
by returning an error (or logging and returning false) from the enclosing
function/condition; update any downstream uses of config to only run when ok is
true so the code is defensive against Decode returning a different type.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 41e8cb51-44ee-4991-b3c7-4fb5b8636a1c

📥 Commits

Reviewing files that changed from the base of the PR and between 8cb1bc8 and d5f6194.

📒 Files selected for processing (3)
  • pkg/operator/kms/precondition.go
  • pkg/operator/kms/precondition_test.go
  • pkg/operator/starter.go

Comment thread pkg/operator/kms/precondition.go Outdated
Comment on lines +65 to +71
kmsConfigured := hasKMSEncryptionProvider(encryptionConfig)
// FIXME(bertinatto): plugin name is hardcoded here
kmsInPod := podHasKMSSidecar(podCM, "kms-plugin")
if kmsConfigured != kmsInPod {
klog.Infof("kms revision precondition not met: kmsConfigured=%v kmsInPod=%v, waiting for reconciliation", kmsConfigured, kmsInPod)
return false, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent parse failures in helpers can cause permanent blocking.

When podHasKMSSidecar fails to parse pod.yaml (malformed JSON or missing key), it returns false. If KMS is configured in the encryption config (kmsConfigured=true), the precondition will block indefinitely waiting for a "reconciliation" that cannot happen because the underlying issue is a parse error, not a timing mismatch.

Consider returning an error from the helper functions when parsing fails, and either logging a warning or returning an error from the precondition to surface the issue.

Proposed approach
-func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) bool {
+func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) (bool, error) {
 	podYAML, ok := podCM.Data["pod.yaml"]
 	if !ok {
-		return false
+		return false, fmt.Errorf("pod.yaml key not found in configmap")
 	}
 
 	var pod corev1.Pod
 	if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
-		return false
+		return false, fmt.Errorf("failed to unmarshal pod.yaml: %w", err)
 	}
 
-	return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool {
+	return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool {
 		return c.Name == sidecarName
-	})
+	}), nil
 }

Then in the main function, handle the error appropriately (log warning and proceed, or block with error).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/precondition.go` around lines 65 - 71, The precondition
silently treats parse/lookup failures in podHasKMSSidecar as "false", causing
permanent blocking when kmsConfigured is true; update podHasKMSSidecar to return
(bool, error) (or a distinct error value on parse/missing-key), propagate that
change into the precondition here so you capture (kmsInPod, err) from
podHasKMSSidecar, and handle err explicitly: log a warning with the parsing
error and decide to either return false, err to block with an actionable error,
or return false, nil only for transient conditions; ensure callers of
hasKMSEncryptionProvider/kmsConfigured logic remain unchanged except for proper
error handling and use the function names podHasKMSSidecar,
hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and
update the code paths.

Comment thread pkg/operator/kms/precondition.go Outdated
return true, nil
}

encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My questions may not be accurate)

shouldn't we query from revisioned Secret in openshift-kube-apiserver-operator ns rather than openshift-config-managed (which is unrevisioned)

Comment thread pkg/operator/kms/precondition.go Outdated
if err != nil {
return true, nil
}
if !featureGates.Enabled(features.FeatureGateKMSEncryption) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the FG check required if the presence of a KMS configuration is also behind the same FG?

Comment thread pkg/operator/kms/precondition.go Outdated
return true, nil
}

encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think this is what we want. We need to compare revisioned resources. This will likely require changes in library-go. We need to validate revisioned resources to ensure they won’t change and we need to do that before the system switches to that revision.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, there might be an issue. This function is called from targetconfigcontroller which does not know revision number. So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver. Is there a missing part in our design.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver

it cannot, that's why we decided to add an additional function to the revision controller (library-go).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from that doc

The third solution is to add a validation function to the RevisionController. This function validated the revisioned resources. In our case the kas manifest and the encryption-configuration. If the configuration is invalid we return an error. As a result a new revision is not generated until the system converges to a good state.

We choose option 3. This solution seems the simplest. It doesn't change the existing flow and doesn't introduce new states.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explanation, it makes sense. It seems everything is under control and there is no missing pieces for this part :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yw.

@bertinatto bertinatto force-pushed the kms-plugin-lifecycle-precondition branch from d5f6194 to 8888acf Compare April 23, 2026 11:21
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 140: The go.mod replace directive replacing
github.com/openshift/library-go with github.com/bertinatto/library-go is
pointing to an inaccessible/personal fork and must be removed or replaced with
an org-owned/upstream module; update the replace in go.mod to either use the
upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid
version, remove the transient personal fork entry, and add a short comment above
the replace directive describing the temporary nature and explicit criteria for
removal (e.g., upstream PR merged or version published) so reviewers know when
it can be deleted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 81f62f60-3ab8-46a5-88a0-588813068f96

📥 Commits

Reviewing files that changed from the base of the PR and between d5f6194 and 8888acf.

⛔ Files ignored due to path filters (4)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/library-go/pkg/operator/revisioncontroller/revision_controller.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/library-go/pkg/operator/staticpod/controllers.go is excluded by !**/vendor/**, !vendor/**
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (1)
  • go.mod

Comment thread go.mod

replace github.com/onsi/ginkgo/v2 => github.com/openshift/onsi-ginkgo/v2 v2.6.1-0.20251001123353-fd5b1fb35db1

replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify replacement target and version from current go.mod
replace_line="$(awk '/^replace github.com\/openshift\/library-go =>/{print $0}' go.mod)"
echo "replace line: ${replace_line}"

target_mod="$(awk '/^replace github.com\/openshift\/library-go =>/{print $4}' go.mod)"
target_ver="$(awk '/^replace github.com\/openshift\/library-go =>/{print $5}' go.mod)"

if [[ -z "${target_mod}" || -z "${target_ver}" ]]; then
  echo "ERROR: could not parse replacement module/version" >&2
  exit 1
fi

echo "target module: ${target_mod}"
echo "target version/ref: ${target_ver}"

# Check that pinned ref exists and inspect declared module path
url="https://raw.githubusercontent.com/${target_mod}/${target_ver}/go.mod"
echo "fetching: ${url}"
curl -fsSL "${url}" | sed -n '1,10p'

declared_module="$(curl -fsSL "${url}" | awk '/^module /{print $2; exit}')"
echo "declared module in replacement go.mod: ${declared_module}"

# Expected outcome:
# 1) URL resolves (ref is retrievable/stable)
# 2) declared module path is compatible with replace target usage
# 3) team confirms provenance policy for using a personal fork in CI

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 498


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Extract replacement details from go.mod
replace_line="$(grep '^replace github.com/openshift/library-go' go.mod)"
echo "Replace directive: ${replace_line}"

# Parse out the replacement module and version
# Format: replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5
target_mod=$(echo "${replace_line}" | awk '{print $4}')
target_ver=$(echo "${replace_line}" | awk '{print $5}')

echo "Target module: ${target_mod}"
echo "Target version: ${target_ver}"

# Convert module path to GitHub user/repo
# github.com/bertinatto/library-go => bertinatto/library-go
github_path=$(echo "${target_mod}" | sed 's|github.com/||')
echo "GitHub path: ${github_path}"

# Check if repository exists using GitHub API
echo ""
echo "Checking if repository exists..."
api_response=$(curl -fsSL "https://api.github.com/repos/${github_path}" 2>&1 || echo "FAILED")
if echo "${api_response}" | grep -q '"id"'; then
  echo "✓ Repository exists"
  echo "${api_response}" | grep -E '"name"|"private"|"fork"' | head -5
else
  echo "✗ Repository not found or API error"
fi

# Try to check if the specific commit/tag exists
echo ""
echo "Checking commit/tag: ${target_ver}"
commit_check=$(curl -fsSL "https://api.github.com/repos/${github_path}/commits/${target_ver}" 2>&1 || echo "FAILED")
if echo "${commit_check}" | grep -q '"sha"'; then
  echo "✓ Commit/tag exists"
else
  echo "✗ Commit/tag not found"
fi

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 171


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First, verify the replace directive is in go.mod as shown
echo "=== Verifying go.mod content ==="
grep -n "^replace github.com/openshift/library-go" go.mod

# Extract the exact replacement target
replace_line=$(grep "^replace github.com/openshift/library-go" go.mod)
echo ""
echo "Full replace line: ${replace_line}"

# Parse module and version
target_mod=$(echo "${replace_line}" | awk -F' ' '{print $4}')
target_ver=$(echo "${replace_line}" | awk -F' ' '{print $5}')

echo ""
echo "Target module: ${target_mod}"
echo "Target version: ${target_ver}"

# Try a simple HTTP HEAD request to the GitHub repo (simpler than API)
github_path=$(echo "${target_mod}" | sed 's|github.com/||')
echo "GitHub repo: ${github_path}"
echo ""
echo "Attempting to verify repository is publicly accessible..."

# Try to fetch the main branch README as a simple existence check
if timeout 10 curl -fsSLI "https://github.com/${github_path}" > /dev/null 2>&1; then
  echo "✓ GitHub repository appears to exist"
else
  echo "✗ Cannot reach repository or repository does not exist"
fi

# Also check: is this a personal fork (bertinatto) or org-owned?
echo ""
echo "Assessment:"
echo "- Module is from personal account: ${github_path} (bertinatto)"
echo "- Original: github.com/openshift/library-go"
echo "- Redirects to personal fork instead of org-owned repository"

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 800


Replace directive references an inaccessible or non-existent repository.

The replace directive on line 140 redirects github.com/openshift/library-go to github.com/bertinatto/library-go, which cannot be reached or does not exist. This will cause build failures. Additionally, redirecting to a personal fork creates supply-chain and reproducibility risk for the operator runtime and test utilities.

Switch to an org-owned (or upstream) replacement before merge, and document removal criteria for this replace directive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 140, The go.mod replace directive replacing
github.com/openshift/library-go with github.com/bertinatto/library-go is
pointing to an inaccessible/personal fork and must be removed or replaced with
an org-owned/upstream module; update the replace in go.mod to either use the
upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid
version, remove the transient personal fork entry, and add a short comment above
the replace directive describing the temporary nature and explicit criteria for
removal (e.g., upstream PR merged or version published) so reviewers know when
it can be deleted.

@bertinatto bertinatto force-pushed the kms-plugin-lifecycle-precondition branch from 0d56138 to 0e3c048 Compare April 23, 2026 14:31
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 58-69: The post-check incorrectly treats a missing
encryption-config-<rev> as a successful check; update the logic in the kms
revision post-check (around the encryptionSecret variable and the podCM
retrieval) so that when kubeClient.CoreV1().Secrets(...).Get returns IsNotFound
you do NOT return nil but instead mark kmsConfigured=false and continue to
compare with the kube-apiserver-pod config; however when
kubeClient.CoreV1().ConfigMaps(...).Get for "kube-apiserver-pod"+revSuffix
returns IsNotFound you should still treat that as an error (returning the error)
because the revision is incomplete. Ensure the code paths reference
encryptionSecret, podCM, and revSuffix and preserve existing error wrapping for
non-NotFound errors.
- Around line 48-50: The code calls featureGateAccessor.CurrentFeatureGates()
and currently returns nil on error, which disables the guard; change this to
fail closed by returning an error (wrap or return the received err) from the
surrounding function (the post-check handler that uses featureGates) instead of
nil so the revision is rejected when feature gate state cannot be read; update
the error path where featureGateAccessor.CurrentFeatureGates() is invoked to
propagate a descriptive error rather than returning nil.
- Around line 91-94: podHasKMSSidecar is unmarshalling YAML manifest using
json.Unmarshal which fails; replace the json unmarshal with a YAML unmarshal
(e.g., use sigs.k8s.io/yaml.Unmarshal) when decoding podYAML into the
corev1.Pod, and update the imports to remove "encoding/json" and add the YAML
package import so the ConfigMap key "pod.yaml" is parsed correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c879a035-2c29-4cbd-a8cf-e7b617379062

📥 Commits

Reviewing files that changed from the base of the PR and between 8888acf and 0e3c048.

📒 Files selected for processing (2)
  • pkg/operator/kms/postcheck.go
  • pkg/operator/starter.go

Comment on lines +48 to +50
featureGates, err := featureGateAccessor.CurrentFeatureGates()
if err != nil {
return nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail closed when feature-gate state can't be read.

Line 50 returns nil on CurrentFeatureGates() error, which disables this guard exactly when the gate state is unknown and allows the revision to proceed unvalidated.

Suggested fix
 	featureGates, err := featureGateAccessor.CurrentFeatureGates()
 	if err != nil {
-		return nil
+		return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
featureGates, err := featureGateAccessor.CurrentFeatureGates()
if err != nil {
return nil
featureGates, err := featureGateAccessor.CurrentFeatureGates()
if err != nil {
return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err)
}
🧰 Tools
🪛 golangci-lint (2.11.4)

[error] 50-50: error is not nil (line 48) but it returns nil

(nilerr)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/postcheck.go` around lines 48 - 50, The code calls
featureGateAccessor.CurrentFeatureGates() and currently returns nil on error,
which disables the guard; change this to fail closed by returning an error (wrap
or return the received err) from the surrounding function (the post-check
handler that uses featureGates) instead of nil so the revision is rejected when
feature gate state cannot be read; update the error path where
featureGateAccessor.CurrentFeatureGates() is invoked to propagate a descriptive
error rather than returning nil.

Comment on lines +58 to +69
encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{})
if apierrors.IsNotFound(err) {
return nil
}
if err != nil {
return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err)
}

podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{})
if apierrors.IsNotFound(err) {
return nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't treat missing revision resources as a successful check.

Because encryption-config is optional in pkg/operator/starter.go Lines 651-653, the IsNotFound path on Line 59 is a real state here. Returning nil means a revision can still pass when kube-apiserver-pod-<rev> already has kms-plugin but encryption-config-<rev> is absent—the exact mismatch this hook is supposed to block.

A safer split is: treat a missing encryption-config-<rev> as kmsConfigured=false and continue the comparison, but treat a missing kube-apiserver-pod-<rev> as an error because the revision is incomplete.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/postcheck.go` around lines 58 - 69, The post-check
incorrectly treats a missing encryption-config-<rev> as a successful check;
update the logic in the kms revision post-check (around the encryptionSecret
variable and the podCM retrieval) so that when
kubeClient.CoreV1().Secrets(...).Get returns IsNotFound you do NOT return nil
but instead mark kmsConfigured=false and continue to compare with the
kube-apiserver-pod config; however when kubeClient.CoreV1().ConfigMaps(...).Get
for "kube-apiserver-pod"+revSuffix returns IsNotFound you should still treat
that as an error (returning the error) because the revision is incomplete.
Ensure the code paths reference encryptionSecret, podCM, and revSuffix and
preserve existing error wrapping for non-NotFound errors.

Comment on lines +91 to +94
var pod corev1.Pod
if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
return false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does Go's encoding/json.Unmarshal accept YAML input, or only JSON?

💡 Result:

No, Go's encoding/json.Unmarshal accepts only JSON input. It explicitly parses "JSON-encoded data" and returns a SyntaxError for invalid JSON syntax, such as YAML-specific features like colons without quotes or indentation-based structures.

Citations:


🏁 Script executed:

cat -n pkg/operator/kms/postcheck.go | head -120 | tail -50

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 2114


🏁 Script executed:

cat -n pkg/operator/kms/postcheck.go | head -100

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 3983


🏁 Script executed:

head -30 pkg/operator/kms/postcheck.go

Repository: openshift/cluster-kube-apiserver-operator

Length of output: 964


Use yaml.Unmarshal to parse the pod manifest from ConfigMap.

Line 92 uses json.Unmarshal on podYAML retrieved from a ConfigMap with key "pod.yaml" (line 86). The encoding/json package only accepts JSON-formatted data and will fail when parsing YAML. This causes podHasKMSSidecar() to always return false, misclassifying KMS-enabled revisions and blocking all rollouts when KMS encryption is configured.

Suggested fix
-	if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
+	if err := yaml.Unmarshal([]byte(podYAML), &pod); err != nil {
 		return false
 	}

Also update the imports from "encoding/json" to include a YAML unmarshaller such as sigs.k8s.io/yaml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/kms/postcheck.go` around lines 91 - 94, podHasKMSSidecar is
unmarshalling YAML manifest using json.Unmarshal which fails; replace the json
unmarshal with a YAML unmarshal (e.g., use sigs.k8s.io/yaml.Unmarshal) when
decoding podYAML into the corev1.Pod, and update the imports to remove
"encoding/json" and add the YAML package import so the ConfigMap key "pod.yaml"
is parsed correctly.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 23, 2026

@bertinatto: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bertinatto
Copy link
Copy Markdown
Member Author

/close
superseded by #2121

@openshift-ci openshift-ci Bot closed this Apr 23, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 23, 2026

@bertinatto: Closed this PR.

Details

In response to this:

/close
superseded by #2121

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants