[WIP] kms: block revisions until encryption config and kms sidecar are in sync#2102
[WIP] kms: block revisions until encryption config and kms sidecar are in sync#2102bertinatto wants to merge 2 commits into
Conversation
WalkthroughAdds a go.mod replace directive and a KMS-specific revision post-check wired into the static pod operator’s revision controller startup to validate KMS configuration vs. pod presence after feature gate observation. Changes
Sequence Diagram(s)sequenceDiagram
participant Controller as RevisionController
participant Feature as FeatureGateAccessor
participant KubeAPI as Kubernetes API (Secrets/ConfigMaps)
participant Logger as Logger
Controller->>Feature: Check KMSEncryption feature gate
alt KMSEncryption enabled
Controller->>KubeAPI: GET Secret `encryption-config-<rev>`
KubeAPI-->>Controller: Secret (or NotFound / Error)
Controller->>KubeAPI: GET ConfigMap `kube-apiserver-pod-<rev>`
KubeAPI-->>Controller: ConfigMap (or NotFound / Error)
Controller->>Controller: Decode Secret -> EncryptionConfiguration
Controller->>Controller: Parse pod.yaml -> corev1.Pod
alt KMS configured != KMS present
Controller->>Logger: Log mismatch error
Controller-->>Controller: Return error
else
Controller->>Logger: Log success (v=4)
Controller-->>Controller: Return nil
end
else
Controller->>Logger: Skip KMS post-check
Controller-->>Controller: Return nil
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
|
||
| kmsConfigured := hasKMSEncryptionProvider(encryptionConfig) | ||
| // FIXME(bertinatto): plugin name is hardcoded here | ||
| kmsInPod := podHasKMSSidecar(podCM, "kms-plugin") |
There was a problem hiding this comment.
The validation that these 2 functions do is still premature. More work needs to be done here.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bertinatto The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/operator/kms/precondition.go (1)
104-104: Consider using comma-ok pattern for type assertion.The direct type assertion
obj.(*apiserverv1.EncryptionConfiguration)will panic ifDecodereturns an unexpected type. While the GVK hint should ensure the correct type is returned on success, using the comma-ok pattern would be more defensive.Defensive type assertion
- config := obj.(*apiserverv1.EncryptionConfiguration) + config, ok := obj.(*apiserverv1.EncryptionConfiguration) + if !ok { + return false + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/precondition.go` at line 104, Replace the direct type assertion on obj with the comma-ok pattern to avoid a panic: attempt config, ok := obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected type by returning an error (or logging and returning false) from the enclosing function/condition; update any downstream uses of config to only run when ok is true so the code is defensive against Decode returning a different type.pkg/operator/kms/precondition_test.go (1)
35-37: Test case "feature gate accessor returns error" may not exercise the error path.
featureGateAccessorWithError()passes a non-closed channel, which meansAreInitialFeatureGatesObserved()will returnfalsebeforeCurrentFeatureGates()is called. The precondition returns early, so the error fromCurrentFeatureGates()is never reached.Consider closing the channel to properly test the error-handling path:
Fix to test the actual error path
func featureGateAccessorWithError() featuregates.FeatureGateAccess { - return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, make(chan struct{}), fmt.Errorf("test error")) + observed := make(chan struct{}) + close(observed) + return featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, observed, fmt.Errorf("test error")) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/kms/precondition_test.go` around lines 35 - 37, The test helper featureGateAccessorWithError currently passes an open channel so AreInitialFeatureGatesObserved() returns false and the error from CurrentFeatureGates() is never hit; modify featureGateAccessorWithError to create the channel, close it (e.g., ch := make(chan struct{}); close(ch)), and pass that closed channel into featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch, fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and the CurrentFeatureGates() error path is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/kms/precondition.go`:
- Around line 65-71: The precondition silently treats parse/lookup failures in
podHasKMSSidecar as "false", causing permanent blocking when kmsConfigured is
true; update podHasKMSSidecar to return (bool, error) (or a distinct error value
on parse/missing-key), propagate that change into the precondition here so you
capture (kmsInPod, err) from podHasKMSSidecar, and handle err explicitly: log a
warning with the parsing error and decide to either return false, err to block
with an actionable error, or return false, nil only for transient conditions;
ensure callers of hasKMSEncryptionProvider/kmsConfigured logic remain unchanged
except for proper error handling and use the function names podHasKMSSidecar,
hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and
update the code paths.
---
Nitpick comments:
In `@pkg/operator/kms/precondition_test.go`:
- Around line 35-37: The test helper featureGateAccessorWithError currently
passes an open channel so AreInitialFeatureGatesObserved() returns false and the
error from CurrentFeatureGates() is never hit; modify
featureGateAccessorWithError to create the channel, close it (e.g., ch :=
make(chan struct{}); close(ch)), and pass that closed channel into
featuregates.NewHardcodedFeatureGateAccessForTesting(nil, nil, ch,
fmt.Errorf("test error")) so AreInitialFeatureGatesObserved() returns true and
the CurrentFeatureGates() error path is exercised.
In `@pkg/operator/kms/precondition.go`:
- Line 104: Replace the direct type assertion on obj with the comma-ok pattern
to avoid a panic: attempt config, ok :=
obj.(*apiserverv1.EncryptionConfiguration) and if !ok handle the unexpected type
by returning an error (or logging and returning false) from the enclosing
function/condition; update any downstream uses of config to only run when ok is
true so the code is defensive against Decode returning a different type.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 41e8cb51-44ee-4991-b3c7-4fb5b8636a1c
📒 Files selected for processing (3)
pkg/operator/kms/precondition.gopkg/operator/kms/precondition_test.gopkg/operator/starter.go
| kmsConfigured := hasKMSEncryptionProvider(encryptionConfig) | ||
| // FIXME(bertinatto): plugin name is hardcoded here | ||
| kmsInPod := podHasKMSSidecar(podCM, "kms-plugin") | ||
| if kmsConfigured != kmsInPod { | ||
| klog.Infof("kms revision precondition not met: kmsConfigured=%v kmsInPod=%v, waiting for reconciliation", kmsConfigured, kmsInPod) | ||
| return false, nil | ||
| } |
There was a problem hiding this comment.
Silent parse failures in helpers can cause permanent blocking.
When podHasKMSSidecar fails to parse pod.yaml (malformed JSON or missing key), it returns false. If KMS is configured in the encryption config (kmsConfigured=true), the precondition will block indefinitely waiting for a "reconciliation" that cannot happen because the underlying issue is a parse error, not a timing mismatch.
Consider returning an error from the helper functions when parsing fails, and either logging a warning or returning an error from the precondition to surface the issue.
Proposed approach
-func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) bool {
+func podHasKMSSidecar(podCM *corev1.ConfigMap, sidecarName string) (bool, error) {
podYAML, ok := podCM.Data["pod.yaml"]
if !ok {
- return false
+ return false, fmt.Errorf("pod.yaml key not found in configmap")
}
var pod corev1.Pod
if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
- return false
+ return false, fmt.Errorf("failed to unmarshal pod.yaml: %w", err)
}
- return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool {
+ return slices.ContainsFunc(pod.Spec.Containers, func(c corev1.Container) bool {
return c.Name == sidecarName
- })
+ }), nil
}Then in the main function, handle the error appropriately (log warning and proceed, or block with error).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/precondition.go` around lines 65 - 71, The precondition
silently treats parse/lookup failures in podHasKMSSidecar as "false", causing
permanent blocking when kmsConfigured is true; update podHasKMSSidecar to return
(bool, error) (or a distinct error value on parse/missing-key), propagate that
change into the precondition here so you capture (kmsInPod, err) from
podHasKMSSidecar, and handle err explicitly: log a warning with the parsing
error and decide to either return false, err to block with an actionable error,
or return false, nil only for transient conditions; ensure callers of
hasKMSEncryptionProvider/kmsConfigured logic remain unchanged except for proper
error handling and use the function names podHasKMSSidecar,
hasKMSEncryptionProvider, and the kmsConfigured/kmsInPod variables to locate and
update the code paths.
| return true, nil | ||
| } | ||
|
|
||
| encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{}) |
There was a problem hiding this comment.
(My questions may not be accurate)
shouldn't we query from revisioned Secret in openshift-kube-apiserver-operator ns rather than openshift-config-managed (which is unrevisioned)
| if err != nil { | ||
| return true, nil | ||
| } | ||
| if !featureGates.Enabled(features.FeatureGateKMSEncryption) { |
There was a problem hiding this comment.
Is the FG check required if the presence of a KMS configuration is also behind the same FG?
| return true, nil | ||
| } | ||
|
|
||
| encryptionConfig, err := kubeClient.CoreV1().Secrets("openshift-config-managed").Get(ctx, "encryption-config-openshift-kube-apiserver", metav1.GetOptions{}) |
There was a problem hiding this comment.
I don’t think this is what we want. We need to compare revisioned resources. This will likely require changes in library-go. We need to validate revisioned resources to ensure they won’t change and we need to do that before the system switches to that revision.
There was a problem hiding this comment.
I think, there might be an issue. This function is called from targetconfigcontroller which does not know revision number. So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver. Is there a missing part in our design.
There was a problem hiding this comment.
So that how can targetconfigcontroller query the revisioned encryption config Secret under openshift-kube-apiserver
it cannot, that's why we decided to add an additional function to the revision controller (library-go).
There was a problem hiding this comment.
There was a problem hiding this comment.
from that doc
The third solution is to add a validation function to the RevisionController. This function validated the revisioned resources. In our case the kas manifest and the encryption-configuration. If the configuration is invalid we return an error. As a result a new revision is not generated until the system converges to a good state.
We choose option 3. This solution seems the simplest. It doesn't change the existing flow and doesn't introduce new states.
There was a problem hiding this comment.
Thanks for explanation, it makes sense. It seems everything is under control and there is no missing pieces for this part :)
d5f6194 to
8888acf
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@go.mod`:
- Line 140: The go.mod replace directive replacing
github.com/openshift/library-go with github.com/bertinatto/library-go is
pointing to an inaccessible/personal fork and must be removed or replaced with
an org-owned/upstream module; update the replace in go.mod to either use the
upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid
version, remove the transient personal fork entry, and add a short comment above
the replace directive describing the temporary nature and explicit criteria for
removal (e.g., upstream PR merged or version published) so reviewers know when
it can be deleted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 81f62f60-3ab8-46a5-88a0-588813068f96
⛔ Files ignored due to path filters (4)
go.sumis excluded by!**/*.sumvendor/github.com/openshift/library-go/pkg/operator/revisioncontroller/revision_controller.gois excluded by!**/vendor/**,!vendor/**vendor/github.com/openshift/library-go/pkg/operator/staticpod/controllers.gois excluded by!**/vendor/**,!vendor/**vendor/modules.txtis excluded by!**/vendor/**,!vendor/**
📒 Files selected for processing (1)
go.mod
|
|
||
| replace github.com/onsi/ginkgo/v2 => github.com/openshift/onsi-ginkgo/v2 v2.6.1-0.20251001123353-fd5b1fb35db1 | ||
|
|
||
| replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify replacement target and version from current go.mod
replace_line="$(awk '/^replace github.com\/openshift\/library-go =>/{print $0}' go.mod)"
echo "replace line: ${replace_line}"
target_mod="$(awk '/^replace github.com\/openshift\/library-go =>/{print $4}' go.mod)"
target_ver="$(awk '/^replace github.com\/openshift\/library-go =>/{print $5}' go.mod)"
if [[ -z "${target_mod}" || -z "${target_ver}" ]]; then
echo "ERROR: could not parse replacement module/version" >&2
exit 1
fi
echo "target module: ${target_mod}"
echo "target version/ref: ${target_ver}"
# Check that pinned ref exists and inspect declared module path
url="https://raw.githubusercontent.com/${target_mod}/${target_ver}/go.mod"
echo "fetching: ${url}"
curl -fsSL "${url}" | sed -n '1,10p'
declared_module="$(curl -fsSL "${url}" | awk '/^module /{print $2; exit}')"
echo "declared module in replacement go.mod: ${declared_module}"
# Expected outcome:
# 1) URL resolves (ref is retrievable/stable)
# 2) declared module path is compatible with replace target usage
# 3) team confirms provenance policy for using a personal fork in CIRepository: openshift/cluster-kube-apiserver-operator
Length of output: 498
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Extract replacement details from go.mod
replace_line="$(grep '^replace github.com/openshift/library-go' go.mod)"
echo "Replace directive: ${replace_line}"
# Parse out the replacement module and version
# Format: replace github.com/openshift/library-go => github.com/bertinatto/library-go v0.0.0-20260423111950-e491ed3b41e5
target_mod=$(echo "${replace_line}" | awk '{print $4}')
target_ver=$(echo "${replace_line}" | awk '{print $5}')
echo "Target module: ${target_mod}"
echo "Target version: ${target_ver}"
# Convert module path to GitHub user/repo
# github.com/bertinatto/library-go => bertinatto/library-go
github_path=$(echo "${target_mod}" | sed 's|github.com/||')
echo "GitHub path: ${github_path}"
# Check if repository exists using GitHub API
echo ""
echo "Checking if repository exists..."
api_response=$(curl -fsSL "https://api.github.com/repos/${github_path}" 2>&1 || echo "FAILED")
if echo "${api_response}" | grep -q '"id"'; then
echo "✓ Repository exists"
echo "${api_response}" | grep -E '"name"|"private"|"fork"' | head -5
else
echo "✗ Repository not found or API error"
fi
# Try to check if the specific commit/tag exists
echo ""
echo "Checking commit/tag: ${target_ver}"
commit_check=$(curl -fsSL "https://api.github.com/repos/${github_path}/commits/${target_ver}" 2>&1 || echo "FAILED")
if echo "${commit_check}" | grep -q '"sha"'; then
echo "✓ Commit/tag exists"
else
echo "✗ Commit/tag not found"
fiRepository: openshift/cluster-kube-apiserver-operator
Length of output: 171
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# First, verify the replace directive is in go.mod as shown
echo "=== Verifying go.mod content ==="
grep -n "^replace github.com/openshift/library-go" go.mod
# Extract the exact replacement target
replace_line=$(grep "^replace github.com/openshift/library-go" go.mod)
echo ""
echo "Full replace line: ${replace_line}"
# Parse module and version
target_mod=$(echo "${replace_line}" | awk -F' ' '{print $4}')
target_ver=$(echo "${replace_line}" | awk -F' ' '{print $5}')
echo ""
echo "Target module: ${target_mod}"
echo "Target version: ${target_ver}"
# Try a simple HTTP HEAD request to the GitHub repo (simpler than API)
github_path=$(echo "${target_mod}" | sed 's|github.com/||')
echo "GitHub repo: ${github_path}"
echo ""
echo "Attempting to verify repository is publicly accessible..."
# Try to fetch the main branch README as a simple existence check
if timeout 10 curl -fsSLI "https://github.com/${github_path}" > /dev/null 2>&1; then
echo "✓ GitHub repository appears to exist"
else
echo "✗ Cannot reach repository or repository does not exist"
fi
# Also check: is this a personal fork (bertinatto) or org-owned?
echo ""
echo "Assessment:"
echo "- Module is from personal account: ${github_path} (bertinatto)"
echo "- Original: github.com/openshift/library-go"
echo "- Redirects to personal fork instead of org-owned repository"Repository: openshift/cluster-kube-apiserver-operator
Length of output: 800
Replace directive references an inaccessible or non-existent repository.
The replace directive on line 140 redirects github.com/openshift/library-go to github.com/bertinatto/library-go, which cannot be reached or does not exist. This will cause build failures. Additionally, redirecting to a personal fork creates supply-chain and reproducibility risk for the operator runtime and test utilities.
Switch to an org-owned (or upstream) replacement before merge, and document removal criteria for this replace directive.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` at line 140, The go.mod replace directive replacing
github.com/openshift/library-go with github.com/bertinatto/library-go is
pointing to an inaccessible/personal fork and must be removed or replaced with
an org-owned/upstream module; update the replace in go.mod to either use the
upstream github.com/openshift/library-go (or a vetted org-owned fork) at a valid
version, remove the transient personal fork entry, and add a short comment above
the replace directive describing the temporary nature and explicit criteria for
removal (e.g., upstream PR merged or version published) so reviewers know when
it can be deleted.
0d56138 to
0e3c048
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/kms/postcheck.go`:
- Around line 58-69: The post-check incorrectly treats a missing
encryption-config-<rev> as a successful check; update the logic in the kms
revision post-check (around the encryptionSecret variable and the podCM
retrieval) so that when kubeClient.CoreV1().Secrets(...).Get returns IsNotFound
you do NOT return nil but instead mark kmsConfigured=false and continue to
compare with the kube-apiserver-pod config; however when
kubeClient.CoreV1().ConfigMaps(...).Get for "kube-apiserver-pod"+revSuffix
returns IsNotFound you should still treat that as an error (returning the error)
because the revision is incomplete. Ensure the code paths reference
encryptionSecret, podCM, and revSuffix and preserve existing error wrapping for
non-NotFound errors.
- Around line 48-50: The code calls featureGateAccessor.CurrentFeatureGates()
and currently returns nil on error, which disables the guard; change this to
fail closed by returning an error (wrap or return the received err) from the
surrounding function (the post-check handler that uses featureGates) instead of
nil so the revision is rejected when feature gate state cannot be read; update
the error path where featureGateAccessor.CurrentFeatureGates() is invoked to
propagate a descriptive error rather than returning nil.
- Around line 91-94: podHasKMSSidecar is unmarshalling YAML manifest using
json.Unmarshal which fails; replace the json unmarshal with a YAML unmarshal
(e.g., use sigs.k8s.io/yaml.Unmarshal) when decoding podYAML into the
corev1.Pod, and update the imports to remove "encoding/json" and add the YAML
package import so the ConfigMap key "pod.yaml" is parsed correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c879a035-2c29-4cbd-a8cf-e7b617379062
📒 Files selected for processing (2)
pkg/operator/kms/postcheck.gopkg/operator/starter.go
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | ||
| if err != nil { | ||
| return nil |
There was a problem hiding this comment.
Fail closed when feature-gate state can't be read.
Line 50 returns nil on CurrentFeatureGates() error, which disables this guard exactly when the gate state is unknown and allows the revision to proceed unvalidated.
Suggested fix
featureGates, err := featureGateAccessor.CurrentFeatureGates()
if err != nil {
- return nil
+ return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | |
| if err != nil { | |
| return nil | |
| featureGates, err := featureGateAccessor.CurrentFeatureGates() | |
| if err != nil { | |
| return fmt.Errorf("kms revision post-check: failed to read feature gates: %w", err) | |
| } |
🧰 Tools
🪛 golangci-lint (2.11.4)
[error] 50-50: error is not nil (line 48) but it returns nil
(nilerr)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/postcheck.go` around lines 48 - 50, The code calls
featureGateAccessor.CurrentFeatureGates() and currently returns nil on error,
which disables the guard; change this to fail closed by returning an error (wrap
or return the received err) from the surrounding function (the post-check
handler that uses featureGates) instead of nil so the revision is rejected when
feature gate state cannot be read; update the error path where
featureGateAccessor.CurrentFeatureGates() is invoked to propagate a descriptive
error rather than returning nil.
| encryptionSecret, err := kubeClient.CoreV1().Secrets(operatorclient.TargetNamespace).Get(ctx, "encryption-config"+revSuffix, metav1.GetOptions{}) | ||
| if apierrors.IsNotFound(err) { | ||
| return nil | ||
| } | ||
| if err != nil { | ||
| return fmt.Errorf("kms revision post-check: failed to get encryption-config for revision %d: %w", revision, err) | ||
| } | ||
|
|
||
| podCM, err := kubeClient.CoreV1().ConfigMaps(operatorclient.TargetNamespace).Get(ctx, "kube-apiserver-pod"+revSuffix, metav1.GetOptions{}) | ||
| if apierrors.IsNotFound(err) { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Don't treat missing revision resources as a successful check.
Because encryption-config is optional in pkg/operator/starter.go Lines 651-653, the IsNotFound path on Line 59 is a real state here. Returning nil means a revision can still pass when kube-apiserver-pod-<rev> already has kms-plugin but encryption-config-<rev> is absent—the exact mismatch this hook is supposed to block.
A safer split is: treat a missing encryption-config-<rev> as kmsConfigured=false and continue the comparison, but treat a missing kube-apiserver-pod-<rev> as an error because the revision is incomplete.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/postcheck.go` around lines 58 - 69, The post-check
incorrectly treats a missing encryption-config-<rev> as a successful check;
update the logic in the kms revision post-check (around the encryptionSecret
variable and the podCM retrieval) so that when
kubeClient.CoreV1().Secrets(...).Get returns IsNotFound you do NOT return nil
but instead mark kmsConfigured=false and continue to compare with the
kube-apiserver-pod config; however when kubeClient.CoreV1().ConfigMaps(...).Get
for "kube-apiserver-pod"+revSuffix returns IsNotFound you should still treat
that as an error (returning the error) because the revision is incomplete.
Ensure the code paths reference encryptionSecret, podCM, and revSuffix and
preserve existing error wrapping for non-NotFound errors.
| var pod corev1.Pod | ||
| if err := json.Unmarshal([]byte(podYAML), &pod); err != nil { | ||
| return false | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Go's encoding/json.Unmarshal accept YAML input, or only JSON?
💡 Result:
No, Go's encoding/json.Unmarshal accepts only JSON input. It explicitly parses "JSON-encoded data" and returns a SyntaxError for invalid JSON syntax, such as YAML-specific features like colons without quotes or indentation-based structures.
Citations:
- 1: https://pkg.go.dev/encoding/json
- 2: https://pkg.go.dev/encoding/json@go1.25.5
- 3: https://pkg.go.dev/encoding/json/
🏁 Script executed:
cat -n pkg/operator/kms/postcheck.go | head -120 | tail -50Repository: openshift/cluster-kube-apiserver-operator
Length of output: 2114
🏁 Script executed:
cat -n pkg/operator/kms/postcheck.go | head -100Repository: openshift/cluster-kube-apiserver-operator
Length of output: 3983
🏁 Script executed:
head -30 pkg/operator/kms/postcheck.goRepository: openshift/cluster-kube-apiserver-operator
Length of output: 964
Use yaml.Unmarshal to parse the pod manifest from ConfigMap.
Line 92 uses json.Unmarshal on podYAML retrieved from a ConfigMap with key "pod.yaml" (line 86). The encoding/json package only accepts JSON-formatted data and will fail when parsing YAML. This causes podHasKMSSidecar() to always return false, misclassifying KMS-enabled revisions and blocking all rollouts when KMS encryption is configured.
Suggested fix
- if err := json.Unmarshal([]byte(podYAML), &pod); err != nil {
+ if err := yaml.Unmarshal([]byte(podYAML), &pod); err != nil {
return false
}Also update the imports from "encoding/json" to include a YAML unmarshaller such as sigs.k8s.io/yaml.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/kms/postcheck.go` around lines 91 - 94, podHasKMSSidecar is
unmarshalling YAML manifest using json.Unmarshal which fails; replace the json
unmarshal with a YAML unmarshal (e.g., use sigs.k8s.io/yaml.Unmarshal) when
decoding podYAML into the corev1.Pod, and update the imports to remove
"encoding/json" and add the YAML package import so the ConfigMap key "pod.yaml"
is parsed correctly.
|
@bertinatto: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/close |
|
@bertinatto: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Add a revision controller precondition that prevents rolling out a new revision when there is a mismatch between the encryption configuration and the pod spec.
CC @ardaguclu @p0lyn0mial
Summary by CodeRabbit
New Features
Chores