test/encryption: add InvalidImageRecoveryScenario for KMS plugin imag…#2243
test/encryption: add InvalidImageRecoveryScenario for KMS plugin imag…#2243gangwgr wants to merge 4 commits into
Conversation
…e fix Adds a test scenario that verifies the cluster can recover when a KMS encryption config is set with an invalid plugin image: 1. Enable KMS with invalid (non-existent) plugin image 2. Attempt to switch to aescbc (cluster remains stuck on KMS) 3. Operator reports degraded 4. Fix the config with valid KMS plugin image 5. Operator recovers and encryption works normally
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughControllers add safe in-place KMS plugin-config updates and propagate those updates into the unrevisioned encryption-config during non-convergence; tests (unit and e2e) cover in-place updates, recovery from invalid KMS images, and ensure no input mutation. ChangesEncryption KMS in-place updates
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 9 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/library/encryption/scenarios.go`:
- Around line 323-327: Wrap the cluster-scoped APIServerConfig update in a retry
loop: use retry.RetryOnConflict to perform the get-modify-update sequence for
cs.ApiServerConfig (the existing cs.ApiServerConfig.Get(ctx, "cluster", ...) and
cs.ApiServerConfig.Update(ctx, apiServer, ...)) so that transient API conflicts
are retried; in the closure fetch the current APIServerConfig, set
apiServer.Spec.Encryption = scenario.InvalidImageProvider.APIServerEncryption,
attempt the Update and return the error to be retried on conflict.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f65ffc6f-c7e6-43a1-bef2-7b566ab90b98
📒 Files selected for processing (1)
test/library/encryption/scenarios.go
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gangwgr The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
a11498d to
78d95d2
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/operator/encryption/controllers/key_controller.go`:
- Around line 327-340: The current update only rewrites the KMS plugin config
blob and leaves any carried AppRole secret data stale; when handling the
keySecret fetched via secrets.FromKeyState (keySecret) and its Secret object s
(from c.secretClient.Secrets(...).Get) you must also refresh/prune the carried
AppRole secret entries whose keys match the "...-kms-plugin-secret-*" pattern:
retrieve the referenced Vault.Authentication.AppRole.Secret.Name, fetch that
Secret payload, copy its role-id/secret-id (or remove old entries if the
reference changed) into s.Data under the appropriate kms-plugin-secret-* keys
before calling c.secretClient.Secrets(s.Namespace).Update, and ensure
encoding.EncodeKMSPluginConfig is still used for the config key; add a
regression test that performs an in-place swap of
Vault.Authentication.AppRole.Secret.Name and asserts the updated key secret
contains the new role-id/secret-id and no stale kms-plugin-secret-* entries.
In `@pkg/operator/encryption/controllers/state_controller.go`:
- Around line 214-219: The call to encryptiondata.ToEncryptionState currently
discards its error; change the handling so you capture its returned error and
return it if non-nil (e.g., check err from
encryptiondata.ToEncryptionState(existingConfig, keySecrets) and return that
error), rather than proceeding to call encryptiondata.FromEncryptionState when
ToEncryptionState failed or silently no-op when enrichedState is nil; keep the
existing enrichedState nil check but ensure ToEncryptionState errors are
propagated up from this function.
In `@test/library/encryption/kms/vault.go`:
- Around line 109-110: The helper ensureDefaultVaultAppRoleSecret currently
copies creds.Data["role-id"] and creds.Data["secret-id"] directly into
required.Data which can write empty values; update
ensureDefaultVaultAppRoleSecret to validate creds.Data contains non-empty
"role-id" and "secret-id" before assigning to required.Data, using require or
test assertions that fail fast with a clear message referencing the
vault-kms/vault-credentials secret and the missing/empty field names so the test
setup errors explicitly if those keys are absent or empty.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e8b83ad6-657c-4d82-ba42-cd9e3042a7ea
📒 Files selected for processing (7)
pkg/operator/encryption/controllers/key_controller.gopkg/operator/encryption/controllers/key_controller_test.gopkg/operator/encryption/controllers/state_controller.gopkg/operator/encryption/controllers/state_controller_test.gopkg/operator/encryption/encryptiondata/config_test.gotest/e2e-encryption/encryption_test.gotest/library/encryption/kms/vault.go
| keySecret, err := secrets.FromKeyState(c.instanceName, latest) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| s, err := c.secretClient.Secrets(keySecret.Namespace).Get(ctx, keySecret.Name, metav1.GetOptions{}) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get key secret %s/%s: %v", keySecret.Namespace, keySecret.Name, err) | ||
| } | ||
| pluginData, err := encoding.EncodeKMSPluginConfig(apiServerEncryption.KMS) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to encode KMS plugin config: %v", err) | ||
| } | ||
| s.Data["encryption.apiserver.operator.openshift.io-kms-plugin-config"] = pluginData | ||
| _, updateErr := c.secretClient.Secrets(s.Namespace).Update(ctx, s, metav1.UpdateOptions{}) |
There was a problem hiding this comment.
Refresh the carried AppRole secret data during in-place auth updates.
generateKeySecret() copies the referenced AppRole payload into the key secret, but this path only rewrites ...-kms-plugin-config. If the in-place change is a new Vault.Authentication.AppRole.Secret.Name, the latest key keeps the old role-id/secret-id entries, so the rollout still comes up with stale credentials. Please update/prune the ...-kms-plugin-secret-* data alongside the plugin config here, and add a regression case that swaps the AppRole secret reference.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/operator/encryption/controllers/key_controller.go` around lines 327 -
340, The current update only rewrites the KMS plugin config blob and leaves any
carried AppRole secret data stale; when handling the keySecret fetched via
secrets.FromKeyState (keySecret) and its Secret object s (from
c.secretClient.Secrets(...).Get) you must also refresh/prune the carried AppRole
secret entries whose keys match the "...-kms-plugin-secret-*" pattern: retrieve
the referenced Vault.Authentication.AppRole.Secret.Name, fetch that Secret
payload, copy its role-id/secret-id (or remove old entries if the reference
changed) into s.Data under the appropriate kms-plugin-secret-* keys before
calling c.secretClient.Secrets(s.Namespace).Update, and ensure
encoding.EncodeKMSPluginConfig is still used for the config key; add a
regression test that performs an in-place swap of
Vault.Authentication.AppRole.Secret.Name and asserts the updated key secret
contains the new role-id/secret-id and no stale kms-plugin-secret-* entries.
| enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig, keySecrets) | ||
| if enrichedState == nil { | ||
| return nil | ||
| } | ||
|
|
||
| rebuiltConfig, err := encryptiondata.FromEncryptionState(enrichedState) |
There was a problem hiding this comment.
Don’t drop ToEncryptionState() errors here.
If the unrevisioned encryption-config drifts from the backing key secrets, this helper currently ignores that failure and can silently no-op instead of surfacing a degraded condition. Returning the error here avoids rebuilding from a partial state.
Suggested fix
- enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig, keySecrets)
+ enrichedState, err := encryptiondata.ToEncryptionState(existingConfig, keySecrets)
+ if err != nil {
+ return err
+ }
if enrichedState == nil {
return nil
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig, keySecrets) | |
| if enrichedState == nil { | |
| return nil | |
| } | |
| rebuiltConfig, err := encryptiondata.FromEncryptionState(enrichedState) | |
| enrichedState, err := encryptiondata.ToEncryptionState(existingConfig, keySecrets) | |
| if err != nil { | |
| return err | |
| } | |
| if enrichedState == nil { | |
| return nil | |
| } | |
| rebuiltConfig, err := encryptiondata.FromEncryptionState(enrichedState) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/operator/encryption/controllers/state_controller.go` around lines 214 -
219, The call to encryptiondata.ToEncryptionState currently discards its error;
change the handling so you capture its returned error and return it if non-nil
(e.g., check err from encryptiondata.ToEncryptionState(existingConfig,
keySecrets) and return that error), rather than proceeding to call
encryptiondata.FromEncryptionState when ToEncryptionState failed or silently
no-op when enrichedState is nil; keep the existing enrichedState nil check but
ensure ToEncryptionState errors are propagated up from this function.
| "role-id": creds.Data["role-id"], | ||
| "secret-id": creds.Data["secret-id"], |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n -C2 --type=go --type=yaml --type=sh \
'vault-credentials|role-id|secret-id|roleID|secretID'Repository: openshift/library-go
Length of output: 37480
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the exact code around the changed lines
sed -n '70,140p' test/library/encryption/kms/vault.go
# Find callers/usage of ensureDefaultVaultAppRoleSecret
rg -n "ensureDefaultVaultAppRoleSecret" -S test/library/encryption/kms/vault.go test/library/encryption -S
# Check if any other code validates required vault-credentials keys earlier
rg -n "defaultVaultCredentialsSecret|vault-credentials" -S test/library/encryption/kms test/library/encryptionRepository: openshift/library-go
Length of output: 4965
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate ApplySecret implementation
rg -n --type=go "func ApplySecret\\(" -S . | head -n 50
# Print the implementation area
rg -n --type=go "func ApplySecret\\(" -S . | head -n 1 | awk -F: '{print $1}' | xargs -I{} true
# (We’ll directly cat the likely file once we know it from the first search.)
# Fetch the full file for the first match only (avoid huge output)
file="$(rg -n --type=go "func ApplySecret\\(" -S . | head -n 1 | cut -d: -f1)"
echo "ApplySecret file: $file"
sed -n '1,220p' "$file"
# Find where AppRole secret keys are validated (role-id/secret-id)
rg -n --type=go "missing required key \"secret-id\"|missing required key \"role-id\"|role-id\" and \"secret-id\"|required key.*role-id|required key.*secret-id" -S pkg | head -n 50
# Locate the specific validation logic for Vault AppRole secret reads
rg -n --type=go "vault-approle-secret|VaultAppRoleAuthentication|AppRole secret must contain" -S pkg/operator/encryption | head -n 50Repository: openshift/library-go
Length of output: 15601
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show ApplySecretImproved implementation (secret-specific path)
rg -n "func ApplySecretImproved" pkg/operator/resource/resourceapply/core.go
sed -n '220,420p' pkg/operator/resource/resourceapply/core.go
# Inspect operator-side validation of Vault AppRole secret keys
sed -n '470,580p' pkg/operator/encryption/controllers/key_controller.go
# Also inspect the associated test error message path for missing keys
rg -n "missing required key \"secret-id\"|missing required key \"role-id\"" pkg/operator/encryption/controllers/key_controller_test.go
sed -n '560,720p' pkg/operator/encryption/controllers/key_controller_test.goRepository: openshift/library-go
Length of output: 20563
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the exact validation code that emits "missing required key"
rg -n --type=go "missing required key \"secret-id\"|missing required key \"role-id\"|missing required key" pkg/operator/encryption/controllers/key_controller.go
# Print the surrounding block(s) for the validation logic (small slices)
rg -n --type=go -C5 "missing required key" pkg/operator/encryption/controllers/key_controller.goRepository: openshift/library-go
Length of output: 635
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the PluginSecretData.Set implementation
rg -n --type=go "type .*PluginSecretData|PluginSecretData.*Set\\(" pkg/operator/encryption | head -n 50
# Find the exact Set receiver method(s)
rg -n --type=go "func \\(.*\\) Set\\(.*PluginSecretData" pkg/operator/encryption | head -n 50
rg -n --type=go "func \\(.*\\) Set\\(.*secretName.*key.*\\)" pkg/operator/encryption | head -n 50
# Also locate where KMS.PluginSecretData.Set is called
rg -n --type=go "PluginSecretData\\.Set\\(" pkg/operator/encryption/controllers/key_controller.go pkg/operator/encryption -SRepository: openshift/library-go
Length of output: 198
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --type=go "type .*PluginSecretData" -S . || true
rg -n --type=go "PluginSecretData" -S pkg/operator/encryption | head -n 50
rg -n --type=go "func \\(.*\\) Set\\(" -S pkg/operator/encryption | rg -n "PluginSecretData" | head -n 50
rg -n --type=go "PluginSecretData\\.Set\\(" -S . || trueRepository: openshift/library-go
Length of output: 1562
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,220p' pkg/operator/encryption/state/types.go
sed -n '1,260p' pkg/operator/encryption/secrets/secrets.go
# Locate the Set(...) and any helpers SetFromRawKey / SetFromRawKey implementations
rg -n --type=go "func \\(.*\\) Set\\(" pkg/operator/encryption/state pkg/operator/encryption/secrets pkg/operator/encryption | head -n 50
rg -n --type=go "SetFromRawKey" pkg/operator/encryption | head -n 50Repository: openshift/library-go
Length of output: 14312
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '340,460p' pkg/operator/encryption/controllers/key_controller.goRepository: openshift/library-go
Length of output: 4904
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find call sites of generateKeySecret and surrounding error handling
rg -n --type=go "generateKeySecret\\(" pkg/operator/encryption/controllers/key_controller.go
# Show the blocks around each call site (small ranges)
rg -n --type=go -C5 "generateKeySecret\\(" pkg/operator/encryption/controllers/key_controller.goRepository: openshift/library-go
Length of output: 1450
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --type=go "secretName, dataKey, and value must not be empty" -S . || true
rg -n --type=go "missing required key \"role-id\"|missing required key \"secret-id\"" -S pkg/operator/encryption/controllers
rg -n --type=go "Set\\(secretName, dataKey, value" -S pkg/operator/encryption | head -n 50Repository: openshift/library-go
Length of output: 568
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,120p' test/library/encryption/kms/vault.go
sed -n '120,220p' test/library/encryption/kms/vault.go
# Find the test helper that references ensureDefaultVaultAppRoleSecret setup
rg -n "ensureDefaultVaultAppRoleSecret" test/library/encryption/kms/vault.goRepository: openshift/library-go
Length of output: 7310
Fail fast when vault-credentials is missing (or empty) Vault AppRole keys
ensureDefaultVaultAppRoleSecret copies creds.Data["role-id"] / creds.Data["secret-id"] directly (test/library/encryption/kms/vault.go lines 109-110). If the CI-provided vault-credentials secret lacks those fields, the helper can write empty/nil bytes into the target secret; the operator then fails during key creation when KMSSecretData.Set rejects empty values (failed to create key: ... must not be empty), producing a less direct contract-mismatch signal. Add explicit require checks for key presence and non-empty values when building required.Data, so the setup fails with a clear message about the expected role-id/secret-id fields in vault-kms/vault-credentials.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/library/encryption/kms/vault.go` around lines 109 - 110, The helper
ensureDefaultVaultAppRoleSecret currently copies creds.Data["role-id"] and
creds.Data["secret-id"] directly into required.Data which can write empty
values; update ensureDefaultVaultAppRoleSecret to validate creds.Data contains
non-empty "role-id" and "secret-id" before assigning to required.Data, using
require or test assertions that fail fast with a clear message referencing the
vault-kms/vault-credentials secret and the missing/empty field names so the test
setup errors explicitly if those keys are absent or empty.
| // 3. See degraded | ||
| // 4. Update with KMS with valid image | ||
| // 5. See everything works | ||
| type InvalidImageRecoveryScenario struct { |
There was a problem hiding this comment.
I think this scenario nicely exercises the in-place field update case which will be added by #2241.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
pkg/operator/encryption/controllers/state_controller.go (1)
214-215:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPropagate
ToEncryptionStatefailures instead of dropping them.Line 214 discards the returned error. A failed enrichment can silently no-op and hide state drift/corruption during non-convergence.
Suggested fix
- enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig, keySecrets) + enrichedState, err := encryptiondata.ToEncryptionState(existingConfig, keySecrets) + if err != nil { + return err + } if enrichedState == nil { return nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/encryption/controllers/state_controller.go` around lines 214 - 215, The call to ToEncryptionState is ignoring its returned error (enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig, keySecrets)), which can hide failures; change this to capture the error (e.g., enrichedState, err := encryptiondata.ToEncryptionState(...)), check for non-nil err and propagate it instead of continuing with a nil enrichedState—return the error from the current reconcile/state handling function or requeue with the error so failures surface and are not silently dropped. Ensure you reference ToEncryptionState, enrichedState, existingConfig, and keySecrets when making the change.test/library/encryption/kms/vault.go (1)
109-110:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winValidate AppRole credential fields before copying them into the target secret.
Lines 109-110 copy map lookups directly; missing/empty
role-idorsecret-idvalues fail later with less clear errors. Fail fast here with explicit assertions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/library/encryption/kms/vault.go` around lines 109 - 110, Before copying creds.Data["role-id"] and creds.Data["secret-id"] into the target secret map, validate that creds.Data contains non-empty string values for the "role-id" and "secret-id" keys; if either is missing or empty, return a clear error (or panic/assert) with context (e.g., "missing AppRole role-id" / "missing AppRole secret-id"). Update the code that builds the map (the lines assigning "role-id" and "secret-id") to first check creds and creds.Data types/values and fail fast with descriptive messages instead of blindly copying potentially nil/empty values.pkg/operator/encryption/controllers/key_controller.go (1)
327-340:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRefresh carried
kms-plugin-secret-*entries when applying in-place AppRole config changes.Line 339 only updates
...-kms-plugin-config. If the AppRole secret reference changes, the key secret can keep stale carriedrole-id/secret-iddata from the old secret. Refresh/prune thekms-plugin-secret-*fields in the same update path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/encryption/controllers/key_controller.go` around lines 327 - 340, The update currently only writes the kms-plugin-config field and can leave stale carried AppRole secret entries; before calling c.secretClient.Secrets(s.Namespace).Update(ctx, s, metav1.UpdateOptions{}), remove or refresh any keys in s.Data whose names start with "encryption.apiserver.operator.openshift.io-kms-plugin-secret-" (e.g., iterate s.Data map and delete entries with that prefix) so carried role-id/secret-id from old AppRole refs are pruned when keySecret/AppRole changes; keep existing logic that sets "encryption.apiserver.operator.openshift.io-kms-plugin-config" (variables: keySecret, s, pluginData, and the Update call remain) and then call Update.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/operator/encryption/encryptiondata/config_test.go`:
- Line 475: The test calls encryptiondata.ToEncryptionState(input, keySecrets)
but ignores its returned error, so update the test in config_test.go to capture
both return values (e.g., state, err) and assert the call succeeded;
specifically call encryptiondata.ToEncryptionState(input, keySecrets) -> (state,
err) and then assert err == nil (using the test helper used in the repo, e.g.,
require.NoError(t, err) or t.Fatalf on error) before continuing the immutability
checks on state and input.
In `@test/library/encryption/scenarios.go`:
- Around line 329-340: After calling scenario.WaitForDegraded(ctx, e) add an
explicit assertion that the cluster's encryption mode remained KMS: re-fetch the
APIServer via cs.ApiServerConfig.Get(ctx, "cluster", metav1.GetOptions{}), then
assert apiServer.Spec.Encryption.Type equals configv1.EncryptionTypeKMS using
require/expect (e.g., require.Equal) so the test fails if the attempted switch
to aescbc actually applied; reference cs.ApiServerConfig.Get,
apiServer.Spec.Encryption.Type, and scenario.WaitForDegraded when locating where
to add the check.
---
Duplicate comments:
In `@pkg/operator/encryption/controllers/key_controller.go`:
- Around line 327-340: The update currently only writes the kms-plugin-config
field and can leave stale carried AppRole secret entries; before calling
c.secretClient.Secrets(s.Namespace).Update(ctx, s, metav1.UpdateOptions{}),
remove or refresh any keys in s.Data whose names start with
"encryption.apiserver.operator.openshift.io-kms-plugin-secret-" (e.g., iterate
s.Data map and delete entries with that prefix) so carried role-id/secret-id
from old AppRole refs are pruned when keySecret/AppRole changes; keep existing
logic that sets "encryption.apiserver.operator.openshift.io-kms-plugin-config"
(variables: keySecret, s, pluginData, and the Update call remain) and then call
Update.
In `@pkg/operator/encryption/controllers/state_controller.go`:
- Around line 214-215: The call to ToEncryptionState is ignoring its returned
error (enrichedState, _ := encryptiondata.ToEncryptionState(existingConfig,
keySecrets)), which can hide failures; change this to capture the error (e.g.,
enrichedState, err := encryptiondata.ToEncryptionState(...)), check for non-nil
err and propagate it instead of continuing with a nil enrichedState—return the
error from the current reconcile/state handling function or requeue with the
error so failures surface and are not silently dropped. Ensure you reference
ToEncryptionState, enrichedState, existingConfig, and keySecrets when making the
change.
In `@test/library/encryption/kms/vault.go`:
- Around line 109-110: Before copying creds.Data["role-id"] and
creds.Data["secret-id"] into the target secret map, validate that creds.Data
contains non-empty string values for the "role-id" and "secret-id" keys; if
either is missing or empty, return a clear error (or panic/assert) with context
(e.g., "missing AppRole role-id" / "missing AppRole secret-id"). Update the code
that builds the map (the lines assigning "role-id" and "secret-id") to first
check creds and creds.Data types/values and fail fast with descriptive messages
instead of blindly copying potentially nil/empty values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bf88526e-d2a2-4471-980d-e267facbb3fb
📒 Files selected for processing (8)
pkg/operator/encryption/controllers/key_controller.gopkg/operator/encryption/controllers/key_controller_test.gopkg/operator/encryption/controllers/state_controller.gopkg/operator/encryption/controllers/state_controller_test.gopkg/operator/encryption/encryptiondata/config_test.gotest/e2e-encryption/encryption_test.gotest/library/encryption/kms/vault.gotest/library/encryption/scenarios.go
| encryptiontesting.CreateEncryptionKeySecretWithRawKey("test", []schema.GroupResource{{Resource: "secrets"}}, 1, []byte("61def964fb967f5d7c44a2af8dab6865")), | ||
| } | ||
|
|
||
| encryptiondata.ToEncryptionState(input, keySecrets) |
There was a problem hiding this comment.
Assert ToEncryptionState succeeds in this immutability test.
Line 475 ignores the error return, so the test can pass even when conversion fails.
Suggested fix
- encryptiondata.ToEncryptionState(input, keySecrets)
+ _, err := encryptiondata.ToEncryptionState(input, keySecrets)
+ if err != nil {
+ t.Fatalf("ToEncryptionState returned error: %v", err)
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| encryptiondata.ToEncryptionState(input, keySecrets) | |
| _, err := encryptiondata.ToEncryptionState(input, keySecrets) | |
| if err != nil { | |
| t.Fatalf("ToEncryptionState returned error: %v", err) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/operator/encryption/encryptiondata/config_test.go` at line 475, The test
calls encryptiondata.ToEncryptionState(input, keySecrets) but ignores its
returned error, so update the test in config_test.go to capture both return
values (e.g., state, err) and assert the call succeeded; specifically call
encryptiondata.ToEncryptionState(input, keySecrets) -> (state, err) and then
assert err == nil (using the test helper used in the repo, e.g.,
require.NoError(t, err) or t.Fatalf on error) before continuing the immutability
checks on state and input.
| // step 2: attempt to switch to aescbc — cluster should remain stuck on KMS | ||
| t.Log("Attempting to switch to aescbc (cluster should remain stuck on KMS)") | ||
| apiServer, err = cs.ApiServerConfig.Get(ctx, "cluster", metav1.GetOptions{}) | ||
| require.NoError(t, err) | ||
| apiServer.Spec.Encryption = configv1.APIServerEncryption{Type: configv1.EncryptionTypeAESCBC} | ||
| _, err = cs.ApiServerConfig.Update(ctx, apiServer, metav1.UpdateOptions{}) | ||
| require.NoError(t, err) | ||
|
|
||
| // step 3: wait for degraded | ||
| t.Log("Waiting for operator to report degraded status") | ||
| scenario.WaitForDegraded(ctx, e) | ||
|
|
There was a problem hiding this comment.
Assert the mode remains KMS after the failed switch attempt.
After Line 334, the test only waits for degraded status. It does not explicitly verify the cluster stayed on KMS, so step 2 can pass without validating the intended behavior.
Proposed fix
func TestEncryptionInvalidImageRecovery(ctx context.Context, t testing.TB, scenario InvalidImageRecoveryScenario) {
e := NewE(t, PrintEventsOnFailure(scenario.OperatorNamespace))
+ require.NotNil(t, scenario.AssertFunc, "AssertFunc must not be nil")
require.NotNil(t, scenario.WaitForDegraded, "WaitForDegraded must not be nil")
require.NotNil(t, scenario.WaitForRecovery, "WaitForRecovery must not be nil")
@@
// step 3: wait for degraded
t.Log("Waiting for operator to report degraded status")
scenario.WaitForDegraded(ctx, e)
+ scenario.AssertFunc(e, cs, configv1.EncryptionTypeKMS, scenario.Namespace, scenario.LabelSelector)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/library/encryption/scenarios.go` around lines 329 - 340, After calling
scenario.WaitForDegraded(ctx, e) add an explicit assertion that the cluster's
encryption mode remained KMS: re-fetch the APIServer via
cs.ApiServerConfig.Get(ctx, "cluster", metav1.GetOptions{}), then assert
apiServer.Spec.Encryption.Type equals configv1.EncryptionTypeKMS using
require/expect (e.g., require.Equal) so the test fails if the attempted switch
to aescbc actually applied; reference cs.ApiServerConfig.Get,
apiServer.Spec.Encryption.Type, and scenario.WaitForDegraded when locating where
to add the check.
When a KMS sidecar revision is stuck (e.g. wrong container image), the cluster-admin needs to update APIServer.spec.encryption.kms with corrected values. Previously, all encryption controllers blocked on revision convergence, so the fix could never be applied. This change allows the key_controller to update the latest key secret's KMS plugin config in-place for non-migration fields (image, TLS, AppRole) regardless of convergence state. If there is a change in migration-triggering fields, we should wait until convergence happens. State controller carries in-place fields updates only when there is no change in the encryption-config data key.
78d95d2 to
5374add
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/operator/encryption/encryptiondata/config_test.go (1)
475-475:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert
ToEncryptionStatesucceeds before checking immutability.Line 475 ignores the returned error, so this test can pass even when conversion fails. Capture and assert the error first.
Proposed fix
- encryptiondata.ToEncryptionState(input, keySecrets) + _, err := encryptiondata.ToEncryptionState(input, keySecrets) + if err != nil { + t.Fatalf("ToEncryptionState returned error: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/encryption/encryptiondata/config_test.go` at line 475, The call to encryptiondata.ToEncryptionState(input, keySecrets) is ignoring its returned error; change the test to capture both (e.g., state, err := encryptiondata.ToEncryptionState(input, keySecrets)), assert the call succeeded (use the test helper in this file, e.g., require.NoError(t, err) or t.Fatalf on err), and only then continue with the immutability/assertions against the returned state variable; reference ToEncryptionState and the local variables you add (state, err) to locate the change.
🧹 Nitpick comments (1)
pkg/operator/encryption/controllers/state_controller_test.go (1)
1223-1241: ⚡ Quick winHarden structural assertions to prevent false positives in propagation test.
This check only validates slice lengths, so it won’t catch structural regressions (provider order/type, resource name, KMS endpoint/name). Assert deep equality for
actualConfig.Encryptionagainst the expected pre-update structure, and also assert the updated secret isopenshift-config-managed/encryption-config-kms.Suggested test assertion tightening
validateFunc: func(ts *testing.T, actions []clientgotesting.Action, destName string, expectedEncryptionCfg *encryptiondata.Config) { + foundUpdate := false for _, action := range actions { if action.Matches("update", "secrets") { updateAction := action.(clientgotesting.UpdateAction) actualSecret := updateAction.GetObject().(*corev1.Secret) + if actualSecret.Namespace != "openshift-config-managed" || actualSecret.Name != "encryption-config-kms" { + continue + } + foundUpdate = true actualConfig, err := encryptiondata.FromSecret(actualSecret) if err != nil { ts.Fatalf("failed to parse updated encryption config: %v", err) } // Verify plugin config was updated updatedPlugin := encryptiontesting.DefaultKMSPluginConfig updatedPlugin.Vault.KMSPluginImage = "registry.example.com/kms-plugin@sha256:1111111111111111111111111111111111111111111111111111111111111111" if !equality.Semantic.DeepEqual(actualConfig.KMSPlugins["1"], updatedPlugin) { ts.Errorf("plugin config not propagated: got %+v", actualConfig.KMSPlugins["1"]) } - // Verify EncryptionConfiguration (structural part) is unchanged - if len(actualConfig.Encryption.Resources) != 1 || len(actualConfig.Encryption.Resources[0].Providers) != 2 { - ts.Errorf("EncryptionConfiguration structure changed unexpectedly") - } + // Verify EncryptionConfiguration (structural part) is unchanged + expectedEncryption := &apiserverconfigv1.EncryptionConfiguration{ + Resources: []apiserverconfigv1.ResourceConfiguration{{ + Resources: []string{"secrets"}, + Providers: []apiserverconfigv1.ProviderConfiguration{{ + KMS: &apiserverconfigv1.KMSConfiguration{ + APIVersion: "v2", + Name: "1_secrets", + Endpoint: "unix:///var/run/kmsplugin/kms-1.sock", + Timeout: &metav1.Duration{Duration: 10 * time.Second}, + }, + }, { + Identity: &apiserverconfigv1.IdentityConfiguration{}, + }}, + }}, + } + if !equality.Semantic.DeepEqual(actualConfig.Encryption, expectedEncryption) { + ts.Errorf("EncryptionConfiguration changed unexpectedly: %s", diff.ObjectReflectDiff(expectedEncryption, actualConfig.Encryption)) + } return } } - ts.Error("expected an update action for the encryption-config secret") + if !foundUpdate { + ts.Error("expected an update action for openshift-config-managed/encryption-config-kms") + } },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/encryption/controllers/state_controller_test.go` around lines 1223 - 1241, The test's structural assertion for actualConfig.Encryption is too weak (only checks slice lengths) and may miss regressions; update the validateFunc to assert deep equality of actualConfig.Encryption against the expected pre-update Encryption structure (the exact expected Encryption object used before the plugin change) rather than just lengths, and also assert the updated secret's name equals "openshift-config-managed/encryption-config-kms"; reference actualConfig.Encryption for the deep-equal check and keep the existing plugin propagation assertion that compares actualConfig.KMSPlugins["1"] to encryptiontesting.DefaultKMSPluginConfig with the modified Vault.KMSPluginImage.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@pkg/operator/encryption/encryptiondata/config_test.go`:
- Line 475: The call to encryptiondata.ToEncryptionState(input, keySecrets) is
ignoring its returned error; change the test to capture both (e.g., state, err
:= encryptiondata.ToEncryptionState(input, keySecrets)), assert the call
succeeded (use the test helper in this file, e.g., require.NoError(t, err) or
t.Fatalf on err), and only then continue with the immutability/assertions
against the returned state variable; reference ToEncryptionState and the local
variables you add (state, err) to locate the change.
---
Nitpick comments:
In `@pkg/operator/encryption/controllers/state_controller_test.go`:
- Around line 1223-1241: The test's structural assertion for
actualConfig.Encryption is too weak (only checks slice lengths) and may miss
regressions; update the validateFunc to assert deep equality of
actualConfig.Encryption against the expected pre-update Encryption structure
(the exact expected Encryption object used before the plugin change) rather than
just lengths, and also assert the updated secret's name equals
"openshift-config-managed/encryption-config-kms"; reference
actualConfig.Encryption for the deep-equal check and keep the existing plugin
propagation assertion that compares actualConfig.KMSPlugins["1"] to
encryptiontesting.DefaultKMSPluginConfig with the modified Vault.KMSPluginImage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 65696f7c-4824-4db8-9a63-8b854698322c
📒 Files selected for processing (9)
pkg/operator/encryption/controllers/key_controller.gopkg/operator/encryption/controllers/key_controller_test.gopkg/operator/encryption/controllers/state_controller.gopkg/operator/encryption/controllers/state_controller_test.gopkg/operator/encryption/encryptiondata/config_test.gotest/e2e-encryption/encryption_test.gotest/library/encryption/helpers.gotest/library/encryption/kms/vault.gotest/library/encryption/scenarios.go
🚧 Files skipped from review as they are similar to previous changes (5)
- test/library/encryption/scenarios.go
- pkg/operator/encryption/controllers/state_controller.go
- test/e2e-encryption/encryption_test.go
- pkg/operator/encryption/controllers/key_controller.go
- pkg/operator/encryption/controllers/key_controller_test.go
1f8a774 to
32381d8
Compare
The invalid image uses a properly formatted OCI reference with sha256 digest that passes API validation but points to a non-existent image, causing ImagePullBackOff and operator degradation.
32381d8 to
d94ed61
Compare
…e fix
Adds a test scenario that verifies the cluster can recover when a KMS encryption config is set with an invalid plugin image:
Summary by CodeRabbit
New Features
Tests
Chores