diff --git a/enhancements/storage/csi-secrets-store-rotation-and-wif.md b/enhancements/storage/csi-secrets-store-rotation-and-wif.md
new file mode 100644
index 0000000000..04d29d20b0
--- /dev/null
+++ b/enhancements/storage/csi-secrets-store-rotation-and-wif.md
@@ -0,0 +1,527 @@
+---
+title: csi-secrets-store-rotation-and-wif
+authors:
+  - "@chiragkyal"
+reviewers:
+  - "@mytreya-rh"
+  - "@dobsonj"
+  - "@JoelSpeed"
+api-approvers:
+  - "@JoelSpeed"
+approvers:
+  - "@mytreya-rh"
+creation-date: 2026-05-15
+last-updated: 2026-05-17
+tracking-link:
+  - https://redhat.atlassian.net/browse/SSCSI-254
+see-also:
+  - "/enhancements/storage/csi-secrets-store.md"
+---
+
+# Configurable Secret Rotation and Workload Identity Federation for the Secrets Store CSI Driver
+
+## Summary
+
+This enhancement will extend the OpenShift Secrets Store CSI Driver Operator to
+allow cluster administrators to configure secret rotation behavior (enable/disable,
+polling interval) and workload identity federation (WIF) token audiences through
+the `ClusterCSIDriver` custom resource. These settings will be dynamically
+propagated by the operator to both the `storage.k8s.io/v1` `CSIDriver` object and
+the driver's node DaemonSet, replacing the current hardcoded defaults and enabling
+multi-cloud WIF scenarios without requiring manual edits to operand manifests.
+
+## Motivation
+
+The Secrets Store CSI Driver was GA'd in OpenShift 4.17 (see
+[csi-secrets-store.md](/enhancements/storage/csi-secrets-store.md)) with hardcoded
+secret rotation enabled at a fixed 2-minute poll interval. While these defaults are
+reasonable for many workloads, they present two problems:
+
+1. **No user control over rotation behavior.** Administrators cannot disable rotation
+   for workloads that do not need it (saving unnecessary provider API calls), nor can
+   they tune the polling interval for environments that need faster or slower rotation.
+
+2. **No support for workload identity federation.** Cloud providers (AWS, Azure, GCP)
+   support federated identity using pod-bound service account tokens. The upstream
+   Secrets Store CSI Driver [v1.6.0](https://github.com/kubernetes-sigs/secrets-store-csi-driver/releases/tag/v1.6.0)
+   replaced its internal rotation controller with kubelet-driven
+   [`requiresRepublish`](https://secrets-store-csi-driver.sigs.k8s.io/topics/secret-auto-rotation.html)
+   and added [`tokenRequests`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#csidriverspec-v1-storage-k8s-io)
+   support in the `CSIDriver` spec, but the OpenShift operator does not yet expose
+   these capabilities to administrators.
+
+### User Stories
+
+- As a cluster administrator, I want to disable automatic secret rotation for
+  workloads that use static secrets, so that the driver does not make unnecessary
+  provider API calls that may count against rate limits.
+
+- As a cluster administrator, I want to configure the rotation polling interval to
+  a shorter value (e.g. 30s), so that secrets are refreshed more quickly in
+  latency-sensitive environments.
+
+- As a platform engineer, I want to configure `tokenRequests` audiences on the
+  `CSIDriver` object through the operator configuration, so that pods can use
+  workload identity federation to authenticate with AWS STS, Azure AD, or GCP IAM
+  when fetching secrets from external vaults.
+
+- As a multi-cloud operator, I want to configure multiple token audiences on a single
+  Secrets Store CSI Driver instance, so that different workloads on the same cluster
+  can federate identity with different cloud providers (e.g. AWS and Azure
+  simultaneously).
+
+- As a cluster administrator, I want my rotation and token configuration to persist
+  across operator upgrades and pod restarts without manual re-intervention.
+
+### Goals
+
+- Allow administrators to enable or disable secret rotation via `ClusterCSIDriver`.
+- Allow administrators to configure the rotation polling interval via `ClusterCSIDriver`.
+- Allow administrators to configure `tokenRequests` (audience and optional expiration)
+  for workload identity federation via `ClusterCSIDriver`.
+
+### Non-Goals
+
+- This enhancement does not add automatic detection of which cloud provider a
+  cluster runs on to auto-configure token audiences. Administrators must explicitly
+  configure the audiences for their environment.
+- This enhancement does not modify the upstream Secrets Store CSI Driver or
+  add features beyond what upstream v1.6.0 provides.
+- This enhancement does not cover provider-specific configuration (e.g. Azure Key
+  Vault, AWS Secrets Manager, HashiCorp Vault). Provider plugins are installed
+  separately.
+
+## Proposal
+
+This proposal will extend the `ClusterCSIDriver` API with a new `SecretsStore`
+driver configuration type and update the operator to dynamically propagate these
+settings to the CSI driver operand.
+
+### High-Level Changes
+
+1. **API extension**: Add `SecretsStore` to the `CSIDriverConfigSpec` discriminated
+   union in `openshift/api`, defining `secretRotation` and `tokenRequests` fields.
+
+2. **Dynamic `CSIDriver` object management**: Replace the static `csidriver.yaml`
+   with a dynamic `AssetFunc` that programmatically sets
+   `spec.requiresRepublish` and `spec.tokenRequests` on the `CSIDriver` object
+   based on `ClusterCSIDriver` configuration.
+
+3. **Dynamic DaemonSet argument injection**: Add a `DaemonSetHookFunc` that sets
+   `--enable-secret-rotation` and `--rotation-poll-interval` container arguments
+   on the driver DaemonSet based on `ClusterCSIDriver` configuration.
+
+### Workflow Description
+
+The **cluster administrator** is responsible for managing the CSI driver
+configuration.
+
+#### Enabling Workload Identity Federation
+
+1. The cluster administrator edits the `ClusterCSIDriver` for
+   `secrets-store.csi.k8s.io`:
+
+   ```yaml
+   apiVersion: operator.openshift.io/v1
+   kind: ClusterCSIDriver
+   metadata:
+     name: secrets-store.csi.k8s.io
+   spec:
+     driverConfig:
+       driverType: SecretsStore
+       secretsStore:
+         tokenRequests:
+           - audience: "sts.amazonaws.com"
+             expirationSeconds: 3600
+   ```
+
+2. The operator will detect the `ClusterCSIDriver` change via the shared informer.
+3. The `StaticResourceController` will call the dynamic `AssetFunc` which will
+   read the `ClusterCSIDriver` configuration and generate a `CSIDriver` manifest
+   with `spec.tokenRequests` populated.
+4. `resourceapply.ApplyCSIDriver` will detect the spec hash difference and
+   recreate the `CSIDriver` object with the new `tokenRequests`.
+5. Kubelet will observe the updated `CSIDriver` and begin providing the requested
+   service account tokens in the `volume_context` of `NodePublishVolume` calls.
+6. The provider plugin will receive the token and use it for workload identity
+   federation with the cloud provider.
+
+#### Configuring Rotation
+
+1. The cluster administrator edits the `ClusterCSIDriver`:
+
+   ```yaml
+   spec:
+     driverConfig:
+       driverType: SecretsStore
+       secretsStore:
+         secretRotation:
+           enabled: true
+           rotationPollInterval: "5m"
+   ```
+
+2. The operator will re-sync:
+   - The dynamic `AssetFunc` will set `requiresRepublish: true` on the `CSIDriver`.
+   - The `DaemonSetHookFunc` will set `--enable-secret-rotation=true` and
+     `--rotation-poll-interval=5m0s` on the driver container.
+3. The DaemonSet pods will be rolling-updated with the new arguments.
+4. Kubelet will periodically call `NodePublishVolume` (because `requiresRepublish`
+   is true), and the driver will re-fetch secrets from the provider if the cache
+   window (5 minutes) has elapsed.
+
+#### Disabling Rotation
+
+1. The cluster administrator sets `secretRotation.enabled: false`.
+2. The operator will set `requiresRepublish: false` on the `CSIDriver` and
+   `--enable-secret-rotation=false` on the DaemonSet.
+3. Kubelet will stop periodic `NodePublishVolume` calls. Secrets will only be
+   fetched at initial pod mount time.
+
+### API Extensions
+
+This enhancement adds a new `SecretsStore` variant to the existing
+`CSIDriverConfigSpec` discriminated union in `operator.openshift.io/v1`.
+
+```go
+// +kubebuilder:validation:Enum="";AWS;Azure;GCP;IBMCloud;vSphere;SecretsStore
+type CSIDriverType string
+
+// +kubebuilder:validation:XValidation:rule="has(self.driverType) && self.driverType == 'SecretsStore' ? has(self.secretsStore) : !has(self.secretsStore)",message="secretsStore must be set if driverType is 'SecretsStore', but remain unset otherwise"
+type CSIDriverConfigSpec struct {
+	// ... existing fields (aws, azure, gcp, ibmcloud, vSphere) ...
+
+	// secretsStore is used to configure the Secrets Store CSI driver.
+	// +optional
+	SecretsStore *SecretsStoreCSIDriverConfigSpec `json:"secretsStore,omitempty"`
+}
+
+// SecretsStoreCSIDriverConfigSpec defines properties that can be configured
+// for the Secrets Store CSI driver.
+type SecretsStoreCSIDriverConfigSpec struct {
+	// secretRotation controls automatic secret rotation behavior.
+	// When omitted, secret rotation is enabled with a default poll interval
+	// of 2 minutes.
+	// +optional
+	SecretRotation *SecretsStoreSecretRotation `json:"secretRotation,omitempty"`
+
+	// tokenRequests specifies service account token audiences that kubelet
+	// will provide to the CSI driver during NodePublishVolume calls.
+	// These tokens enable workload identity federation (WIF) with cloud
+	// providers such as AWS, Azure, and GCP.
+	// An empty audience string means the token uses the kube-apiserver's
+	// default APIAudiences.
+	// +optional
+	// +listType=atomic
+	TokenRequests []SecretsStoreTokenRequest `json:"tokenRequests,omitempty"`
+}
+
+// SecretsStoreSecretRotation configures the automatic secret rotation
+// behavior for the Secrets Store CSI driver.
+type SecretsStoreSecretRotation struct {
+	// enabled controls whether automatic secret rotation is active.
+	// When true, the CSIDriver object sets requiresRepublish and the
+	// driver re-fetches secrets from providers.
+	// When false, secrets are only fetched at initial
+	// pod mount time.
+	// Default is true.
+	// +kubebuilder:default=true
+	// +optional
+	Enabled *bool `json:"enabled,omitempty"`
+
+	// rotationPollInterval is the minimum duration between secret rotation
+	// attempts. The driver skips provider calls if less than this interval
+	// has elapsed since the last successful rotation.
+	// Format is a Go duration string (e.g. "2m", "1h30m").
+	// Default is "2m".
+	// +kubebuilder:default="2m"
+	// +kubebuilder:validation:Pattern=`^([0-9]+(\.[0-9]+)?(s|m|h))+$`
+	// +optional
+	RotationPollInterval *metav1.Duration `json:"rotationPollInterval,omitempty"`
+}
+
+// SecretsStoreTokenRequest specifies a service account token audience
+// configuration for workload identity federation (WIF) with the Secrets
+// Store CSI driver.
+type SecretsStoreTokenRequest struct {
+	// audience is the intended audience of the service account token.
+	// An empty string means the issued token will use the kube-apiserver's
+	// default APIAudiences.
+	// +required
+	Audience string `json:"audience"`
+
+	// expirationSeconds is the requested duration of validity of the
+	// service account token. The token issuer may return a token with
+	// a different validity duration.
+	// +optional
+	ExpirationSeconds *int64 `json:"expirationSeconds,omitempty"`
+}
+```
+
+#### Example ClusterCSIDriver YAML
+
+```yaml
+apiVersion: operator.openshift.io/v1
+kind: ClusterCSIDriver
+metadata:
+  name: secrets-store.csi.k8s.io
+spec:
+  managementState: Managed
+  driverConfig:
+    driverType: SecretsStore
+    secretsStore:
+      secretRotation:
+        enabled: true
+        rotationPollInterval: "5m"
+      tokenRequests:
+        - audience: "sts.amazonaws.com"
+          expirationSeconds: 3600
+        - audience: "api://AzureADTokenExchange"
+```
+
+#### Resulting CSIDriver Object
+
+The operator will generate the following `storage.k8s.io/v1` `CSIDriver`:
+
+```yaml
+apiVersion: storage.k8s.io/v1
+kind: CSIDriver
+metadata:
+  name: secrets-store.csi.k8s.io
+spec:
+  podInfoOnMount: true
+  attachRequired: false
+  fsGroupPolicy: File
+  volumeLifecycleModes:
+    - Ephemeral
+  requiresRepublish: true
+  tokenRequests:
+    - audience: "sts.amazonaws.com"
+      expirationSeconds: 3600
+    - audience: "api://AzureADTokenExchange"
+```
+
+#### Validation Rules
+
+- `secretsStore` must be set if and only if `driverType` is `SecretsStore`.
+- `rotationPollInterval` will be validated via regex pattern
+  `^([0-9]+(\.[0-9]+)?(s|m|h))+$` to ensure valid Go duration format.
+- `rotationPollInterval` will enforce a minimum of 2 minutes via CEL validation
+  to match the existing hardcoded value.
+
+### Topology Considerations
+
+#### Hypershift / Hosted Control Planes
+
+N/A
+
+#### Standalone Clusters
+
+N/A
+
+#### Single-node Deployments or MicroShift
+
+The Secrets Store CSI Driver Operator is not part of MicroShift yet.
+
+#### OpenShift Kubernetes Engine
+
+N/A
+
+### Implementation Details/Notes/Constraints
+
+#### Upstream `requiresRepublish` Mechanism
+
+Starting with Secrets Store CSI Driver v1.6.0, the upstream project replaced the
+internal rotation controller with the kubelet-native `requiresRepublish` field.
+When `requiresRepublish: true` is set on the `CSIDriver`
+object, kubelet periodically calls `NodePublishVolume` for already-mounted volumes.
+The driver then:
+
+1. Checks if the cache window (`--rotation-poll-interval`) has elapsed since the
+   last provider call.
+2. If yes, contacts the provider to fetch the latest secret version and updates
+   the mounted volume.
+3. If no, returns success immediately without contacting the provider.
+
+
+#### Dynamic Asset Generation
+
+The operator uses library-go's `StaticResourceController` to manage the
+`CSIDriver` object. A custom `AssetFunc` will be added that:
+
+1. Reads the base `csidriver.yaml` manifest.
+2. Deserializes it into a typed `storagev1.CSIDriver` object using
+   `resourceread.ReadCSIDriverV1OrDie`.
+3. Reads the `ClusterCSIDriver` via a lister to obtain the desired
+   `requiresRepublish` and `tokenRequests` values.
+4. Sets these fields on the `CSIDriver` object.
+5. Serializes back to JSON for the `StaticResourceController` to apply.
+
+This will work because `StaticResourceController` will be wired to the
+`ClusterCSIDriver` informer, so any change to the CR will trigger a re-sync. The
+downstream `resourceapply.ApplyCSIDriver` will detect spec changes via
+annotation-based spec hashing and recreate the `CSIDriver` object as needed (since
+`CSIDriver.spec` is effectively immutable in Kubernetes).
+
+#### DaemonSet Hook
+
+The operator will use a `DaemonSetHookFunc` to set the `--enable-secret-rotation` and
+`--rotation-poll-interval` arguments on the csi-driver container. The hook will
+read the `ClusterCSIDriver` via a lister and find/replace arguments by their
+`--flag=` prefix.
+
+The `CSIDriverNodeServiceController` will be configured with the
+`ClusterCSIDriver` informer as an optional informer, so that DaemonSet
+reconciliation will trigger immediately when the administrator changes the
+`ClusterCSIDriver` configuration.
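
The find/replace-by-prefix behavior described for the DaemonSet hook can be illustrated with a small, dependency-free sketch. This is not the operator's implementation: `setArg` and `rotationArgs` are hypothetical helper names, the real hook would mutate the container in the DaemonSet spec rather than a plain string slice, and the `--endpoint` argument only stands in for whatever other arguments the container carries.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// setArg returns a copy of args in which the argument starting with
// "--<name>=" is replaced by "--<name>=<value>". If no such argument
// exists, the new argument is appended. (Hypothetical helper.)
func setArg(args []string, name, value string) []string {
	prefix := "--" + name + "="
	out := make([]string, 0, len(args)+1)
	replaced := false
	for _, a := range args {
		if strings.HasPrefix(a, prefix) {
			if !replaced {
				out = append(out, prefix+value)
				replaced = true
			}
			continue // drop any duplicate occurrences of the same flag
		}
		out = append(out, a)
	}
	if !replaced {
		out = append(out, prefix+value)
	}
	return out
}

// rotationArgs applies the two rotation flags named in the proposal.
// (Hypothetical helper; inputs would come from the ClusterCSIDriver.)
func rotationArgs(args []string, enabled bool, interval time.Duration) []string {
	args = setArg(args, "enable-secret-rotation", strconv.FormatBool(enabled))
	return setArg(args, "rotation-poll-interval", interval.String())
}

func main() {
	args := []string{
		"--endpoint=unix:///csi/csi.sock", // illustrative unrelated argument
		"--enable-secret-rotation=true",
		"--rotation-poll-interval=2m",
	}
	for _, a := range rotationArgs(args, true, 5*time.Minute) {
		fmt.Println(a)
	}
}
```

Formatting the interval with `time.Duration.String()` is what turns a configured `"5m"` into the `5m0s` value shown in the workflow above. Matching on the full `--flag=` prefix keeps unrelated arguments untouched and makes the operation idempotent across re-syncs.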
+
+#### Default Behavior and Upgrade Safety
+
+All new fields will have defaults that match the operator's current hardcoded
+behavior:
+- `secretRotation.enabled` will default to `true`
+- `rotationPollInterval` will default to `"2m"`
+- `tokenRequests` will default to empty (no WIF)
+
+Clusters upgrading to the new operator version with no `driverConfig` set will see
+**no change in behavior**. The operator will fall back to these defaults when the
+`ClusterCSIDriver` does not specify a `SecretsStore` driver config.
+
+
+### Risks and Mitigations
+
+**Risk**: Setting `rotationPollInterval` too low could overwhelm the external
+secret provider with API calls.
+
+**Mitigation**: OpenShift documentation will advise administrators to choose an
+interval that respects their provider's rate limits.
+
+### Drawbacks
+
+- The `CSIDriver.spec` is effectively immutable in Kubernetes; changes require
+  delete and recreate. `library-go`'s `ApplyCSIDriver` will handle this
+  transparently via spec-hash annotations, but it will mean a brief window where
+  the `CSIDriver` object does not exist during updates. In practice, this window
+  will be negligible and will not affect running pods.
+
+## Alternatives (Not Implemented)
+
+Nothing considered.
+
+## Open Questions [optional]
+
+1. Should `requiresRepublish` on the `CSIDriver` object always be set to `true`,
+   or should it mirror the value of `secretRotation.enabled`?
+
+2. Should `expirationSeconds` enforce a minimum value of 600 (10 minutes)?
+
+## Test Plan
+
+### Unit Tests
+
+- Rotation config extraction: nil `driverConfig`, nil `secretsStore`, nil
+  `secretRotation`, explicitly enabled, explicitly disabled, custom interval
+  all return the correct enable/interval values.
+- CSIDriver config mapping: `ClusterCSIDriver` settings correctly map to
+  `requiresRepublish` boolean and `storagev1.TokenRequest` list.
+- DaemonSet hook arg replacement: hook correctly sets/replaces
+  `--enable-secret-rotation=` and `--rotation-poll-interval=` by prefix match.
+- DaemonSet hook error handling: hook returns an error when the expected
+  csi-driver container is not found.
+- Dynamic asset func: `CSIDriver` manifest correctly receives
+  `requiresRepublish` and `tokenRequests` fields from `ClusterCSIDriver` config.
+- Namespace substitution: non-CSIDriver assets continue to have namespace
+  replacement applied correctly.
+
+### E2E Tests
+
+- No `driverConfig` set: operator uses defaults (`requiresRepublish: true`,
+  `--enable-secret-rotation=true`, `--rotation-poll-interval=2m`, no
+  `tokenRequests`).
+- `secretRotation.enabled: true` with custom `rotationPollInterval`: operator
+  sets `requiresRepublish: true` and the custom interval on the DaemonSet.
+- `secretRotation.enabled: false`: operator sets `requiresRepublish: false` and
+  `--enable-secret-rotation=false` on the DaemonSet.
+- `tokenRequests` with one or more audiences: operator sets matching
+  `spec.tokenRequests` on the `CSIDriver` object.
+- `tokenRequests` with `expirationSeconds`: operator propagates the expiration
+  value to the `CSIDriver` `tokenRequests`.
+- Multi-cloud WIF: a single Secrets Store CSI Driver instance with multiple
+  `tokenRequests` audiences can mount secrets from different cloud providers.
+- Upgrade: cluster with no `driverConfig` set upgrades to the new version
+  and retains the same rotation behavior.
+
+## Graduation Criteria
+
+This feature will target GA directly.
+
+### Dev Preview -> Tech Preview
+
+N/A
+
+### Tech Preview -> GA
+
+N/A
+
+### Removing a deprecated feature
+
+N/A
+
+## Upgrade / Downgrade Strategy
+
+**Upgrade**: Clusters upgrading to the new operator version will see no behavior
+change. The operator defaults match the previously hardcoded values
+(`requiresRepublish: true`, `--enable-secret-rotation=true`,
+`--rotation-poll-interval=2m`, no `tokenRequests`).
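
The fallback to the previously hardcoded values could be resolved along the following lines. This is a minimal sketch under stated assumptions: `SecretsStoreConfig` and `SecretRotation` are simplified local stand-ins for the proposed API types (which would use `metav1.Duration`), and `rotationSettings` is a hypothetical helper name, not a function from the operator.

```go
package main

import (
	"fmt"
	"time"
)

// SecretRotation is a simplified stand-in for the proposed
// SecretsStoreSecretRotation type (assumption for illustration).
type SecretRotation struct {
	Enabled              *bool
	RotationPollInterval *time.Duration
}

// SecretsStoreConfig is a simplified stand-in for the proposed
// SecretsStoreCSIDriverConfigSpec type (assumption for illustration).
type SecretsStoreConfig struct {
	SecretRotation *SecretRotation
}

// defaultPollInterval mirrors the 2-minute value the proposal describes
// as the current hardcoded behavior.
const defaultPollInterval = 2 * time.Minute

// rotationSettings returns the effective rotation settings, falling back to
// "enabled, 2m" for every level of configuration that is left unset.
func rotationSettings(cfg *SecretsStoreConfig) (bool, time.Duration) {
	enabled, interval := true, defaultPollInterval
	if cfg == nil || cfg.SecretRotation == nil {
		return enabled, interval
	}
	if cfg.SecretRotation.Enabled != nil {
		enabled = *cfg.SecretRotation.Enabled
	}
	if cfg.SecretRotation.RotationPollInterval != nil {
		interval = *cfg.SecretRotation.RotationPollInterval
	}
	return enabled, interval
}

func main() {
	// No configuration at all: same behavior as before the upgrade.
	enabled, interval := rotationSettings(nil)
	fmt.Println(enabled, interval)

	// Rotation explicitly disabled; the interval keeps its default.
	off := false
	enabled, interval = rotationSettings(&SecretsStoreConfig{
		SecretRotation: &SecretRotation{Enabled: &off},
	})
	fmt.Println(enabled, interval)
}
```

Using pointer fields lets the helper tell "unset" apart from an explicit `false`, which is what keeps an upgraded cluster with no `driverConfig` on the existing behavior.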
+
+To adopt the new configuration, the administrator must edit the `ClusterCSIDriver`
+to set `driverType: SecretsStore` with the desired `secretsStore` configuration.
+No changes are required to keep the existing behavior.
+
+
+## Version Skew Strategy
+
+The feature will be supported starting with OpenShift 5.0.
+
+## Operational Aspects of API Extensions
+
+N/A
+
+## Support Procedures
+
+- **Detecting misconfiguration**: Check the `ClusterCSIDriver` status conditions.
+  If the operator fails to apply the `CSIDriver` or DaemonSet, the `Degraded`
+  condition will be set.
+
+- **Verifying rotation config**: Inspect the DaemonSet args:
+  ```bash
+  oc get ds -n openshift-cluster-csi-drivers secrets-store-csi-driver-node -o jsonpath='{.spec.template.spec.containers[?(@.name=="csi-driver")].args}'
+  ```
+
+- **Verifying CSIDriver spec**: Inspect the `CSIDriver` object:
+  ```bash
+  oc get csidriver secrets-store.csi.k8s.io -o yaml
+  ```
+  Check that `spec.requiresRepublish` and `spec.tokenRequests` match the
+  `ClusterCSIDriver` configuration.
+
+- **Disabling the feature**: Set `driverConfig.secretsStore.secretRotation.enabled: false`
+  to disable rotation, or remove `tokenRequests` to disable WIF.
+
+
+## Infrastructure Needed [optional]
+
+No additional infrastructure is needed.
+
+## Implementation History
+
+- 2026-05-15: Initial proposal
+
+## References
+
+- [Secrets Store CSI Driver v1.6.0 Release Notes](https://github.com/kubernetes-sigs/secrets-store-csi-driver/releases/tag/v1.6.0)
+- [Secrets Store CSI Driver Upgrade Notes (pre-v1.6.0)](https://secrets-store-csi-driver.sigs.k8s.io/getting-started/upgrades#pre-v160)
+- [Upstream requiresRepublish PR](https://github.com/kubernetes-sigs/secrets-store-csi-driver/pull/1622)
+- [Kubernetes CSIDriver API - requiresRepublish](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#csidriverspec-v1-storage-k8s-io)
+- [Kubernetes CSIDriver API - tokenRequests](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#tokenrequest-v1-storage-k8s-io)
+- [Upstream Secret Auto Rotation Documentation](https://secrets-store-csi-driver.sigs.k8s.io/topics/secret-auto-rotation.html)
+- [Kubernetes CSI Documentation - CSIDriver Object](https://kubernetes-csi.github.io/docs/csi-driver-object.html)
+- [Kubernetes CSI Documentation - Token Requests](https://kubernetes-csi.github.io/docs/token-requests.html#status)
\ No newline at end of file