Enhancement: Etcd sharding by resource kind for HyperShift #1979
Open
jhjaggars wants to merge 6 commits into
[APPROVALNOTIFIER] This PR is NOT APPROVED.
fdf69e3 to d188bb0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
d188bb0 to 313f0ce
enxebre reviewed Apr 27, 2026
enxebre reviewed Apr 27, 2026
enxebre reviewed Apr 27, 2026
enxebre reviewed Apr 27, 2026
enxebre reviewed Apr 27, 2026
enxebre reviewed Apr 27, 2026
muraee reviewed Apr 27, 2026
muraee reviewed Apr 27, 2026
- Rename "v2 framework" to "CPO component framework" throughout
- Scope hcp CLI references to self-hosted/MCE deployments
- Replace EtcdLike interface with strings.HasPrefix prefix check
- Replace per-manifest adapt functions with TemplatedProvider approach
- Update API types to match kube-api-linter output (omitzero, value types, MinItems/MinLength bounds, omitempty on list map keys)
- Drop data-policy annotation: backup is determined by storage type (PVC = backed up, EmptyDir = not)
- Add parent-level CEL rule preventing shard removal once configured
- Add cross-shard duplicate resource prefix CEL validation
- Add CEL rule preventing storageClassName on EmptyDir storage
- Make scheduling mutable (no data migration needed for placement)
- Clarify replicas override controllerAvailabilityPolicy when set
- Document CPO restart idempotency for conditional registration
- Make downgrade incompatibility for sharded HCPs explicit
- Document wait-for-etcd extension mechanism for multi-shard
- Document defrag controller sidecar behavior with multiple shards
- Add ServiceMonitor and PDB to NewShardComponent registration
- Fix stale DataPolicy reference in ManagedEtcdShardStorageSpec
- Fix ResourcePrefixes godoc listing "/" as valid for non-default shards
- Update Alternative C to reflect partial template adoption

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
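The cross-shard duplicate prefix rule mentioned in this commit can be pictured with a small Go sketch. It is only an illustration of the check's intent, not the CEL rule from the enhancement: the `shard` type, its field names, and the example prefixes are hypothetical.

```go
package main

import "fmt"

// shard is a hypothetical stand-in for a shard entry; the field names are
// illustrative and are not taken from the HyperShift API.
type shard struct {
	Name             string
	ResourcePrefixes []string
}

// duplicatePrefixes returns each prefix that is claimed by more than one
// shard, in the order the conflict is first seen.
func duplicatePrefixes(shards []shard) []string {
	owner := map[string]string{}  // prefix -> first shard that claimed it
	reported := map[string]bool{} // prefixes already added to the result
	var dups []string
	for _, s := range shards {
		for _, p := range s.ResourcePrefixes {
			if prev, ok := owner[p]; ok && prev != s.Name {
				if !reported[p] {
					dups = append(dups, p)
					reported[p] = true
				}
				continue
			}
			owner[p] = s.Name
		}
	}
	return dups
}

func main() {
	// Example prefixes are made up for illustration.
	shards := []shard{
		{Name: "main", ResourcePrefixes: []string{"/"}},
		{Name: "events", ResourcePrefixes: []string{"/events"}},
		{Name: "leases", ResourcePrefixes: []string{"/leases", "/events"}},
	}
	fmt.Println(duplicatePrefixes(shards))
}
```

A validation of this kind rejects the object when the returned slice is non-empty, since a resource routed to two shards would have no single source of truth.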
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JoelSpeed reviewed Apr 30, 2026
- Replace resourcePrefixes []string with structured EtcdShardResource type (apiGroup + resource fields with proper validation)
- Restructure ManagedEtcdShardStorageSpec as discriminated union with nested PersistentVolume spec and CEL union rule
- Change shard name validation to DNS1123 label (was DNS1035-like)
- Make storage mutable day-2 (switching PVC/EmptyDir doesn't need cluster recreation)
- Make replicas non-pointer int32 (zero value is never valid)
- Use CEL url library for endpoint validation (isURL + getScheme)
- Prevent adding shards to unsharded clusters (has(oldSelf.shards) == has(self.shards))
- Move MinProperties=1 from Scheduling field to EtcdShardSchedulingSpec struct
- Remove redundant omitempty on struct fields (omitzero only)
- Add commented-out label key/value validation on nodeSelector (CEL cost budget)
- Fix EmptyDir data loss description (survives container restarts)
- Add single-replica failure mode note
- Add shard rename envtest case
- StorageClassName MaxLength 253, DNS1123 subdomain
- Add @JoelSpeed to api-approvers
- Update last-updated to 2026-04-30

All type changes verified against kube-api-linter (0 issues).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
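As a rough picture of what a discriminated-union storage spec enforces, the Go sketch below mirrors the general union pattern in plain code. The type and field names are hypothetical, and the choice to require the `PersistentVolume` member when the discriminator selects it is an assumption of this sketch; the actual CEL rule in the enhancement may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// storageType is the union discriminator. The two values follow the storage
// kinds named in the commit message; the Go names are hypothetical.
type storageType string

const (
	typePersistentVolume storageType = "PersistentVolume"
	typeEmptyDir         storageType = "EmptyDir"
)

// persistentVolumeSpec is an illustrative nested member of the union.
type persistentVolumeSpec struct {
	StorageClassName string
}

// shardStorage is a hypothetical stand-in for a shard storage spec.
type shardStorage struct {
	Type             storageType
	PersistentVolume *persistentVolumeSpec
}

// validateStorage checks that the populated member matches the discriminator.
// Assumption: the member is required when its type is selected.
func validateStorage(s shardStorage) error {
	switch s.Type {
	case typePersistentVolume:
		if s.PersistentVolume == nil {
			return errors.New("persistentVolume is required when type is PersistentVolume")
		}
	case typeEmptyDir:
		if s.PersistentVolume != nil {
			return errors.New("persistentVolume is forbidden when type is EmptyDir")
		}
	default:
		return fmt.Errorf("unsupported storage type %q", s.Type)
	}
	return nil
}

func main() {
	ok := shardStorage{Type: typePersistentVolume, PersistentVolume: &persistentVolumeSpec{StorageClassName: "fast"}}
	bad := shardStorage{Type: typeEmptyDir, PersistentVolume: &persistentVolumeSpec{StorageClassName: "fast"}}
	fmt.Println(validateStorage(ok))
	fmt.Println(validateStorage(bad))
}
```

Nesting the storage class under the `PersistentVolume` member means a storage class on EmptyDir storage is rejected by the union rule itself rather than by a separate check.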
JoelSpeed reviewed May 1, 2026
This was referenced May 6, 2026
- Remove +immutable tags (non-functional, replaced by CEL self == oldSelf)
- Make APIGroup a *string for empty-string roundtripping
- Use standard DNS1123 subdomain phrasing in godocs
- Fix endpoint MaxLength from 255 to 267
- Add @JoelSpeed to reviewers
- Remove non-functional map key CEL validations on nodeSelector
- Make replicas required (not optional) with Enum=1;3
- Use pointers with omitempty for optional scalar fields, value types with omitzero for optional struct fields per convention

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
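The replicas enum and the endpoint bounds from these commits can be approximated in plain Go for readers who want to see the intent outside of CEL and kubebuilder markers. This is a sketch under assumptions: the function names are hypothetical, and requiring the `https` scheme is an assumption, since the commit messages only say the scheme is checked.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// maxEndpointLength mirrors the MaxLength of 267 mentioned in the commit.
const maxEndpointLength = 267

// validateReplicas accepts only the values listed in Enum=1;3.
func validateReplicas(n int32) error {
	if n != 1 && n != 3 {
		return fmt.Errorf("replicas must be 1 or 3, got %d", n)
	}
	return nil
}

// validateEndpoint approximates an isURL + getScheme check with net/url.
// Assumption: only the "https" scheme is accepted.
func validateEndpoint(endpoint string) error {
	if len(endpoint) > maxEndpointLength {
		return errors.New("endpoint exceeds maximum length")
	}
	u, err := url.ParseRequestURI(endpoint)
	if err != nil {
		return fmt.Errorf("endpoint is not a valid URL: %w", err)
	}
	if !u.IsAbs() {
		return errors.New("endpoint must be an absolute URL")
	}
	if u.Scheme != "https" {
		return fmt.Errorf("endpoint scheme must be https, got %q", u.Scheme)
	}
	return nil
}

func main() {
	fmt.Println(validateReplicas(3))
	fmt.Println(validateReplicas(2))
	// The host name below is made up for illustration.
	fmt.Println(validateEndpoint("https://etcd-events.example:2379"))
	fmt.Println(validateEndpoint("http://etcd-events.example:2379"))
}
```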
- Add storage immutability: `self == oldSelf` on shard Storage field and `has(oldSelf.storage) == has(self.storage)` on ManagedEtcdShardSpec to prevent adding/removing storage after creation, matching existing non-sharded etcd precedent
- Add shard swap prevention: `oldSelf.all(old, self.exists(cur, cur.name == old.name))` on both managed and unmanaged Shards lists to close a gap where the size check alone couldn't prevent replacing one shard entry with another (transition rules don't fire on uncorrelated map-type list entries)
- Update validation explanation to document why both size check and name preservation rules are needed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
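To see why both rules are needed, the Go sketch below restates the two checks over plain lists of shard names. The helper names are hypothetical; `namesPreserved` follows the logic of `oldSelf.all(old, self.exists(cur, cur.name == old.name))`, and the shard names in the example are made up.

```go
package main

import "fmt"

// sizeUnchanged models the list size check: no entries added or removed.
func sizeUnchanged(oldNames, curNames []string) bool {
	return len(oldNames) == len(curNames)
}

// namesPreserved models the name preservation rule: every old shard name
// must still exist in the current list.
func namesPreserved(oldNames, curNames []string) bool {
	current := make(map[string]bool, len(curNames))
	for _, n := range curNames {
		current[n] = true
	}
	for _, n := range oldNames {
		if !current[n] {
			return false
		}
	}
	return true
}

func main() {
	oldNames := []string{"main", "events"}
	swapped := []string{"main", "leases"} // "events" replaced by "leases"

	// The swap keeps the list length, so the size check alone passes.
	fmt.Println("size check passes:", sizeUnchanged(oldNames, swapped))
	// The name preservation rule catches the replaced entry.
	fmt.Println("names preserved:", namesPreserved(oldNames, swapped))
}
```

Taken together, the size check blocks additions and removals, while name preservation blocks a same-length swap.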
@jhjaggars: all tests passed!
Proposes etcd sharding by Kubernetes resource kind for HyperShift hosted control planes, enabling distribution of resources across multiple independent etcd deployments for improved scalability and performance.
Each etcd shard is registered as an independent `ControlPlaneComponent` within the CPO v2 component framework, inheriting all framework features automatically. KAS is configured with `--etcd-servers-overrides` to route resources to the appropriate shard.

- `NewStatefulSetComponent` with `WithAssetDir("etcd")`, inheriting priority class, topology spread, scale-to-zero, PDB, etc.
- `EtcdLike` interface: minimal framework extension for KAS dependency exclusion, priority class, and replica defaults
- `EtcdSharding` feature gate in `TechPreviewNoUpgrade`
- `ManagedEtcdShardSpec`, `UnmanagedEtcdShardSpec`, `EmptyDirEtcdStorageSpec`, `EtcdDataPolicyType` added to `hypershift.openshift.io/v1beta1`

cc @enxebre @sjenning @csrwng
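As an illustration of the routing described above, the following Go sketch assembles a value for the kube-apiserver `--etcd-servers-overrides` flag, whose documented format is comma-separated `group/resource#servers` entries with semicolon-separated server URLs. The `shardRoute` type, the helper name, and the endpoint host names are hypothetical and are not taken from the enhancement.

```go
package main

import (
	"fmt"
	"strings"
)

// shardRoute is a hypothetical description of one resource routed to a shard.
// APIGroup is empty for the core API group.
type shardRoute struct {
	APIGroup  string
	Resource  string
	Endpoints []string
}

// serversOverrides renders routes in the group/resource#servers format,
// joining servers with ";" and individual overrides with ",".
func serversOverrides(routes []shardRoute) string {
	parts := make([]string, 0, len(routes))
	for _, r := range routes {
		parts = append(parts, r.APIGroup+"/"+r.Resource+"#"+strings.Join(r.Endpoints, ";"))
	}
	return strings.Join(parts, ",")
}

func main() {
	// Endpoint host names are made up for illustration.
	routes := []shardRoute{
		{Resource: "events", Endpoints: []string{"https://etcd-events-0:2379", "https://etcd-events-1:2379"}},
		{APIGroup: "coordination.k8s.io", Resource: "leases", Endpoints: []string{"https://etcd-leases:2379"}},
	}
	fmt.Println("--etcd-servers-overrides=" + serversOverrides(routes))
}
```

Resources without an override continue to use the servers given by `--etcd-servers`, which is what lets a default shard hold everything that is not explicitly routed.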