OCPEDGE-2280: mutable topology by jeff-roche · Pull Request #2008 · openshift/enhancements

jeff-roche · 2026-05-11T19:46:28Z

Summary

Introduces the Mutable Topology enhancement proposal, which enables OpenShift clusters to transition between topology modes as a Day 2 operation. This replaces the previous Adaptable Topology proposal.

Key Design Decisions

Controller in cluster-config-operator (CCO) — A new topology transition controller in CCO watches spec.desiredTopology on the Infrastructure CR, validates preconditions, coordinates the transition across operators, and updates topology status fields when complete. CCO was chosen over CVO, CEO, and MCO (and over a standalone operator) because it owns the config.openshift.io API group and the Infrastructure CR lifecycle. See Alternatives in the proposal for the full placement analysis.
No new topology enum values — Transitions move between existing TopologyMode values (SingleReplica, HighlyAvailable, etc.). Operators continue reacting to fixed topology values they already understand. Transition complexity is concentrated in a single controller rather than distributed across 30+ operators.
Spec/status contract — Follows the standard Kubernetes pattern: spec.desiredTopology expresses administrator intent; status.controlPlaneTopology reflects observed state. Mirrors the oc adm upgrade pattern (patch spec, controller does the work).
Feature-gated — MutableTopology gate progresses through DevPreview → TechPreview → GA. Controller is not registered when the gate is disabled (zero runtime overhead).
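
The spec/status contract above reduces to one comparison. The following is a minimal sketch, not the proposed implementation: the `infra` struct is a hypothetical, trimmed-down stand-in for the Infrastructure CR, and treating an empty `desiredTopology` as "no request" is an assumption made for illustration.

```go
package main

// TopologyMode mirrors the existing topology values named in this description.
type TopologyMode string

const (
	SingleReplica   TopologyMode = "SingleReplica"
	HighlyAvailable TopologyMode = "HighlyAvailable"
)

// infra is a hypothetical stand-in for the Infrastructure CR, reduced to the
// two fields that matter for the trigger condition.
type infra struct {
	DesiredTopology      TopologyMode // spec.desiredTopology (administrator intent)
	ControlPlaneTopology TopologyMode // status.controlPlaneTopology (observed state)
}

// transitionRequested reports whether spec and status disagree, which is the
// signal the controller acts on. An unset desiredTopology is treated as
// "nothing requested" (assumption).
func transitionRequested(i infra) bool {
	return i.DesiredTopology != "" && i.DesiredTopology != i.ControlPlaneTopology
}
```

Keeping the trigger a pure function of spec and status means the controller has nothing to do on clusters whose two fields already agree.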

Scope

Initial transition: SNO → HA compact (3-node) on platform: none
CLI: oc adm transition topology HighlyAvailable
Admission control: CEL validation on desiredTopology; ValidatingAdmissionPolicy (fail-closed) protects topology status fields from direct edits outside CCO
etcd scaling: CEO handles sequential 1→2→3 member scaling via existing learner-to-voter promotion
Failure handling: Controller resets desiredTopology on failure (deliberate spec mutation to prevent infinite retry loops); CEO attempts etcd rollback
Upgrade safety: CCO sets Upgradeable=False while a transition is in progress
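
The failure-handling item above (resetting `desiredTopology` so the controller does not retry forever) can be sketched as follows. This is an illustration under assumptions: `state`, `reconcile`, and `errScaleFailed` are hypothetical names, and the real controller would patch the Infrastructure CR rather than mutate a struct.

```go
package main

import "errors"

// state is a hypothetical stand-in for the two Infrastructure fields involved.
type state struct {
	Desired  string // spec.desiredTopology
	Observed string // status.controlPlaneTopology
}

// errScaleFailed is an illustrative error for a failed etcd scale-up.
var errScaleFailed = errors.New("etcd scale-up failed")

// reconcile attempts one transition. On success the observed topology catches
// up to the desired one. On failure the desired topology is reset to the
// observed one, so the next reconcile sees no pending work.
func reconcile(s *state, transition func() error) error {
	if s.Desired == s.Observed {
		return nil // nothing requested
	}
	if err := transition(); err != nil {
		s.Desired = s.Observed // deliberate spec mutation to stop retries
		return err
	}
	s.Observed = s.Desired
	return nil
}
```

After a failed attempt, a second call to `reconcile` returns immediately without invoking `transition`, which is the "no infinite retry loop" property the description asks for.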

What Changed (Revision History)

The proposal was revised to base the controller in CCO rather than proposing a dedicated standalone operator (OTTO). Key changes from the prior revision:

Controller placement moved from a standalone operator to CCO, with full alternatives analysis (CVO, CEO, MCO, standalone operator, CLI-only)
Added ValidatingAdmissionPolicy for topology status field protection (fail-closed)
Added detailed failure handling: controller resets desiredTopology on failure with rationale for the spec-mutation deviation
Expanded graduation criteria with per-operator topology dependency matrix requirement
Added monitoring/telemetry requirements (Prometheus metrics, alerts) for GA graduation
Added Support Procedures section with team ownership, detection, and recovery procedures
Clarified etcd scaling risks: the 2-voter intermediate state is unique to Day 2 transitions (does not occur during bootstrapping)
Added Upgradeable=False enforcement during transitions to prevent concurrent upgrades

Out of Scope

Bidirectional transitions (HA → SNO)
HyperShift / hosted control planes
MicroShift
Automatic node provisioning
Cloud platforms (AWS, Azure, GCP) — design does not preclude future support
platform: baremetal — pending keepalived resolution

🤖 Generated with Claude Code

openshift-ci-robot · 2026-05-11T19:48:33Z

@jeff-roche: This pull request references OCPEDGE-2280 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "5.0." or "openshift-5.0.", but it targets "openshift-4.22" instead.

Details

In response to this:

Summary

Introduces the Mutable Topology enhancement, replacing the previous Adaptable Topology proposal

Proposes a new optional payload operator (OTTO) to orchestrate topology transitions between existing fixed topology modes, rather than adding a new topology enum

Initial scope: SNO to HA compact (3-node) on platform: none

Test plan

markdownlint passes (markdownlint-cli2)

Reviewer feedback from control plane, API, and architecture teams

Template structure validated against guidelines/enhancement_template.md

🤖 Generated with Claude Code


jeff-roche · 2026-05-11T22:12:24Z

/assign @jaypoulz @eggfoobar @jerpeter1 @sdodson @dgoodwin @tjungblu @JoelSpeed @dusk125 @patrickdillon

brandisher

I'm missing a "why" statement covering why a day 2, out-of-payload operator is the right choice for this. The CVO section towards the bottom hints at the why a bit but more explicit detail is needed.

With that in mind, I haven't reviewed the EP fully because I don't understand why this is the approach we're taking. The assessment of CVO seems very light and not enough to exclude that as a potential option to meet the goals.

brandisher · 2026-05-12T15:08:32Z

+- The CLI would need direct access to operator internals, violating separation of concerns
+- Error recovery and retry logic is better suited to an operator's reconciliation loop
+
+### Controller in CVO


Is CVO the only option in the core operators where this might make sense?

I expanded to include some other operators, none of which fit the bill in my opinion. This is an entirely new process and shoehorning it into another operator that wasn't designed for tackling this type of procedure seems irresponsible to me

Which operator handles adding nodes to clusters?

openshift-ci · 2026-05-12T18:05:32Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from dgoodwin. For more information see the Code Review Process.


Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jeff-roche · 2026-05-12T18:26:01Z

I'm missing a "why" statement covering why a day 2, out-of-payload operator is the right choice for this. The CVO section towards the bottom hints at the why a bit but more explicit detail is needed.

With that in mind, I haven't reviewed the EP fully because I don't understand why this is the approach we're taking. The assessment of CVO seems very light and not enough to exclude that as a potential option to meet the goals.

@brandisher I've added a new paragraph under the ## Proposal header that explains the why. If you're looking for something specifically beyond what I added, I'd be happy to add some more detail

JoelSpeed

🤖 Generated with Claude Code

There are significant portions of this proposal that assume behaviour of OpenShift that either doesn't exist, or doesn't work in the way proposed. I'm assuming here that this is hallucination of Claude?

The EP as it stands today doesn't actually make sense for implementation. It also doesn't align with what I thought we had agreed on the architecture call.

Has anyone tried to manually take a cluster and scale up and manually transition from a single replica to multiple replicas? IMO this is the most important next step for this project

What I thought we had agreed:

To scale from SNO to HA, the user must create two new control plane nodes and join them to the cluster
- On HighlyAvailable topology - KAS, KCM, etcd, etc all get scheduled automatically as static pods on these nodes - I don't see anything that prevents this based on if it's a SNO cluster today, this needs to be checked (it probably should)
- MCO still serves ignition for control plane nodes on SNO, so user needs to create the control plane nodes somehow to ignite from here
New fields are added to the infrastructure spec to allow the user to say "I intend for this cluster to be HA going forward"
A controller is added to cluster config operator
- This checks that the precondition of having additional control plane nodes in the cluster is met
- Once the precondition is met, it updates the status to reflect spec
Operators now react to the change in status and transition from single to HA
- etcd operator promotes learners to full members, quorum goes from 1->3 (I don't know if this guard is in place today, we should add if not)
- KAS/KCM - no change, it already scheduled new KAS/KCM pods
- Others - Those that previously deploy a single replica of their operand now move to 2 replicas, other changes might be needed on a per operator basis, I was expecting those details in the EP but don't see them yet
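
The precondition step in the flow above (the controller only updates status once the extra control plane nodes have joined) might look roughly like this. It is a sketch under assumptions: `node` is a hypothetical reduced view of a Node object, and selecting control plane nodes by the `node-role.kubernetes.io/master` label is an assumption about how the controller would identify them.

```go
package main

import "fmt"

// node is a hypothetical, reduced view of a cluster Node.
type node struct {
	Name   string
	Labels map[string]string
	Ready  bool
}

// controlPlaneLabel is the label assumed to mark control plane nodes.
const controlPlaneLabel = "node-role.kubernetes.io/master"

// checkHAPrecondition returns nil only when at least want control plane nodes
// are present and Ready. The controller would leave status untouched until
// this passes.
func checkHAPrecondition(nodes []node, want int) error {
	ready := 0
	for _, n := range nodes {
		if _, ok := n.Labels[controlPlaneLabel]; ok && n.Ready {
			ready++
		}
	}
	if ready < want {
		return fmt.Errorf("need %d ready control plane nodes, found %d", want, ready)
	}
	return nil
}
```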

JoelSpeed · 2026-05-13T10:13:21Z

+- Installed either manually or via the `oc adm transition topology` command
+- Owns the transition graph — the directed graph defining which topology transitions are supported
+- Owns the validation criteria for each transition (required nodes, certificates, secrets, operator states)
+- Orchestrates transitions by interacting with cluster operators via their existing APIs


I don't think this actually exists

This statement is just objectively incorrect. :( I can see why you'd be confused reading this.
This never happens. What we can do is look to update things like ingress and console to be more adaptable like etcd/api-server such that they update their replicas when more infrastructure nodes become available and do firmer pre-flight checks so that the "transition" piece becomes a no-op, but I think it's OK for some operators to continue to treat the topology field as the source of truth for desired behavior.

An alternative would be to key off the infrastructure topology's "desiredTopology" and update the hooks for ingress to try to update its replicas when it detects an update to that field. Then the pre-flight checks actually verify that it has the right number of replicas, and we update the topology after it has already succeeded. I guess it depends on whether we're treating the topology field or the desired topology field as the answer to "what should the operator be doing right now".

it's OK for some operators to continue to treat the topology field as the source of truth for desired behavior.

Absolutely.

In an ideal world, most operators would not scale up their operands until the status topology fields were updated. We know that's not true today but I don't think we necessarily need to fix most of the controllers. The one controller that does concern me is etcd operator. Would be good to understand why it acts the way it does today (will just scale up and add the member to quorum on SNO) and whether there's a way we can change that behaviour so that it would treat new members as learners until the status topology transitions

@jeff-roche BTW can we get rid of the objectively incorrect statements at some point please

JoelSpeed · 2026-05-13T10:16:46Z

+
+#### Risk: Platform Bare Metal May Not Support Single-Node Clusters
+
+**Risk**: If keepalived networking cannot be enabled, `platform: baremetal` will be limited to 2+ nodes, reducing the value of mutable topology for this platform.


limited to 2+? Isn't that the success criteria?

baremetal platform doesn't support SNO because having a load balancer for 1 node doesn't make sense.
In order for users to get the benefit of not having to manually deploy a load balancer (i.e. what they primarily save in terms of effort when deploying on platform: baremetal), we need to investigate if we can allow baremetal as a platform for SNO first (with load balancing disabled), and change that operator so that load balancing can be introduced post-transition.

Otherwise we need to introduce a new, scarier feature: platform transitions.
That's a Pandora's box I don't want to look at.

That's a Pandora's box I don't want to look at.

You and me both

So are we tying this EP to not only supporting topology transitions, but also SNO on baremetal? I would have expected a SNO on baremetal project to be sufficiently large and warrant its own EP?

JoelSpeed · 2026-05-13T10:18:13Z

+- Error recovery and retry logic is better suited to an operator's reconciliation loop than imperative CLI code
+- The CLI would need direct access to operator internals, violating separation of concerns
+
+### Extending an Existing Core Operator


Or cluster config operator which would make a very natural home for this as long as we have commitment of ownership from folks writing the new controller

At risk of going on a tangent, currently the installer has problems calculating topology when laying down manifests. We have a bug for this in the backlog, and I left #1905 (comment) on the previous enhancement.

I would like to see that calculation moved to the cluster config operator in bootkube during bootstrapping. That solution could co-exist with this one (and my team will push it forward as priorities allow); but it could potentially also tie into this solution.

I'm fine with us using CCO for this. I will take the blame for miscommunicating this to Jeff - it didn't strike me as obvious that a controller for this transition would belong there. My instincts were that new code in the core operators is expensive, especially for a controller that doesn't need to be running 99% of the time. That said, I think it's fine for this to be a controller that is installed with zero replicas and the replicas are scaled up during transition events. That fits the main sentiment of what Jeff and I were trying to solve - minimizing the tax on clusters that will never use this feature (i.e. the vast majority of them).

IMHO the solution for the installer is to get the user to specify their intent in the install-config (this should be passed through to the cluster). This enhancement is a good opportunity to define what that input should look like.

Resolving this thread as I've re-scoped this to be a new CCO controller

patrickdillon

I know the scope is limited to baremetal/platform:none, but I know there is interest in mutable topologies on cloud platforms as well, so as much as appropriate I would like to ensure the design leaves a path forward for those cloud platforms.

Also, like the other enhancement I don't see any mention of mastersSchedulable which affects the calculation for infrastructureTopology. How is the mastersSchedulable field handled/taken into account for this solution?

patrickdillon · 2026-05-13T13:19:07Z

+- Error recovery and retry logic is better suited to an operator's reconciliation loop than imperative CLI code
+- The CLI would need direct access to operator internals, violating separation of concerns
+
+### Extending an Existing Core Operator


At risk of going on a tangent, currently the installer has problems calculating topology when laying down manifests. We have a bug for this in the backlog, and I left #1905 (comment) on the previous enhancement.

I would like to see that calculation moved to the cluster config operator in bootkube during bootstrapping. That solution could co-exist with this one (and my team will push it forward as priorities allow); but it could potentially also tie into this solution.

zaneb

This one looks directionally correct 👍

zaneb · 2026-05-14T00:26:19Z

+
+##### Pre-Transition
+
+1. The cluster administrator prepares the additional control-plane nodes (hardware, network, OS)


Does 'OS' here imply that the user joins the hosts to the cluster as control plane nodes at this stage? If not, at what stage is that expected to happen?

Not sure what this means? Does this mean just that the HW is in place? Or does this mean adding the node as a worker node to the cluster?
That would have the benefit that we can rely on all the existing docs and procedures on how to add a worker node to an existing cluster.

zaneb · 2026-05-14T03:56:02Z

+OTTO maintains a directed graph of supported transitions. For the initial implementation:
+
+```text
+SingleReplica (SNO, platform: none) → HighlyAvailable (3-node compact)


I think it's a mistake to define the supported topologies in terms of the controlPlaneTopology field. There are at least 6 use cases I can think of that users have articulated:

single-node (1 schedulable control plane, 0+ workers, no load balancer)

compact (3 schedulable control plane, 0+ workers)

standby (3 non-schedulable control plane, 0 workers)

HA (3 non-schedulable control plane, 2+ workers)

TNA (2 non-schedulable control plane, 1 arbiter, 2+ workers)

TNF (2 schedulable control plane w/ STONITH, 0 workers)
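
The six use cases above can be written down as data, which makes the ambiguity concrete: several of them share the same number of control plane nodes. This is only a sketch of the list as stated in this comment; the `useCase` type and its field names are hypothetical, and no mapping to `controlPlaneTopology` values is implied.

```go
package main

// unbounded marks a worker count with no upper limit ("N+" in the list above).
const unbounded = -1

// useCase is a hypothetical record for one user-facing topology.
type useCase struct {
	Name              string
	ControlPlaneNodes int
	Schedulable       bool // whether control plane nodes accept workloads
	MinWorkers        int
	MaxWorkers        int // unbounded when there is no upper limit
	Note              string
}

// useCases transcribes the six entries from the comment above.
var useCases = []useCase{
	{"single-node", 1, true, 0, unbounded, "no load balancer"},
	{"compact", 3, true, 0, unbounded, ""},
	{"standby", 3, false, 0, 0, ""},
	{"HA", 3, false, 2, unbounded, ""},
	{"TNA", 2, false, 2, unbounded, "plus 1 arbiter"},
	{"TNF", 2, true, 0, 0, "STONITH"},
}

// namesWithControlPlaneNodes lists the use cases that have exactly n control
// plane nodes, in the order given above.
func namesWithControlPlaneNodes(n int) []string {
	var out []string
	for _, u := range useCases {
		if u.ControlPlaneNodes == n {
			out = append(out, u.Name)
		}
	}
	return out
}
```

Looking up three control plane nodes returns compact, standby, and HA, so the control plane size alone does not identify the use case; schedulability and worker count are needed as well.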

I've expanded the detail around CP and infra topology, as well as some validation rules around number of workers. For the first pass, we will report an error prior to transitioning if there are any worker nodes.

Can you point me to this expansion? I have the same question as Zane still having re-read the EP. This IMO needs more expansion unless I missed a section

zaneb · 2026-05-14T04:04:09Z

+
+The initial implementation targets `platform: none` clusters. On `platform: none`, the administrator is responsible for managing their own load balancing configuration (VIPs, DNS) when scaling beyond a single node.
+
+`platform: baremetal` support is planned for a subsequent phase. Bare metal networking uses keepalived for ingress load balancing, which is not useful and creates a point of failure for SNO deployments. The Bare Metal Networking team will be consulted to determine if this networking setup can be enabled for single-node clusters transitioning to HA.


I find it weird that we are going to add single-node support to platform:baremetal just so that we can say we are not preventing it from later transitioning to HA.
Who is asking for this?

I would prefer that any effort from the on-prem networking team were instead directed toward adding optional on-prem networking to platform:external.

As said in a previous comment, it's crucial to get support for "platform:baremetal" and keepalived load balancing (for ingress AND API) in the medium term. We should validate that there is no technical obstacle and that this can be added in the next release. Rationale: at the edge, there is hardly ever an external load balancer available.

zaneb · 2026-05-14T04:26:06Z

+10. OTTO updates the Infrastructure status fields:
+    - `controlPlaneTopology` transitions from `SingleReplica` to `HighlyAvailable`
+    - `infrastructureTopology` transitions from `SingleReplica` to `HighlyAvailable`
+11. Operators reconcile against the new topology values and adjust their deployment strategies, replica counts, and placement policies


Are we going to try to e.g. restart OLM operators (which previously have treated the topology as fixed)?

Do you have a view into how many/which olm operators are reading this value? Are they reading it at startup, or watching the resource? The expected pattern would be that the operator sees the change, and then reacts by updating the operand (e.g. scaling from 1 to 2 replicas now that it's been told the cluster is HA)
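
The "expected pattern" described above (the operator sees the change, then updates the operand) could be sketched like this. The names and the 1-versus-2 replica counts are assumptions taken from the example in this comment; real operators may choose different counts.

```go
package main

// topologyMode is a local alias for the topology values discussed in this PR.
type topologyMode string

const (
	singleReplica   topologyMode = "SingleReplica"
	highlyAvailable topologyMode = "HighlyAvailable"
)

// replicasFor maps a topology to an operand replica count. The values 1 and 2
// follow the example in the comment above and are illustrative only.
func replicasFor(t topologyMode) int32 {
	if t == highlyAvailable {
		return 2
	}
	return 1
}

// onTopologyChange is what a watch handler would call when the Infrastructure
// status changes. It returns the wanted replica count and whether the operand
// needs to be updated.
func onTopologyChange(prev, next topologyMode) (int32, bool) {
	want := replicasFor(next)
	return want, want != replicasFor(prev)
}
```

An operator that reads the topology only at startup never reaches `onTopologyChange`, which is why the question of watching versus reading once matters for OLM-managed operators.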

zaneb · 2026-05-14T04:53:02Z

+| cluster-etcd-operator | Coordinate with OTTO for sequential etcd scaling during transitions |
+| Ingress, networking, monitoring operators | Respond to OTTO coordination signals during transitions; reconcile on Infrastructure config changes |
+
+#### Platform Support Constraints


We need to mention that IBI clusters cannot be converted from SNO (and have some mechanism for preventing that).

What's the technical blocker there?

zaneb · 2026-05-14T04:59:42Z

+- Error recovery and retry logic is better suited to an operator's reconciliation loop than imperative CLI code
+- The CLI would need direct access to operator internals, violating separation of concerns
+
+### Extending an Existing Core Operator


IMHO the solution for the installer is to get the user to specify their intent in the install-config (this should be passed through to the cluster). This enhancement is a good opportunity to define what that input should look like.

jeff-roche · 2026-05-15T21:52:35Z

Big update coming next week to realign this with CCO instead of a dedicated operator, add some more technical detail around the flow, and address masters schedulable. Thank you everyone for the quick and thorough reviews, I believe we are rapidly converging on a solid solution!

DanielFroehlich · 2026-05-18T09:41:28Z

+
+##### Pre-Transition
+
+1. The cluster administrator prepares the additional control-plane nodes (hardware, network, OS)


Not sure what this means? Does this mean just that the HW is in place? Or does this mean adding the node as a worker node to the cluster?
That would have the benefit that we can rely on all the existing docs and procedures on how to add a worker node to an existing cluster.

DanielFroehlich · 2026-05-18T09:58:37Z

+4. CEO promotes the learner to a voting member — the cluster now has 2 voting members (quorum=2)
+5. CEO adds an etcd learner on the third control-plane node
+6. The learner syncs data from an existing voter
+7. CEO promotes the learner to a voting member — the cluster now has 3 voting members (quorum=2)


Suggested change

7. CEO promotes the learner to a voting member — the cluster now has 3 voting members (quorum=2)

7. CEO promotes the learner to a voting member — the cluster now has 3 voting members (quorum=3)
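
For reference when weighing the two wordings: assuming etcd's usual majority rule, quorum for n voting members is floor(n/2) + 1. A small worked sketch (function names are illustrative):

```go
package main

// quorum returns the majority needed among the given number of voting members.
func quorum(voters int) int {
	return voters/2 + 1
}

// faultTolerance returns how many voting members can fail while quorum is
// still reachable.
func faultTolerance(voters int) int {
	return voters - quorum(voters)
}
```

Under that rule, 1, 2, and 3 voters give quorums of 1, 2, and 2. The 2-voter step tolerates no failures, and the 3-voter end state tolerates one.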

DanielFroehlich · 2026-05-18T10:03:33Z

+
+The initial implementation targets `platform: none` clusters. On `platform: none`, the administrator is responsible for managing their own load balancing configuration (VIPs, DNS) when scaling beyond a single node.
+
+`platform: baremetal` support is planned for a subsequent phase. Bare metal networking uses keepalived for ingress load balancing, which is not useful and creates a point of failure for SNO deployments. The Bare Metal Networking team will be consulted to determine if this networking setup can be enabled for single-node clusters transitioning to HA.


As said in a previous comment, it's crucial to get support for "platform:baremetal" and keepalived load balancing (for ingress AND API) in the medium term. We should validate that there is no technical obstacle and that this can be added in the next release. Rationale: at the edge, there is hardly ever an external load balancer available.

DanielFroehlich · 2026-05-18T10:08:03Z

+- The 2-member state is transient and follows the same sequential pattern used during cluster bootstrapping — a well-exercised code path
+- Learner instances are used before promoting members to minimize the promotion window
+- No availability guarantee during transitions; administrators should treat scaling operations as a maintenance window
+- CEO will attempt rollback if scaling fails (e.g., rollback to 1 member if the 1→2→3 scale-up fails partway through)


What happens if the loss of quorum=2 was caused by a split-brain situation? Will both etcd members attempt rollback to 1? This could lead to two individual clusters of one. I would be fine with specifying a simple heuristic to resolve this situation, e.g. dropping the younger etcd instance in favour of the older one. Or maybe a special command for the admin to resolve this situation.

DanielFroehlich · 2026-05-18T10:08:45Z

+- Learner instances are used before promoting members to minimize the promotion window
+- No availability guarantee during transitions; administrators should treat scaling operations as a maintenance window
+- CEO will attempt rollback if scaling fails (e.g., rollback to 1 member if the 1→2→3 scale-up fails partway through)
+- Future iterations may explore admitting two learners simultaneously and promoting only when both are ready, eliminating the 2-member voting window entirely but that is out of scope for this enhancement


Maybe worth addressing this directly, instead of dealing with the potential split-brain situation from my previous comment?

Introduce the Mutable Topology enhancement, which replaces the previous Adaptable Topology proposal. Instead of a new topology enum that all operators must interpret, this approach uses a dedicated operator (OTTO) to orchestrate transitions between existing fixed topology modes. Initial scope: SNO to HA compact on platform: none.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move the topology transition controller from a standalone operator (OTTO) into cluster-config-operator. CCO owns the config.openshift.io API group and infrastructure CR lifecycle, making it the natural home.

Key design decisions:
- desiredTopology initialized by installer to match controlPlaneTopology (no kubebuilder default; the value is cluster-specific)
- Controller triggers on desiredTopology != status.controlPlaneTopology
- On failure, controller resets desiredTopology to current topology
- Upgrade blocked via Upgradeable=False during transitions
- Condition types: TopologyTransitionProgressing, Completed, Failed
- Per-operator topology audit required for Dev Preview entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci · 2026-05-18T16:33:39Z

@jeff-roche: all tests passed!


dhensel-rh · 2026-05-18T20:38:23Z

+- Resolution: CEO should attempt automatic rollback. If rollback fails, follow standard etcd disaster recovery procedures.
+
+### Recovery Procedures
+


As part of this transition, will backups scale?

If I take a backup on SNO, will it work on TNA or do I need to take a fresh/new backup?

I haven't thought through backups. @jaypoulz have you given this any thought? My initial thought is you would need to do a new backup as I'm not sure of how we would scale the backup.

dhensel-rh · 2026-05-18T20:43:15Z

Are there limitations for a SNO to TNF transition? TNF requires BMC/Redfish, so if the SNO bare metal hardware does not have it, does it block the transition? I could see this being a problem trying to match hardware in general (BMC firmware versions, vendor types, etc.).

JoelSpeed

This is much better than the previous iteration. I still feel like there's some disconnect between the new and old stuff; some stuff may still be hanging over from the previous iteration that doesn't quite make sense now. PTAL at my comments

JoelSpeed · 2026-05-19T08:58:10Z

+
+This enhancement enables OpenShift clusters to transition between topology modes as a Day 2 operation. This changes the existing OpenShift assumption that topologies are immutable after installation.
+
+A new `desiredTopology` field in the infrastructure spec expresses the administrator's intent to transition. A topology transition controller in cluster-config-operator watches for changes to this field, validates preconditions, coordinates the transition, and updates the existing topology status fields when the cluster is ready.


Is this for infrastructure or control plane, or both?

This is a fair question. In my head this entire process is about control plane scaling. I think we already have the necessary mechanisms in place to scale workers, right?

JoelSpeed · 2026-05-19T08:59:15Z

+This enhancement enables OpenShift clusters to transition between topology modes as a Day 2 operation. This changes the existing OpenShift assumption that topologies are immutable after installation.
+
+A new `desiredTopology` field in the infrastructure spec expresses the administrator's intent to transition. A topology transition controller in cluster-config-operator watches for changes to this field, validates preconditions, coordinates the transition, and updates the existing topology status fields when the cluster is ready.
+A new `oc adm transition topology` CLI command provides an interface for cluster administrators to initiate transitions.


Is this a common addition to the CLI? I have nothing against extending the CLI, but do question if it is strictly required

I think it is not strictly required, this is more of a usability thing. In theory a cluster admin could go in and update the desired topology and manually monitor progress but that might feel disconnected. Through the CLI we could give some structure to the process

JoelSpeed · 2026-05-19T08:59:39Z

+
+A new `desiredTopology` field in the infrastructure spec expresses the administrator's intent to transition. A topology transition controller in cluster-config-operator watches for changes to this field, validates preconditions, coordinates the transition, and updates the existing topology status fields when the cluster is ready.
+A new `oc adm transition topology` CLI command provides an interface for cluster administrators to initiate transitions.
+The initial implementation supports transitioning Single Node OpenShift (SNO) clusters to HA compact (3-node) on `platform: none`.


Hoping to see somewhere a documented reason for why we are only considering platform none

ack, I think this is covered a couple times in this doc but I can find a more explicit place to mention the reasoning

JoelSpeed · 2026-05-19T09:04:56Z

+
+This enhancement introduces a new infrastructure API field and a topology transition controller in cluster-config-operator (CCO; not to be confused with cloud-credential-operator) to enable topology transitions as Day 2 operations.
+
+The approach follows the standard Kubernetes spec/status contract and mirrors the pattern used by `oc adm upgrade`:


This pattern is more an OpenShift thing than a Kube thing

ack, will update wording

JoelSpeed · 2026-05-19T09:06:38Z

+
+3. **`oc adm transition topology` CLI command** — A command that validates preconditions before patching `spec.desiredTopology` on the infrastructure CR, then monitors transition progress.
+
+The transition controller is proposed to live in cluster-config-operator because CCO is the canonical location for config.openshift.io CRD manifests and bootstrap CR rendering, and the topology transition logic is tightly coupled to the Infrastructure CR schema it ships. This is a deliberate expansion of CCO's scope since historically the repo has been limited to CRD manifests and bootstrap rendering. The controller is feature-gated using the standard library-go FeatureGateAccess pattern: when the gate is disabled the controller is not registered with the manager and incurs negligible runtime overhead; a gate change triggers an operator restart via ForceExit so the new state is picked up cleanly.


the repo has been limited to CRD manifests and bootstrap rendering

This is not really true, but also doesn't materially affect what you're trying to say in this EP

TBH, this whole paragraph is fluff IMO

Good with me dropping it? It was a recommendation from chai-bot to add it and I figured it didn't hurt but agree it's fluff

JoelSpeed · 2026-05-19T09:50:48Z

+
+#### Risk: Platform Bare Metal May Not Support Single-Node Clusters
+
+**Risk**: If keepalived networking cannot be enabled, `platform: baremetal` will be limited to 2+ nodes, reducing the value of mutable topology for this platform.


That's a Pandora's box I don't want to look at.

You and me both

So are we tying this EP to not only supporting topology transitions, but also SNO on baremetal? I would have expected a SNO on baremetal project to be sufficiently large and warrant its own EP?

JoelSpeed · 2026-05-19T09:52:17Z

+
+#### Risk: Cannot Validate External Requirements
+
+**Risk**: On `platform: none`, the topology transition controller cannot validate external requirements such as correct load balancer configuration or DNS setup. An administrator may initiate a transition with misconfigured networking, leading to a partially functional cluster.


This is the first time load balancers are mentioned. Is this still something we expect the CCO to validate? Feels like that's up to the admin to set up before they initiate the transition, and not something we should be caring about IMO

JoelSpeed · 2026-05-19T09:53:11Z

+**Why it was rejected**:
+- The scope does not warrant a new operator — cluster-config-operator is the natural home for this logic since it already owns the `config.openshift.io` API group and infrastructure CR lifecycle
+- A standalone operator adds payload size, requires its own upgrade/lifecycle management, and introduces another component to monitor
+- The transition controller can live in CCO with zero overhead when not in use, gated by the `MutableTopology` feature gate


Nit: near zero

JoelSpeed · 2026-05-19T09:53:39Z

+
+## Open Questions
+
+1. **HyperShift considerations**: Since the scope has broadened from edge-specific deployments to changing the topology assumption for OpenShift as a whole, do we need to consider HyperShift support? Initial answer is no — this would be future work and require its own enhancement.


This doesn't feel like an open question if it has an answer

JoelSpeed · 2026-05-19T09:55:12Z

+| ---- | ----------- |
+| Precondition validation | Verify controller rejects transitions with missing nodes, invalid platforms, or unsupported source topologies |
+| CLI interaction | Verify `oc adm transition topology` correctly patches `spec.desiredTopology` and monitors progress |
+| Feature gate gating | Verify the controller is inactive when `MutableTopology` feature gate is disabled |


The API won't exist when the gate is disabled, so you won't be able to drive the controller even if it were running. I think this test is probably impossible if not superfluous

openshift-ci Bot requested review from bn222 and cooktheryan May 11, 2026 19:46

jeff-roche mentioned this pull request May 11, 2026

OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements #1905

Closed

jeff-roche changed the title ~~enhancements/topologies: mutable topology enhancement proposal~~ OCPEDGE-2280: mutable topology enhancement proposal May 11, 2026

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 11, 2026

jeff-roche changed the title ~~OCPEDGE-2280: mutable topology enhancement proposal~~ OCPEDGE-2280: mutable topology May 11, 2026

jeff-roche force-pushed the mutable-topology branch from 438e03c to 98a1ba4 Compare May 11, 2026 21:24

openshift-ci Bot assigned dgoodwin, dusk125, eggfoobar, jaypoulz, jerpeter1, JoelSpeed, patrickdillon, sdodson and tjungblu May 11, 2026

brandisher reviewed May 12, 2026

patrickdillon reviewed May 12, 2026

Comment thread enhancements/topologies/mutable-topology.md Outdated

brandisher reviewed May 13, 2026

Comment thread enhancements/topologies/mutable-topology.md Outdated

JoelSpeed reviewed May 13, 2026

patrickdillon reviewed May 13, 2026

zaneb reviewed May 14, 2026

DanielFroehlich reviewed May 18, 2026

jeff-roche and others added 2 commits May 18, 2026 12:15

jeff-roche force-pushed the mutable-topology branch from a8d48b3 to 22b3682 Compare May 18, 2026 16:16

dhensel-rh reviewed May 18, 2026

JoelSpeed reviewed May 19, 2026



		##### Pre-Transition

		1. The cluster administrator prepares the additional control-plane nodes (hardware, network, OS)


		The initial implementation targets `platform: none` clusters. On `platform: none`, the administrator is responsible for managing their own load balancing configuration (VIPs, DNS) when scaling beyond a single node.

		`platform: baremetal` support is planned for a subsequent phase. Bare metal networking uses keepalived for ingress load balancing, which is not useful and creates a point of failure for SNO deployments. The Bare Metal Networking team will be consulted to determine if this networking setup can be enabled for single-node clusters transitioning to HA.

	7. CEO promotes the learner to a voting member — the cluster now has 3 voting members (quorum=2)
	7. CEO promotes the learner to a voting member — the cluster now has 3 voting members (quorum=3)
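For reference, etcd uses majority quorum, so the quorum for n voting members is floor(n/2) + 1. The short Python sketch below walks the 1→2→3 scaling sequence mentioned in this PR; the function names are illustrative only and are not part of CEO. It shows that three voting members give a quorum of 2.

```python
def quorum(voting_members: int) -> int:
    """Majority quorum: the smallest member count that is more than half."""
    if voting_members < 1:
        raise ValueError("need at least one voting member")
    return voting_members // 2 + 1


def failure_tolerance(voting_members: int) -> int:
    """How many voting members can be lost while still reaching quorum."""
    return voting_members - quorum(voting_members)


# Walk the sequential scaling steps: 1 -> 2 -> 3 voting members.
for n in (1, 2, 3):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={failure_tolerance(n)}")
```

Note that the intermediate two-member state has a quorum of 2 and tolerates no failures, so the cluster only gains fault tolerance once the third member is promoted.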

		- Resolution: CEO should attempt automatic rollback. If rollback fails, follow standard etcd disaster recovery procedures.

		### Recovery Procedures


		This enhancement enables OpenShift clusters to transition between topology modes as a Day 2 operation. This changes the existing OpenShift assumption that topologies are immutable after installation.

		A new `desiredTopology` field in the infrastructure spec expresses the administrator's intent to transition. A topology transition controller in cluster-config-operator watches for changes to this field, validates preconditions, coordinates the transition, and updates the existing topology status fields when the cluster is ready.
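The spec/status flow described above can be sketched as a single reconcile step. This is a minimal illustration in Python, not the proposed controller. The `Infrastructure` dataclass, `preconditions_met`, `reconcile`, the returned status strings, and the three-node threshold are hypothetical names and values chosen for the example; only `desiredTopology`, `controlPlaneTopology`, and the `SingleReplica` and `HighlyAvailable` values come from the proposal. The PR summary says the controller resets `desiredTopology` on failure; clearing it to `None` is this sketch's assumption about what that reset looks like.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Infrastructure:
    # Stand-in for the Infrastructure CR, reduced to the fields discussed here.
    desired_topology: Optional[str]   # spec.desiredTopology (administrator intent)
    control_plane_topology: str       # status.controlPlaneTopology (observed state)
    platform: str = "None"


def preconditions_met(infra: Infrastructure, ready_control_plane_nodes: int) -> bool:
    # Hypothetical checks for the initial scope: SNO -> HA compact on platform none.
    return (
        infra.platform == "None"
        and infra.control_plane_topology == "SingleReplica"
        and infra.desired_topology == "HighlyAvailable"
        and ready_control_plane_nodes >= 3
    )


def reconcile(
    infra: Infrastructure,
    ready_control_plane_nodes: int,
    run_transition: Callable[[], bool],
) -> str:
    # Nothing to do when no intent is expressed or intent matches observed state.
    if infra.desired_topology in (None, infra.control_plane_topology):
        return "NoOp"

    if not preconditions_met(infra, ready_control_plane_nodes):
        infra.desired_topology = None  # assumption: reset means clearing the field
        return "Rejected"

    if not run_transition():
        infra.desired_topology = None  # reset so the controller does not retry forever
        return "Failed"

    # Status is updated only after the transition work has succeeded.
    infra.control_plane_topology = infra.desired_topology
    return "Completed"
```

For example, `reconcile(Infrastructure("HighlyAvailable", "SingleReplica"), 3, lambda: True)` returns `"Completed"` and leaves the observed topology at `HighlyAvailable`, while the same call with only one ready node returns `"Rejected"` and leaves the status untouched.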


		This enhancement introduces a new infrastructure API field and a topology transition controller in cluster-config-operator (CCO; not to be confused with cloud-credential-operator) to enable topology transitions as Day 2 operations.

		The approach follows the standard Kubernetes spec/status contract and mirrors the pattern used by `oc adm upgrade`:


