Part of the upgrade epic, Phase 3: the RPO=0, zero-downtime upgrade for CLUSTERED deployments, using the already-built raft + replication.
Rolling node-by-node upgrade (the etcd/Consul/ElastiCache/Redis pattern): upgrade an in-sync REPLICA first (it catches up from the primary), then PROMOTE it, then upgrade the old primary as a replica. Ownership moves only via a committed raft log entry plus a monotonic epoch - this fence IS synchronous promotion, so no acknowledged write is lost. Clients redirect on the failover; the dataset is never down.
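A minimal sketch of the fence described above, to make the "committed + monotonic epoch" rule concrete. The `ShardOwnership` type and its method names are hypothetical and are not taken from the ironcache codebase; the sketch only illustrates that ownership moves on a committed promotion with a strictly higher epoch, and that a node holding a stale epoch can no longer acknowledge writes.

```python
from dataclasses import dataclass


@dataclass
class ShardOwnership:
    """Hypothetical ownership record for one dataset (illustration only)."""
    owner: str
    epoch: int = 0

    def apply_promotion(self, new_owner: str, new_epoch: int, committed: bool) -> bool:
        """Move ownership only for a committed promotion that advances the epoch."""
        if not committed:
            return False  # an uncommitted raft entry never moves ownership
        if new_epoch <= self.epoch:
            return False  # stale or replayed promotion: epoch must strictly increase
        self.owner, self.epoch = new_owner, new_epoch
        return True

    def accepts_write(self, node: str, node_epoch: int) -> bool:
        """A node may acknowledge writes only as the owner at the current epoch."""
        return node == self.owner and node_epoch == self.epoch
```

Under these assumptions, once the promotion commits the old primary's epoch is stale, so it cannot acknowledge a write that the new primary would not see.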
Scope: an ironcache upgrade mode/flag that orchestrates the cluster-aware rolling upgrade: drive the replica upgrade via the spine, wait for in-sync, trigger the committed PromoteReplica, verify, then upgrade the old primary. Guardrails: refuse to promote a non-in-sync replica (the lag gate) and require quorum. NOT applicable to the single-node prod box without first standing up a replica (raft mode is a boot-only cluster decision). Async replication has a small loss window, bounded by the in-sync gate - document it.
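The orchestration order and guardrails in the scope can be sketched against an in-memory model. Everything here is an assumption for illustration: `Node`, `UpgradeRefused`, `has_quorum`, `rolling_upgrade`, and the `lag`/`max_lag` fields are hypothetical names, the version assignment stands in for driving the spine, and the role swap stands in for the committed PromoteReplica. A real orchestrator would poll for in-sync rather than read a static field.

```python
from dataclasses import dataclass


class UpgradeRefused(Exception):
    """Raised when a guardrail blocks the rolling upgrade."""


@dataclass
class Node:
    name: str
    version: str
    role: str             # "primary" or "replica"
    lag: int = 0          # replication lag in entries; 0 means in sync
    healthy: bool = True


def has_quorum(nodes):
    """True when a strict majority of nodes is healthy."""
    return sum(1 for n in nodes if n.healthy) > len(nodes) // 2


def rolling_upgrade(nodes, target, max_lag=0):
    """Upgrade replicas first, promote one, upgrade the old primary last.

    Returns the node names in the order they were upgraded.
    """
    if not has_quorum(nodes):
        raise UpgradeRefused("quorum required")
    primary = next(n for n in nodes if n.role == "primary")
    replicas = [n for n in nodes if n.role == "replica"]
    if not replicas:
        raise UpgradeRefused("single node: stand up a replica first")

    order = []
    for replica in replicas:
        replica.version = target  # stand-in for the spine-driven upgrade
        order.append(replica.name)

    candidate = replicas[0]
    # Lag gate: never promote a replica that is not in sync.
    if not candidate.healthy or candidate.lag > max_lag:
        raise UpgradeRefused(f"{candidate.name} is not in sync")

    # Stand-in for the committed PromoteReplica.
    candidate.role, primary.role = "primary", "replica"

    primary.version = target  # old primary is upgraded last, as a replica
    order.append(primary.name)
    return order
```

In this sketch a refused promotion leaves the primary untouched and still on the old version, which matches the intent that the primary is only upgraded after ownership has safely moved.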
Acceptance: in a raft cluster, an ironcache upgrade rolls the whole cluster to a new version with zero downtime and zero acknowledged-write loss, primary upgraded last. Depends on: the self-updater spine; prod/test running clustered.