50329: kvserver,cli,roachtest,sql: introduce a fully decommissioned bit r=tbg a=irfansharif
Ignore first commit, cherry-picked from @andreimatei's cockroachdb#49135.
---
This commit introduces a fully decommissioned bit to Cockroach.
Previously our Liveness schema only contained a decommissioning bool,
so there was no way to disambiguate between a node currently
undergoing decommissioning and a node that was fully decommissioned. We
used some combination of the store dead threshold to surface "fully
decommissioned" nodes in our UI, but it was never quite accurate. We need
this specificity for the Connect RPC.
We wire up the new CommissionStatus enum that's now part of the liveness
record. In doing so we elide usage of the decommissioning bool used in
v20.1. We're careful to maintain an on-the-wire representation of the
Liveness record that will be understood by v20.1 nodes (see
Liveness.EnsureCompatible for details).
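A minimal sketch of how a record could carry both representations so that v20.1 nodes keep seeing a meaningful bool. This is not the actual Liveness.EnsureCompatible code: the `livenessRecord` struct, the `Unknown` zero value (standing in for "field not set by an older node"), and the choice to report both DECOMMISSIONING and DECOMMISSIONED as `true` on the legacy bool are all assumptions made for illustration.

```go
package main

import "fmt"

// CommissionStatus is an illustrative stand-in for the enum described above.
// Unknown models a record written by a node that never set the new field
// (an assumption of this sketch).
type CommissionStatus int

const (
	Unknown CommissionStatus = iota
	Commissioned
	Decommissioning
	Decommissioned
)

// livenessRecord is a hypothetical, pared-down record that carries both the
// legacy bool and the new enum.
type livenessRecord struct {
	NodeID                int
	LegacyDecommissioning bool
	Status                CommissionStatus
}

// ensureCompatible makes the two representations agree. If only the legacy
// bool was populated, the enum is derived from it; afterwards the bool is
// re-derived from the enum so that older readers see a consistent value.
func (l *livenessRecord) ensureCompatible() {
	if l.Status == Unknown {
		if l.LegacyDecommissioning {
			l.Status = Decommissioning
		} else {
			l.Status = Commissioned
		}
	}
	// Assumption: older nodes treat in-progress and finished decommissions alike.
	l.LegacyDecommissioning = l.Status == Decommissioning || l.Status == Decommissioned
}

func main() {
	fromOldNode := livenessRecord{NodeID: 1, LegacyDecommissioning: true}
	fromOldNode.ensureCompatible()
	fmt.Println(fromOldNode.Status == Decommissioning)

	fromNewNode := livenessRecord{NodeID: 2, Status: Decommissioned}
	fromNewNode.ensureCompatible()
	fmt.Println(fromNewNode.LegacyDecommissioning)
}
```

Deriving the bool from the enum on every call keeps the enum as the single source of truth, with the bool as a read-only projection for older nodes.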
We repurpose the AdminServer.Decommission RPC (now a misnomer; it should
be thought of as being named Commission instead) to persist
CommissionStatus states to KV through the lifetime of a node
decommissioning/recommissioning. See cli/node.go for where that's done.
For recommissioning a node, it suffices to simply persist a COMMISSIONED
state. When decommissioning a node, since it's a longer running process,
we first persist an in-progress DECOMMISSIONING state, and once we've
moved off all the Replicas in the node, we finalize the decommissioning
process by persisting the DECOMMISSIONED state.
When transitioning between CommissionStatus states, we CPut against
what's already there, disallowing illegal state transitions. The
appropriate error codes are surfaced back to the user. For example,
attempting to recommission a fully decommissioned node fails with the
following error:
> command failed: illegal commission status transition: can only
> recommission a decommissioning node; n4 found to be decommissioned
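The transition check and conditional write described above can be sketched roughly as follows. This is an illustration rather than the code in this PR: `statusStore` is a hypothetical in-memory stand-in for KV, `cput` only mimics the compare-and-put idea, and the sketch accepts just the three transitions the text names (COMMISSIONED to DECOMMISSIONING, DECOMMISSIONING to DECOMMISSIONED, DECOMMISSIONING to COMMISSIONED). Whether the real code permits others, such as re-writing the current state, is not stated here. Nodes absent from the map are assumed to be commissioned.

```go
package main

import (
	"fmt"
	"sync"
)

// CommissionStatus is an illustrative stand-in for the enum in the liveness record.
type CommissionStatus int

const (
	Commissioned CommissionStatus = iota
	Decommissioning
	Decommissioned
)

func (s CommissionStatus) String() string {
	switch s {
	case Commissioned:
		return "commissioned"
	case Decommissioning:
		return "decommissioning"
	case Decommissioned:
		return "decommissioned"
	}
	return "unknown"
}

// validateTransition accepts only the three transitions named in the text.
func validateTransition(nodeID int, from, to CommissionStatus) error {
	switch {
	case from == Commissioned && to == Decommissioning,
		from == Decommissioning && to == Decommissioned,
		from == Decommissioning && to == Commissioned:
		return nil
	case to == Commissioned:
		return fmt.Errorf("illegal commission status transition: can only recommission a decommissioning node; n%d found to be %s", nodeID, from)
	default:
		return fmt.Errorf("illegal commission status transition: n%d cannot move from %s to %s", nodeID, from, to)
	}
}

// statusStore is a hypothetical in-memory stand-in for the KV layer.
type statusStore struct {
	mu sync.Mutex
	m  map[int]CommissionStatus
}

// cput writes next only if the stored value still equals expected.
func (s *statusStore) cput(nodeID int, expected, next CommissionStatus) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur := s.m[nodeID]; cur != expected {
		return fmt.Errorf("condition failed: n%d is %s, expected %s", nodeID, cur, expected)
	}
	s.m[nodeID] = next
	return nil
}

// setStatus reads the current status, validates the transition, and then
// writes conditionally so that a concurrent change is detected.
func (s *statusStore) setStatus(nodeID int, next CommissionStatus) error {
	s.mu.Lock()
	cur := s.m[nodeID]
	s.mu.Unlock()
	if err := validateTransition(nodeID, cur, next); err != nil {
		return err
	}
	return s.cput(nodeID, cur, next)
}

func main() {
	s := &statusStore{m: map[int]CommissionStatus{}}
	fmt.Println(s.setStatus(4, Decommissioning)) // in-progress marker
	fmt.Println(s.setStatus(4, Decommissioned))  // finalize once replicas are moved off
	fmt.Println(s.setStatus(4, Commissioned))    // rejected: node is fully decommissioned
}
```

The conditional write matters because the read and the write are separate steps: if another client changes the status in between, the write fails instead of silently overwriting a newer state.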
Note that this is a behavioral change for `cockroach node recommission`.
Previously it was able to recommission any "decommissioned" node,
regardless of how long ago it was removed from the cluster. Now
recommission serves to only cancel an accidental decommissioning process
that wasn't finalized.
The 'decommissioning' column in crdb_internal.gossip_liveness is now
powered by this new CommissionStatus, and we introduce a new
commission_status column to it. We also introduce the same column to
the output generated by `cockroach node status --decommission`. The
is_decommissioning column still exists, but is also powered by the
CommissionStatus now.
We also iron out the events plumbed into system.eventlog: it now has a
dedicated event for "node decommissioning".
Release note (general change): `cockroach node recommission` has new
semantics. Previously it was able to recommission any decommissioned node,
regardless of how long ago it was fully removed from the cluster. Now
recommission serves to only cancel an accidental decommissioning process
that wasn't finalized.
Release note (cli change): We introduce a commission_status column to
the output generated by `cockroach node status --decommission`. It
should be used in favor of the is_decommissioning column going forward.
Co-authored-by: irfan sharif <[email protected]>
docs/generated/settings/settings.html (1 addition, 1 deletion)
@@ -72,6 +72,6 @@
 <tr><td><code>trace.debug.enable</code></td><td>boolean</td><td><code>false</code></td><td>if set, traces for recent requests can be seen in the /debug page</td></tr>
 <tr><td><code>trace.lightstep.token</code></td><td>string</td><td><code></code></td><td>if set, traces go to Lightstep using this token</td></tr>
 <tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set</td></tr>
-<tr><td><code>version</code></td><td>custom validation</td><td><code>20.1-10</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>
+<tr><td><code>version</code></td><td>custom validation</td><td><code>20.1-11</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>