Description
Decommissioning a CRDB node requires knowing its node ID. Node IDs are printed and logged on startup, but I'd like a more precise way to programmatically determine the ID of a running node.
Initially I thought I could gather it from the output of cockroach node status
based on the listening address, but that can be imprecise; e.g., if I do gross things with running processes and timing, I can get output like this:
id | address | sql_address | build | started_at | updated_at | locality | is_available | is_live
-----+-------------+-------------+---------+----------------------------+----------------------------+----------+--------------+----------
1 | [::1]:33331 | [::1]:33331 | v22.1.9 | 2024-06-03 12:46:44.335389 | 2024-06-03 12:50:06.881011 | | true | true
2 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:46:52.257414 | 2024-06-03 12:49:11.929403 | | false | false
7 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:50:06.200582 | 2024-06-03 12:50:06.880881 | | true | true
8 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:49:14.815347 | 2024-06-03 12:50:04.356989 | | true | true
where nodes 2 and 8 used to be running at a particular address; now node 7 is running there, but based on just this output I can't tell whether it's 7 or 8 (since both claim to be available and live).
I asked on Slack, and was told about
select crdb_internal.node_id();
which is exactly what I want. (For my use case I don't particularly care whether this comes from the SQL shell or the HTTP API or a CLI invocation.) However, I can't find this in the crdb-internal docs, and therefore don't know whether it would get the "use in production" stability checkmark.
Could node_id()
be added to those docs? If it's not suitable for production use, is there a different way to gather node IDs?
Jira issue: CRDB-39187