Conversation

Contributor

@yngvar-antonsson yngvar-antonsson commented Jun 6, 2025

When Raft failover is enabled on replicasets with fewer than 3 instances (typically routers), a bug occurs in leader selection:

  • Leader is changed in the replicaset

  • Router learns about new leader via membership

  • Config update triggers failover.cfg

  • Router incorrectly reverts to the old leader (first lexicographical instance) as if in 'disabled' mode

This causes vshard to return NON_MASTER errors, since the wrong storage instance ends up appointed as master.
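To make the faulty path concrete, here is a minimal sketch (hypothetical and simplified, not the actual cartridge internals) of the 'disabled'-mode fallback the router wrongly applied:

-- Hypothetical sketch of the 'disabled'-mode fallback the router fell into.
-- It ignores the Raft leader learned via membership and simply appoints the
-- first instance in lexicographical order.
local function disabled_mode_appointments(replicasets)
    local appointments = {}
    for replicaset_uuid, instance_names in pairs(replicasets) do
        table.sort(instance_names) -- lexicographical order
        appointments[replicaset_uuid] = instance_names[1]
    end
    return appointments
end

-- Even after storage-2 was promoted via box.ctl.promote(), this still
-- returns storage-1, so vshard requests hit a replica and fail with NON_MASTER.
print(disabled_mode_appointments({
    ['storage-uuid'] = {'storage-2', 'storage-1'},
})['storage-uuid']) --> storage-1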

I didn't forget about

  • Tests
  • Changelog
  • Documentation

Closes https://jira.vk.team/browse/TNT-1415

@yngvar-antonsson yngvar-antonsson requested a review from Copilot June 6, 2025 15:07

Copilot AI left a comment


Pull Request Overview

This PR fixes incorrect initial leader appointments during Raft failover when a replicaset has fewer than three instances, and adds a test to verify leader selection history under Raft mode.

  • Introduce get_leaders_history in the test service to capture applied leader snapshots.
  • Enhance the Raft failover logic to appoint a leader in small clusters on initial configuration.
  • Replace disabled-mode appointment logic with the unified Raft get_appointments and add integration tests.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

File                              Description
test/integration/raft_test.lua    Add raft_leaders_history test group and a new test
test/entrypoint/srv_basic.lua     Track and expose leaders_history in the test module
cartridge/failover/raft.lua       Insert fallback leader appointment for small clusters
cartridge/failover.lua            Use raft_failover.get_appointments in disabled mode
CHANGELOG.rst                     Document “Invalid leader appointment in Raft failover”
Comments suppressed due to low confidence (3)

cartridge/failover.lua:961

  • Ensure that raft_failover is properly imported (e.g. local raft_failover = require('cartridge.failover.raft')) at the top of this file; otherwise this call will result in a nil-reference error.
first_appointments = raft_failover.get_appointments(topology_cfg)
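For reference, a minimal sketch of the wiring this comment asks about (the exact placement in cartridge/failover.lua is an assumption; only the get_appointments call is shown in the diff):

-- Sketch: the module must be required once near the top of cartridge/failover.lua;
-- without it the call below would fail with a nil-reference error.
local raft_failover = require('cartridge.failover.raft')

-- ... later, in the disabled-mode branch:
first_appointments = raft_failover.get_appointments(topology_cfg)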

test/integration/raft_test.lua:345

  • Verify that the Cluster helper defines wait_until_healthy(). If not, calling this method will raise an undefined-method error in the test.
g.cluster:wait_until_healthy()

test/entrypoint/srv_basic.lua:35

  • [nitpick] The leaders_history table persists across service restarts and tests. Consider clearing it in a stop or init hook to avoid cross-test contamination.
local leaders_history = {}
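One way to address the nitpick, assuming leaders_history is a module-local table in srv_basic.lua, is to wipe it in the role's stop callback (a sketch, not the actual test code):

-- Sketch: clear the accumulated history when the test service stops,
-- so one test run cannot contaminate the next.
local leaders_history = {}

local function stop()
    -- wipe in place so any existing references also see an empty history
    for k in pairs(leaders_history) do
        leaders_history[k] = nil
    end
end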

appointments[vars.replicaset_uuid] = vars.leader_uuid
-- If there is still no leader in the replicaset, appoint the first
-- electable instance as a leader.
for replicaset_uuid, _ in pairs(replicasets) do

Copilot AI Jun 6, 2025


The threshold '3' for small clusters is a magic number; consider extracting it into a named constant (e.g. MIN_RAFT_REPLICASET_SIZE) or a config parameter for clarity and easier adjustment.
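For example (a sketch; MIN_RAFT_REPLICASET_SIZE is a suggested name, not an existing identifier in the codebase):

-- Sketch: lift the magic number into a named module-level constant.
local MIN_RAFT_REPLICASET_SIZE = 3

if vars.disable_raft_on_small_clusters
and #leaders < MIN_RAFT_REPLICASET_SIZE then
    appointments[replicaset_uuid] = leaders[1]
end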

Comment on lines 64 to 72
    if appointments[replicaset_uuid] == nil then
        local leaders = topology.get_leaders_order(
            topology_cfg, replicaset_uuid, nil,
            {only_electable = false, only_enabled = true}
        )
        if vars.disable_raft_on_small_clusters and #leaders < 3 then
            appointments[replicaset_uuid] = leaders[1]
        end
    end
end
Contributor

we need another test for this case
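A rough sketch of what such a test could look like in luatest (the group name, cluster helpers, and get_active_leaders usage mirror existing cartridge test patterns, but their exact availability here is an assumption):

local t = require('luatest')
-- g is the raft test group defined earlier in raft_test.lua (assumption)

-- Sketch: after initial configuration, a small replicaset should still
-- get some leader appointed instead of being left leaderless.
g.test_small_replicaset_initial_appointment = function()
    g.cluster:wait_until_healthy()
    local leaders = g.cluster.main_server:exec(function()
        return require('cartridge.failover').get_active_leaders()
    end)
    t.assert_not_equals(next(leaders), nil)
end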

-- If there is still no leader in the replicaset, appoint the first
-- electable instance as a leader.
for replicaset_uuid, _ in pairs(replicasets) do
    if appointments[replicaset_uuid] == nil then
Contributor

failover.disable doesn't change the membership payload.

If we disable an instance, its raft_term and leader_uuid remain the same.

If the replicaset previously had instances running Raft, they keep the same raft_term and leader_uuid, so the old leader is still reported as it was under Raft.

Therefore, it's better not to check for nil.
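In other words, a disabled instance's membership payload might still carry stale Raft fields, roughly like this (an illustrative shape, not the literal payload format):

-- Illustrative payload of a disabled instance: the Raft fields persist,
-- so `appointments[replicaset_uuid] == nil` is not a reliable signal.
local payload = {
    state = 'Disabled',
    raft_term = 5,                           -- unchanged after failover.disable
    leader_uuid = 'uuid-of-the-old-leader',  -- still points at the pre-disable leader
}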

@Satbek Satbek force-pushed the incorrect-appointments branch from 6d2137e to 5027c9d on June 10, 2025 15:07
@a1div0 a1div0 self-requested a review June 11, 2025 05:43
@Satbek Satbek force-pushed the incorrect-appointments branch from 4599156 to 887c75c on June 11, 2025 06:26
Satbek added 2 commits June 11, 2025 11:43
Reproducer:
1. Start a cluster with one router and a storage replicaset of three instances
   (storage-1 is the initial leader).
2. Call `box.ctl.promote()` on storage-2.
3. Wait until the router learns the new leader via membership.
4. Patch the cluster-wide config to trigger `failover.cfg`.
5. The router reverts to the old master because it is lexicographically first
   in the topology, mimicking ‘disabled’ mode.

Fix: always build `first_appointments` via
`raft_failover.get_appointments(topology_cfg)` so routers do not fall back to
stale Raft information when the replicaset has fewer than three instances.
…ments in Raft mode

If `disable_raft_on_small_clusters` is enabled and a replicaset contains fewer
than three instances, treat it like ‘disabled’ failover and pick the first
enabled instance (even a non-electable one) as leader.
@Satbek Satbek force-pushed the incorrect-appointments branch from 887c75c to 4532fb9 on June 11, 2025 06:51
@Satbek Satbek merged commit 100105d into master Jun 11, 2025
22 checks passed
@Satbek Satbek deleted the incorrect-appointments branch June 11, 2025 07:19