Sometimes the failover cannot increase the priority of a replica because of a failed assertion:
Line 596 in f72bb21
assert(replica.net_sequential_fail == 0)
2025-09-18 12:49:06.825 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.service_info service_info.lua:54 E> Error during failovering: {"name":"PROC_LUA","code":32,"base_type":"ClientError","type":"ClientError","details":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","message":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","trace":[{"file":"/home/serpentian/Programming/tnt/vshard/vshard/error.lua","line":284}]}
We need to investigate and fix that. Here are the full logs:
Logs
2025-09-18 12:49:06.788 [155737] main/400/main/vshard.router init.lua:1228 I> Starting router configuration
2025-09-18 12:49:06.788 [155737] main/400/main/vshard.router init.lua:1234 I> Calling box.cfg()...
2025-09-18 12:49:06.788 [155737] main/400/main/vshard.router init.lua:1246 I> Box has been configured
2025-09-18 12:49:06.788 [155737] main/401/lua/vshard.replicaset replicaset.lua:1498 I> Old replicaset and replica objects are outdated.
2025-09-18 12:49:06.788 [155737] main/400/main/vshard.router init.lua:1197 I> Master auto search is enabled
2025-09-18 12:49:06.788 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1903 I> New replica replica_1_b(storage@unix/:/tmp/t/010_router-luatest/replica_1_b.iproto) for replicaset(id="00000000-0000-0000-0000-000000000003", master=replica_1_a(storage@unix/:/tmp/t/010_router-luatest/replica_1_a.iproto))
2025-09-18 12:49:06.788 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1953 I> All replicas are ok
2025-09-18 12:49:06.788 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1965 I> Failovering step is finished. Schedule next after 1.000000 seconds
2025-09-18 12:49:06.788 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1347 W> health_status on 'replica_2_a' is unhealthy: Requests to 'replica_2_a' failed 53 times in a row
2025-09-18 12:49:06.788 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1347 W> health_status on 'replica_2_b' is unhealthy: Connection to 'replica_2_b' was down for 31.72157 seconds
2025-09-18 12:49:06.788 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.service_info service_info.lua:54 E> Error during failovering: {"name":"PROC_LUA","code":32,"base_type":"ClientError","type":"ClientError","details":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","message":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","trace":[{"file":"/home/serpentian/Programming/tnt/vshard/vshard/error.lua","line":284}]}
2025-09-18 12:49:06.788 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1965 I> Failovering step is finished. Schedule next after 1.000000 seconds
2025-09-18 12:49:06.789 [155737] main/404/vshard.replica.replica_1_a/vshard.replicaset replicaset.lua:1344 I> replication_status on 'replica_1_a' is healthy
2025-09-18 12:49:06.789 [155737] main/403/vshard.replica.replica_1_b/vshard.replicaset replicaset.lua:1344 I> replication_status on 'replica_1_b' is healthy
2025-09-18 12:49:06.789 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:704 E> Exception during calling 'vshard.storage._call' on 'replica_2_a(storage@unix/:/tmp/t/010_router-luatest/replica_2_a.iproto)': ...mming/tnt/vshard/test/router-luatest/router_2_2_test.lua:1210: TimedOut
2025-09-18 12:49:06.789 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1239 I> Master of replicaset 00000000-0000-0000-0000-000000000006, node 00000000-0000-0000-0000-000000000005, does not respond: {"code":32,"base_type":"LuajitError","type":"LuajitError","message":"...mming/tnt/vshard/test/router-luatest/router_2_2_test.lua:1210: TimedOut","trace":[{"file":"./src/lua/utils.c","line":700}]}. Trying to find a new one
2025-09-18 12:49:06.789 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:2065 I> Master search is started
2025-09-18 12:49:06.789 [155737] main/406/vshard.replica.replica_2_a/vshard.service_info service_info.lua:54 E> Ping error from replica_2_a(storage@unix/:/tmp/t/010_router-luatest/replica_2_a.iproto): perhaps a connection is down: {"code":32,"base_type":"LuajitError","type":"LuajitError","message":"...mming/tnt/vshard/test/router-luatest/router_2_2_test.lua:1210: TimedOut","trace":[{"file":"./src/lua/utils.c","line":700}]}
2025-09-18 12:49:06.789 [155737] main/406/vshard.replica.replica_2_a/vshard.replicaset replicaset.lua:306 I> disconnected from unix/:/tmp/t/010_router-luatest/replica_2_a.iproto
2025-09-18 12:49:06.789 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1344 I> health_status on 'replica_1_b' is healthy
2025-09-18 12:49:06.789 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1344 I> health_status on 'replica_1_a' is healthy
2025-09-18 12:49:06.789 [155737] main/402/vshard.replicaset.00000000-0000-0000-0000-000000000003/vshard.replicaset replicaset.lua:1953 I> All replicas are ok
2025-09-18 12:49:06.789 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1953 I> All replicas are ok
2025-09-18 12:49:06.794 [155737] main/408/unix/:/tmp/t/010_router-luatest/replica_2_a.iproto (net.box)/vshard.replicaset replicaset.lua:273 I> connected to unix/:/tmp/t/010_router-luatest/replica_2_a.iproto
2025-09-18 12:49:06.825 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.service_info service_info.lua:54 E> Error during failovering: {"name":"PROC_LUA","code":32,"base_type":"ClientError","type":"ClientError","details":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","message":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","trace":[{"file":"/home/serpentian/Programming/tnt/vshard/vshard/error.lua","line":284}]}
2025-09-18 12:49:06.825 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1965 I> Failovering step is finished. Schedule next after 1.000000 seconds
2025-09-18 12:49:06.826 [155737] main/406/vshard.replica.replica_2_a/vshard.service_info service_info.lua:54 E> Ping error from replica_2_a(storage@unix/:/tmp/t/010_router-luatest/replica_2_a.iproto): perhaps a connection is down: {"code":32,"base_type":"LuajitError","type":"LuajitError","message":"...mming/tnt/vshard/test/router-luatest/router_2_2_test.lua:1210: TimedOut","trace":[{"file":"./src/lua/utils.c","line":700}]}
2025-09-18 12:49:06.826 [155737] main/406/vshard.replica.replica_2_a/vshard.replicaset replicaset.lua:306 I> disconnected from unix/:/tmp/t/010_router-luatest/replica_2_a.iproto
2025-09-18 12:49:06.835 [155737] main/410/unix/:/tmp/t/010_router-luatest/replica_2_a.iproto (net.box)/vshard.replicaset replicaset.lua:273 I> connected to unix/:/tmp/t/010_router-luatest/replica_2_a.iproto
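To see how often the assertion fires and in which fiber, the log above can be filtered for `Error during failovering` entries. The following is a minimal sketch in Python, assuming the line layout seen in this excerpt (timestamp, pid, fiber path, `file:line`, level marker, message). The regular expression and the `failover_errors` helper are illustrative names, not part of vshard.

```python
import json
import re

# Assumed layout of one log line, inferred from the excerpt above:
# "<date> <time> [<pid>] <fiber path> <file>.lua:<line> <level>> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] "
    r"(?P<fiber>.+?) "            # fiber path may contain spaces, e.g. "(net.box)"
    r"(?P<src>\S+\.lua:\d+) "
    r"(?P<level>[A-Z])> "
    r"(?P<msg>.*)$"
)

PREFIX = "Error during failovering: "


def failover_errors(lines):
    """Yield (timestamp, fiber, details) for each failover error entry."""
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None or m["level"] != "E" or not m["msg"].startswith(PREFIX):
            continue
        raw = m["msg"][len(PREFIX):]
        try:
            payload = json.loads(raw)
            details = payload.get("details") or payload.get("message")
        except json.JSONDecodeError:
            # Keep the raw text if the payload was truncated in the log.
            details = raw
        yield m["ts"], m["fiber"], details


# Two lines copied from the log above: one failover error, one info message.
sample = [
    '2025-09-18 12:49:06.825 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.service_info service_info.lua:54 E> Error during failovering: {"name":"PROC_LUA","code":32,"base_type":"ClientError","type":"ClientError","details":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","message":".../serpentian/Programming/tnt/vshard/vshard/replicaset.lua:598: assertion failed!","trace":[{"file":"/home/serpentian/Programming/tnt/vshard/vshard/error.lua","line":284}]}',
    '2025-09-18 12:49:06.825 [155737] main/405/vshard.replicaset.00000000-0000-0000-0000-000000000006/vshard.replicaset replicaset.lua:1965 I> Failovering step is finished. Schedule next after 1.000000 seconds',
]

errors = list(failover_errors(sample))
for ts, fiber, details in errors:
    print(ts, fiber, details, sep=" | ")
```

Run against the full log, this should list only the failover errors, which makes it easier to line them up with the surrounding `health_status` warnings and reconnects to `replica_2_a`.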