feat(server): add support for SafetyStartDelay to mitigate double-sign risks #25627
Description
Summary
This PR adds support for CometBFT's safety_start_delay configuration. When the node is configured with disable_os_sync = true (optimized I/O mode), it waits for the specified duration before starting.
This keeps validator startup safe whenever O_SYNC durability guarantees are intentionally relaxed.
This is a companion PR to CometBFT PR cometbft/cometbft#5515.
Problem Statement
Validators operating on low-end hardware often face I/O bottlenecks during block commitment, leading to missed blocks. A proposed solution in CometBFT allows disabling os.O_SYNC (disable_os_sync) to eliminate this bottleneck.
However, disabling O_SYNC introduces a risk of "amnesia" (double signing): if the node restarts immediately after a power failure, its local state may not have been persisted to disk.
Solution
We implement a Safety Start Delay mechanism in the start command.
Related Issues
Backward Compatibility
How to Test