@rescrv rescrv commented Dec 29, 2025

Description of changes

Implement a quorum-based coordination mechanism that:

  • Runs futures in parallel and waits for a minimum count of Ok results
  • Starts a timeout after reaching the quorum threshold
  • Cancels remaining futures that exceed the timeout
  • Returns results in original order with None for cancelled futures

This enables handling partial quorum failures where some writers may
be slow or unresponsive, allowing the system to proceed once a quorum
of successful writes is achieved while still attempting to maximize
replication within a bounded time window.
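The two-phase behaviour described above can be sketched with only the standard library. This is a thread-and-channel analogue, not the PR's implementation, which drives futures. The name `quorum_wait` and its parameters are hypothetical. Threads also cannot be cancelled the way futures can, so a straggler here is simply no longer waited for.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical thread-based stand-in for the quorum wait described above.
/// Returns one slot per task, in input order. `None` means the task had not
/// reported by the time the grace period ended.
fn quorum_wait<S, E>(
    tasks: Vec<Box<dyn FnOnce() -> Result<S, E> + Send>>,
    min_ok: usize,
    grace: Duration,
) -> Vec<Option<Result<S, E>>>
where
    S: Send + 'static,
    E: Send + 'static,
{
    let total = tasks.len();
    let mut results: Vec<Option<Result<S, E>>> = (0..total).map(|_| None).collect();
    let (tx, rx) = mpsc::channel();

    // Run every task in parallel, tagging each result with its input index.
    for (idx, task) in tasks.into_iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            // The receiver may already be gone; ignore the send error.
            let _ = tx.send((idx, task()));
        });
    }
    drop(tx); // Only the worker threads hold senders from here on.

    let mut ok_count = 0;
    let mut received = 0;

    // Phase 1: no deadline; wait until `min_ok` tasks have returned Ok.
    while ok_count < min_ok {
        match rx.recv() {
            Ok((idx, res)) => {
                if res.is_ok() {
                    ok_count += 1;
                }
                results[idx] = Some(res);
                received += 1;
            }
            // Every task has finished and the quorum was never reached.
            Err(_) => return results,
        }
    }

    // Phase 2: the grace period starts only once the quorum is reached.
    let deadline = Instant::now() + grace;
    while received < total {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok((idx, res)) => {
                results[idx] = Some(res);
                received += 1;
            }
            // Timed out (or all senders gone): stragglers stay `None`.
            Err(_) => break,
        }
    }
    results
}
```

With three tasks, a quorum of two, and one task that sleeps well past the grace period, the slow task's slot comes back as `None` while the other two keep their results in their original positions.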

Test plan

Unit tests + CI

Migration plan

N/A

Observability plan

N/A

Documentation Changes

N/A

Co-authored-by: AI

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

propel-code-bot bot commented Dec 29, 2025

The change introduces a write_quorum helper that is exported for wider WAL consumption and is backed by extensive async tests spanning empty inputs, mixed success and error scenarios, timeout handling, ordering guarantees, and quorum edge cases.

Affected Areas

rust/wal3/src/quorum_writer.rs
rust/wal3/src/lib.rs

This summary was automatically generated by @propel-code-bot

@rescrv rescrv force-pushed the rescrv/quorum-writer branch from aacd96b to 1661c23 Compare December 29, 2025 18:04
@rescrv rescrv changed the base branch from main to rescrv/topology-config December 29, 2025 18:04
@rescrv rescrv requested a review from sanketkedia December 29, 2025 18:07
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from 1661c23 to a57a27d Compare December 29, 2025 22:50
@rescrv rescrv force-pushed the rescrv/topology-config branch from 26bc4bf to adfd6a8 Compare December 30, 2025 21:46
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from a57a27d to af49672 Compare December 30, 2025 21:53
@rescrv rescrv force-pushed the rescrv/topology-config branch from adfd6a8 to 038ac2e Compare December 30, 2025 23:09
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from af49672 to 5ba021d Compare December 30, 2025 23:09
@rescrv rescrv force-pushed the rescrv/topology-config branch from 038ac2e to ded3a45 Compare December 31, 2025 00:10
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from 5ba021d to d1d3137 Compare December 31, 2025 00:10
@rescrv rescrv force-pushed the rescrv/topology-config branch from ded3a45 to 82b5763 Compare December 31, 2025 01:05
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from d1d3137 to ed229e0 Compare December 31, 2025 01:05
@rescrv rescrv force-pushed the rescrv/topology-config branch from 82b5763 to b90737d Compare January 2, 2026 22:59
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from ed229e0 to 2f00206 Compare January 2, 2026 23:12
let mut ok_count = 0;

// Phase 1: Wait for the minimum number of Ok futures to complete.
while ok_count < min_futures_to_wait_for {
Contributor

So phase 1, by design, has no deadline and can get stuck here forever? It would be useful to add a comment here explaining the intention.

Contributor Author

The individual futures are assumed to be timeout-friendly, as is currently the case with all quorum ops. This gives the ability to control the timeout on a per-future basis without muddying this code.
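To illustrate the reply above: if every operation carries its own bound, a deadline-free phase 1 still terminates, because each operation eventually reports either a value or an error. The sketch below is a standard-library, thread-based analogue. The names `with_timeout` and `WriteError` are hypothetical and do not come from the PR. In async code the same role is commonly played by a combinator such as `tokio::time::timeout`, though the source does not say which mechanism the quorum ops actually use.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type: the operation failed outright or exceeded its bound.
#[derive(Debug, PartialEq)]
enum WriteError {
    TimedOut,
    Failed(String),
}

/// Runs `op` on a helper thread and stops waiting after `limit`.
/// The caller always gets an answer within roughly `limit`, so a loop that
/// counts Ok results cannot block forever on this operation.
fn with_timeout<S, F>(limit: Duration, op: F) -> Result<S, WriteError>
where
    S: Send + 'static,
    F: FnOnce() -> Result<S, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The caller may have stopped listening; ignore the send error.
        let _ = tx.send(op());
    });
    match rx.recv_timeout(limit) {
        Ok(Ok(value)) => Ok(value),
        Ok(Err(msg)) => Err(WriteError::Failed(msg)),
        // Either the limit elapsed or the helper thread died without sending;
        // this sketch reports both cases as a timeout.
        Err(_) => Err(WriteError::TimedOut),
    }
}
```

Keeping the bound inside each operation lets different writers use different limits without the quorum loop knowing about any of them, which matches the separation the reply argues for.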

@rescrv rescrv force-pushed the rescrv/topology-config branch from b90737d to 4052be6 Compare January 6, 2026 17:13
Base automatically changed from rescrv/topology-config to main January 6, 2026 17:51
rescrv added 2 commits January 6, 2026 10:15
@rescrv rescrv force-pushed the rescrv/quorum-writer branch from 2f00206 to 06ee7ec Compare January 6, 2026 18:16
Comment on lines +51 to +56
type IndexedFuture<S, E> = Pin<Box<dyn Future<Output = (usize, Result<S, E>)> + Send>>;
let mut pending: FuturesUnordered<IndexedFuture<S, E>> = FuturesUnordered::new();

for (idx, fut) in futures.into_iter().enumerate() {
    pending.push(Box::pin(async move { (idx, fut.await) }));
}
Contributor

Recommended

[Performance] Since futures is a homogeneous Vec<F>, you can store the generated async blocks directly in FuturesUnordered without type erasure. This removes the need for Pin<Box<dyn Future...>>, saving one heap allocation per future and avoiding dynamic dispatch overhead.

The compiler will generate a single anonymous type for the async move block across all iterations.

File: rust/wal3/src/quorum_writer.rs
Line: 56
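
The suggestion above rests on the fact that an `async move` block written once in a loop body has one concrete type, so a homogeneous container can hold every iteration's future without `Box<dyn Future>`. The sketch below demonstrates this using only the standard library. `Vec` stands in for `FuturesUnordered` (both are generic over a single element type), and `index_futures`, `block_on`, and `ThreadWaker` are hypothetical helpers written for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Tags each future with its input index, with no boxing and no `dyn`.
/// All elements share the single anonymous type of the `async move` block.
fn index_futures<F, S, E>(futures: Vec<F>) -> Vec<impl Future<Output = (usize, Result<S, E>)>>
where
    F: Future<Output = Result<S, E>>,
{
    let mut pending = Vec::with_capacity(futures.len());
    for (idx, fut) in futures.into_iter().enumerate() {
        pending.push(async move { (idx, fut.await) });
    }
    pending
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, included only so the sketch can run
/// without an async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

The trade-off is that the element type can no longer be named in a type alias like `IndexedFuture<S, E>`; it has to be inferred or expressed as `impl Future`, as in the return type here.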

@rescrv rescrv merged commit 715ec58 into main Jan 6, 2026
64 checks passed