@jserv
Collaborator

@jserv jserv commented Oct 30, 2025

This refines the I/O coroutine API, which was designed for peripheral I/O but provided no benefit over inline polling.


Summary by cubic

Refined the coroutine system to focus on hart scheduling and moved peripheral I/O to inline polling for lower latency and simpler design. Added event-driven WFI with a round‑robin scheduler that sleeps when all harts are idle.
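As a rough illustration of inline polling every 64 instructions, here is a minimal sketch that assumes a per-emulator instruction counter. The struct, its fields, and the function names are hypothetical and are not taken from this PR's diff.

```c
#include <assert.h>
#include <stdint.h>

/* Poll interval from the summary; assumed to be a power of two here. */
#define POLL_INTERVAL 64

/* Hypothetical emulator state; field names are illustrative only. */
typedef struct {
    uint64_t insn_count; /* instructions executed so far */
    unsigned poll_count; /* number of peripheral polls performed */
} emu_sketch_t;

/* Stand-in for checking UART, virtio devices, etc. on the hart's own path. */
void poll_peripherals(emu_sketch_t *emu)
{
    emu->poll_count++;
}

/* Execute n instructions, polling peripherals inline on every 64th one. */
void run_instructions(emu_sketch_t *emu, uint64_t n)
{
    for (uint64_t i = 0; i < n; i++) {
        /* ... decode and execute one instruction here ... */
        emu->insn_count++;
        if ((emu->insn_count & (POLL_INTERVAL - 1)) == 0)
            poll_peripherals(emu);
    }
}
```

Because the poll happens on the hart's own execution path, no context switch to a separate I/O coroutine is needed, which is the latency argument made in the summary.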

  • Refactors

    • Removed I/O coroutines; peripherals now inline‑polled every 64 instructions.
    • New API: coro_init(total_slots, hart_slots); added CORO_INVALID_ID. Renamed to coro_create_hart(slot_id, func, arg) and coro_resume_hart(slot_id); coro_is_suspended(slot_id) checks only hart slots.
    • Scheduler drains timer events and monitors UART (kqueue/timerfd + poll) so WFI sleep blocks properly; virtio‑net sets vnet.ram unconditionally and skips queue refresh if peer is not initialized.
  • Migration

    • Replace coro_init(n_hart) with coro_init(total_slots, hart_slots).
    • Update calls to coro_create_hart/coro_resume_hart to use slot_id and arg.
    • Use coro_is_suspended(slot_id) only for hart slots; optionally use CORO_INVALID_ID when checking current hart ID.
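The migration steps above might look roughly like the following at a call site. Only the function names and parameter names come from the summary; the parameter types, return values, the value of CORO_INVALID_ID, and the choice of total_slots equal to hart_slots are assumptions. The function bodies are stubs so the example compiles and are not the project's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: value and type are guesses; only the name is from the summary. */
#define CORO_INVALID_ID UINT32_MAX
#define MAX_SLOTS 8 /* hypothetical limit for this stub */

/* Stub bookkeeping standing in for the real coroutine state. */
static struct {
    uint32_t total_slots, hart_slots;
    void (*func[MAX_SLOTS])(void *);
    void *arg[MAX_SLOTS];
} stub;

/* Stub: the real signature may differ (return type is assumed). */
bool coro_init(uint32_t total_slots, uint32_t hart_slots)
{
    if (hart_slots > total_slots || total_slots > MAX_SLOTS)
        return false;
    stub.total_slots = total_slots;
    stub.hart_slots = hart_slots;
    return true;
}

/* Stub: registers an entry function for a hart slot. */
bool coro_create_hart(uint32_t slot_id, void (*func)(void *), void *arg)
{
    if (slot_id >= stub.hart_slots || !func)
        return false;
    stub.func[slot_id] = func;
    stub.arg[slot_id] = arg;
    return true;
}

/* Stub: calls the entry function once instead of switching contexts. */
void coro_resume_hart(uint32_t slot_id)
{
    if (slot_id < stub.hart_slots && stub.func[slot_id])
        stub.func[slot_id](stub.arg[slot_id]);
}

/* Hypothetical hart entry point: counts how often it was resumed. */
void hart_entry(void *arg)
{
    (*(int *) arg)++;
}

/* Migrated call sites; the old code would have called coro_init(n_hart). */
int migrate_demo(uint32_t n_hart)
{
    int runs = 0;
    if (!coro_init(n_hart, n_hart)) /* assumption: no non-hart slots */
        return -1;
    for (uint32_t id = 0; id < n_hart; id++)
        if (!coro_create_hart(id, hart_entry, &runs))
            return -1;
    for (uint32_t id = 0; id < n_hart; id++)
        coro_resume_hart(id);
    return runs;
}
```

With these stubs, `migrate_demo(4)` creates and resumes four hart slots, and a slot ID outside the hart range is rejected by `coro_create_hart`.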

Written for commit cc633b4. Summary will update automatically on new commits.

@jserv jserv requested a review from shengwen-tw October 30, 2025 04:42
@jserv jserv requested review from chiangkd and ranvd October 30, 2025 04:48
cubic-dev-ai[bot]

This comment was marked as resolved.

@jserv jserv force-pushed the refine-coroutine branch 3 times, most recently from 98679e6 to 6a124dc Compare October 30, 2025 06:37
jserv added 4 commits October 30, 2025 17:51
This refines the I/O coroutine API, which was designed for peripheral I/O
but provided no benefit over inline polling.

When no network device is specified via the -n option, the virtio-net
device was left uninitialized but still exposed to the guest via the
device tree. This caused a segmentation fault when the guest attempted
to initialize the device.

Root cause analysis:
- WFI merge (e4ae87e) introduced conditional initialization:
  if (netdev) { virtio_net_init(); vnet.ram = ram; }
- Previous code always set vnet.ram regardless of netdev
- Guest kernel initializes all devices in device tree
- Without a valid vnet.ram pointer, the QueueReady handler crashed
  dereferencing NULL
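The shape of the fix described above (set the RAM pointer unconditionally, and skip the queue refresh when no peer is initialized) can be sketched as follows. The struct, field names, and functions are hypothetical simplifications and do not reproduce the emulator's actual virtio-net code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device state; not the emulator's real struct. */
typedef struct {
    uint32_t *ram;      /* guest RAM; must be valid even without a backend */
    bool peer_ready;    /* true only when a host network backend was set up */
    uint32_t last_desc; /* last descriptor word read from guest RAM */
} vnet_sketch_t;

/* Always set ram; mark the backend ready only when a netdev was requested. */
void vnet_setup(vnet_sketch_t *vnet, uint32_t *ram, const char *netdev)
{
    vnet->ram = ram; /* unconditional, so guest-driven accesses never see NULL */
    vnet->peer_ready = (netdev != NULL);
    vnet->last_desc = 0;
}

/* QueueReady-style handler: returns true if the queue was refreshed. */
bool vnet_queue_ready(vnet_sketch_t *vnet, uint32_t desc_index)
{
    /* The guest can trigger this access whether or not a backend exists. */
    vnet->last_desc = vnet->ram[desc_index];
    if (!vnet->peer_ready)
        return false; /* no peer: skip the queue refresh */
    /* ... refresh the queue against the host backend here ... */
    return true;
}
```

The guard matters because the guest initializes every device it finds in the device tree, including one that has no host-side backend.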

The poll()-based event loop was not consuming kqueue events on macOS,
so the kqueue fd remained readable after the first timer tick.
This defeated the WFI sleep optimization: poll(..., -1) returned
immediately instead of blocking when all harts were idle.

Root cause:
- Linux path: timerfd is consumed via read() after poll()
- macOS path: kqueue events were never drained after poll()
- Result: kqueue fd stays readable → poll() never blocks → 100% CPU
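The underlying behavior is that poll() is level-triggered: a descriptor keeps reporting as readable until the pending event is consumed. The sketch below demonstrates this with a pipe as a portable stand-in for the timerfd or kqueue descriptor. In the real event loop the drain step would be a read() on the timerfd or a kevent() call on the kqueue; the helper names here are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error. */
int is_readable(int fd)
{
    struct pollfd pfd = {.fd = fd, .events = POLLIN};
    int n = poll(&pfd, 1, 0); /* timeout 0: report state without blocking */
    if (n < 0)
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Consume the pending event so the fd stops reporting as readable. */
int drain(int fd)
{
    char buf[8];
    return read(fd, buf, sizeof(buf)) > 0 ? 0 : -1;
}

/* Returns before*100 + still*10 + after, or -1 on error. */
int drain_demo(void)
{
    int p[2];
    char tick = 1;
    int result = -1;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], &tick, 1) == 1) { /* simulate one timer tick */
        int before = is_readable(p[0]); /* event is pending */
        int still = is_readable(p[0]);  /* polling again does not consume it */
        if (drain(p[0]) == 0) {
            int after = is_readable(p[0]); /* consumed: no longer readable */
            result = before * 100 + still * 10 + after;
        }
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```

Here `drain_demo()` returns 110: the descriptor is readable on both polls before the drain and not readable afterwards. Only in that last state would a `poll(..., -1)` call block.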

GitHub Actions macOS runners are significantly slower than Linux. A
boot process that completes in 23-30 seconds locally takes more than
5 minutes in the CI environment.

2 participants