[upgrade] Phase 2a: systemd socket-activation (LISTEN_FDS) - no connection-refused #389

Description

@ELares

Part of the upgrade epic, Phase 2. Remove the connection-REFUSED window on a single-node restart via systemd socket-activation.

Today IronCache binds its own RESP socket (SO_REUSEPORT); the systemd unit is Type=simple with no .socket unit. Add socket-activation so that SYSTEMD owns the listening socket and passes the fd to IronCache via the LISTEN_FDS/LISTEN_PID protocol (sd_listen_fds; the first passed fd is SD_LISTEN_FDS_START = 3). When socket-activated, IronCache uses the passed fd instead of binding its own. The socket stays open across an upgrade restart, so clients QUEUE in the kernel backlog (added latency) instead of getting ECONNREFUSED - perceived downtime collapses to the new process's startup/reload time.
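A minimal sketch of the adopt-or-bind decision described above, written in Python for illustration only (it is not IronCache's implementation language or code). The helper names passed_fds and open_listener are hypothetical; the environment variables and the start value 3 come from the sd_listen_fds protocol.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first passed descriptor in the sd_listen_fds protocol


def passed_fds(environ=None, pid=None):
    """Return the fds systemd passed to this process, or [] if not socket-activated.

    Hypothetical helper: environ and pid are injectable so the logic can be tested.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID must name this process; otherwise the variables were
        # inherited from a parent and the fds are not meant for us.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # unset or malformed: treat as not socket-activated
    if count <= 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def open_listener(host="0.0.0.0", port=6379, backlog=128, environ=None, pid=None):
    """Adopt the first passed fd if present, else fall back to self-bind.

    Returns (socket, adopted). The backlog default is a placeholder.
    """
    fds = passed_fds(environ, pid)
    if fds:
        # The adopted socket is already bound and listening; do not bind again.
        return socket.socket(fileno=fds[0]), True
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if hasattr(socket, "SO_REUSEPORT"):
        # Mirrors the self-bind behaviour the issue describes for today.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock, False
```

Keeping the environment check separate from the socket creation makes the fallback path easy to exercise without systemd: with an empty environment the function always self-binds.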

Scope: the engine reads LISTEN_FDS from the environment (after the LISTEN_PID check) and adopts the fd (Accept=no, single long-lived acceptor), with a clean fallback to self-bind when not socket-activated; packaging adds a .socket unit (ListenStream=6379, Backlog= tuned against net.core.somaxconn, which caps the effective backlog) and wires the .service to it; the Terraform user-data is updated to match. This is better than SO_REUSEPORT for this case because the single listen queue is never closed across the restart (a closed SO_REUSEPORT socket loses its queued connections).
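A sketch of what the packaging units could look like. The file names, the Description text, the Backlog value, and the ExecStart path are placeholders, not values taken from the IronCache packaging. The port is written as a bare number, one of the address forms systemd.socket accepts for ListenStream=.

```ini
# ironcache.socket (hypothetical file name)
[Unit]
Description=IronCache RESP listening socket

[Socket]
ListenStream=6379
# One long-lived service instance receives the listening fd itself.
Accept=no
# Placeholder value; the kernel caps the effective backlog at net.core.somaxconn.
Backlog=4096

[Install]
WantedBy=sockets.target
```

```ini
# ironcache.service (hypothetical file name; only the socket-related lines shown)
[Unit]
Requires=ironcache.socket
After=ironcache.socket

[Service]
Type=simple
# Placeholder path.
ExecStart=/usr/local/bin/ironcache
```

Because the two units share a base name, systemd hands the socket from ironcache.socket to ironcache.service without an explicit Service= setting.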

Acceptance: during an ironcache upgrade restart, a client connecting in a tight loop sees queued connections (added latency), NOT connection-refused; a non-socket-activated boot still works unchanged.
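One way the acceptance check could be exercised is a small tight-loop connect probe, sketched here in Python. The function name probe, its parameters, and the stats keys are hypothetical and not part of any existing test harness.

```python
import socket
import time


def probe(host, port, duration_s=1.0, timeout_s=5.0):
    """Connect in a tight loop for duration_s seconds and tally the outcomes."""
    stats = {"ok": 0, "refused": 0, "other_errors": 0, "max_connect_s": 0.0}
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        start = time.monotonic()
        try:
            # The connection is closed again as soon as it is established.
            with socket.create_connection((host, port), timeout=timeout_s):
                stats["ok"] += 1
        except ConnectionRefusedError:
            # This is the outcome the issue wants to eliminate.
            stats["refused"] += 1
        except OSError:
            # Timeouts, resets and similar failures are counted separately.
            stats["other_errors"] += 1
        elapsed = time.monotonic() - start
        stats["max_connect_s"] = max(stats["max_connect_s"], elapsed)
    return stats
```

Run against the node while the upgrade restart happens, the pass condition would be stats["refused"] == 0. Note that this sketch times only the TCP connect, which the kernel can complete while no process is accepting; to observe the queueing latency itself, the probe would also need to send a command such as a RESP PING and wait for the reply.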

Metadata

Assignees

No one assigned

    Labels

    area:upgrade - Binary self-upgrade (ironcache upgrade) workstream
    sub-issue - Granular child task split out from a parent design issue
