Add instance restart policy #233
383ad8b to 365fa7d
Monitoring Plan: Restart Policy
This PR introduces a new … The primary risks to watch are: stop/start loops if the reconciler misidentifies instances, a regression in …
4f6aa80 to bb9b4f0
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit bb9b4f0.
    	m.notifyLifecycleEvent(ctx, LifecycleEventStart, inst)
    }
    return inst, err
    }
Restart ignores manual stop set between status write and start
Medium Severity
RestartInstance acquires the instance lock and calls startInstance without checking if manual_stop was set on the restart status. In startInstanceForRestartPolicy, setRestartStatusIfStopped acquires the lock, verifies no manual_stop, writes the attempt status, and releases the lock. Then RestartInstance separately acquires the lock to start. If a user calls StopInstance in the gap between these two lock acquisitions, markRestartManualStopLocked writes manual_stop to metadata, but RestartInstance never checks it—startInstance only validates instance state is Stopped, not the restart status. This TOCTOU race allows one unwanted automatic restart after an explicit manual stop.


Summary
Testing
Not run: full ./cmd/api/api package; it enters broader lifecycle coverage outside this focused change.
Note
Medium Risk
Adds new whole-instance restart supervision that can stop/start instances automatically based on exit/health signals, touching lifecycle state transitions and background controllers; misconfiguration or logic bugs could cause unexpected restart loops or suppressed restarts.
Overview
Adds instance restart supervision via new restart_policy (config) and restart_status (runtime) fields across the OpenAPI spec/generated oapi types and the instances API (CreateInstance, UpdateInstance, GetInstance mapping/validation).

Implements a new lib/restart-policy package plus an instances restart-policy controller that persists retry state, applies backoff/max-attempt/stable-window rules, suppresses restarts after manual StopInstance, and can treat health_check=unhealthy as a restart trigger for on_failure/always.

Wires the controller into cmd/api/main.go, extends health check controller to optionally call HandleHealthCheckUnhealthy, resets restart status on forks/snapshot restores, and adds unit + integration coverage for request mapping, validation, controller behavior, and metrics labels.