feat: auto-restart browser on crash with watchdog (#14) #22

Open
hayka-pacha wants to merge 1 commit into CloakHQ:main from hayka-pacha:feat/auto-restart

@hayka-pacha

What

Adds a background watchdog that detects crashed browser instances and automatically restarts them, closing #14.

Changes

  • browser_manager.py -- watchdog implementation:
    • Starts a background task
    • Probes each running profile via context.pages for liveness
    • Restarts crashed profiles with exponential backoff (1s, 2s, 4s, capped at 30s), giving up after 3 retries
    • Tracks user-stopped profiles to avoid unwanted restarts
    • Proper cleanup on shutdown
  • models.py -- new auto_restart boolean field on all profile models (default True)
  • database.py -- auto_restart column migration for existing databases
  • main.py -- starts watchdog on app lifespan startup
  • 12 new tests covering:
    • Watchdog task lifecycle (create, idempotent, cancel)
    • Crash detection and restart
    • No restart when auto_restart is disabled or the user stopped the profile
    • Backoff values and retry limits
    • Healthy profiles left alone
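
The backoff schedule described above (1s, 2s, 4s, capped at 30s, at most 3 retries) can be sketched as a small pure function. This is an illustration only: the names backoff_delay and MAX_RETRIES are hypothetical and are not taken from browser_manager.py.

```python
MAX_RETRIES = 3  # hypothetical constant; the PR states a limit of 3 retries


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return the delay in seconds before restart attempt `attempt` (0-based).

    The delay doubles on every attempt and is clamped to `cap`.
    """
    return min(base * (2 ** attempt), cap)


# Delays used for the three permitted retries.
delays = [backoff_delay(n) for n in range(MAX_RETRIES)]
print(delays)  # [1.0, 2.0, 4.0]
```

With a limit of 3 retries the 30s cap is never reached; it only matters if the retry limit is raised later (for example, backoff_delay(5) is clamped from 32 to 30.0).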

Test results

All 204 tests pass (192 existing + 12 new).

Design notes

  • The watchdog uses context.pages as a liveness probe -- lightweight, no side effects
  • A set of user-stopped profiles prevents the watchdog from restarting profiles the user explicitly stopped
  • Backoff resets to 1s after a successful restart

- Add watchdog task to BrowserManager that periodically checks if running
  profiles are still alive by probing context.pages
- If a profile crashes, automatically restart it with exponential backoff
  (1s, 2s, 4s, max 30s) and max 3 retries before giving up
- Add auto_restart boolean field to profile model (default True)
- Add auto_restart to database schema with migration for existing DBs
- Start watchdog in lifespan startup in main.py
- Explicitly stopped profiles are tracked to prevent unwanted restarts
- cleanup_all() cancels watchdog task on shutdown
- Fix lambda closure bug in context.on('close') callback
- Add 12 tests: crash detection, restart, backoff, max retries, healthy
  profiles, auto_restart disabled, user-stopped profiles, deleted profiles
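
The commit message does not show the lambda closure bug itself. A common form of this problem in Python is late binding: a lambda created in a loop looks up the loop variable when it is called, not when it is defined. The sketch below illustrates that general pitfall and the usual default-argument fix, assuming the bug was of this kind; the variable names are hypothetical.

```python
callbacks_buggy = []
callbacks_fixed = []

for profile_id in ["a", "b", "c"]:
    # Buggy: every lambda shares the same `profile_id` variable.
    callbacks_buggy.append(lambda: profile_id)
    # Fixed: the default argument captures the current value.
    callbacks_fixed.append(lambda pid=profile_id: pid)

buggy_results = [cb() for cb in callbacks_buggy]
fixed_results = [cb() for cb in callbacks_fixed]

print(buggy_results)  # ['c', 'c', 'c']
print(fixed_results)  # ['a', 'b', 'c']
```

In a close callback this matters because the wrong profile would be marked as closed if several contexts were registered in the same loop.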