feat(health): add support for TCP and none health check types #1531
AruneshDwivedi wants to merge 4 commits into
Currently, health-check.sh only supports HTTP health checks via curl. Services using TCP protocols (like piper-audio with Wyoming) or CLI tools (like aider) either fail or are always shown as unhealthy.

Changes:
- Add SERVICE_HEALTH_TYPES associative array to service registry
- Parse optional 'health_type' field from manifest (default: http)
  - 'http': current behavior, HTTP GET via curl (default)
  - 'tcp': check if port accepts TCP connections
  - 'none': skip health check entirely (CLI tools)
- Update test_service() to handle all three types
- Update check_service() to skip CLI tools
- Update display logic to show 'skipped' for none-type services

Fixes Light-Heart-Labs#666

Signed-off-by: Arunesh Dwivedi <arunesh.devops@gmail.com>
Assisted-by: OWL
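The three health check types described in this commit can be illustrated with a minimal sketch. This is not the PR's code: the function name `probe_service`, its argument order, and the 5-second default timeout are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names, arguments, and defaults are assumptions,
# not the actual contents of health-check.sh.

# probe_service TYPE HOST PORT [PATH] [TIMEOUT_SECONDS]
# Prints "ok", "fail", or "skipped"; returns non-zero only on "fail".
probe_service() {
  local type="${1:-http}" host="$2" port="$3" path="${4:-/}" secs="${5:-5}"

  case "$type" in
    none)
      # CLI tools expose no endpoint, so there is nothing to probe.
      echo "skipped"
      return 0
      ;;
    tcp)
      # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
      # The child shell closes the descriptor on exit; timeout bounds the attempt.
      if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "ok"
        return 0
      fi
      ;;
    http)
      # --fail makes curl exit non-zero on HTTP status codes >= 400.
      if curl --silent --fail --max-time "$secs" --output /dev/null \
           "http://${host}:${port}${path}"; then
        echo "ok"
        return 0
      fi
      ;;
    *)
      echo "unknown health_type: $type" >&2
      ;;
  esac

  echo "fail"
  return 1
}
```

Treating an unknown type as a failure rather than silently falling back to `http` is a design choice of this sketch; the PR itself only states that `http` is the default when the field is absent.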
check_service_async now checks health_type and writes 'skipped' for none-type services, ensuring consistent display in parallel health checks.

Signed-off-by: Arunesh Dwivedi <arunesh.devops@gmail.com>
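To show why the async path needs its own 'skipped' branch, here is a rough sketch of parallel checks that each write one result file. The helper name `check_async`, the result-file layout, and the service names `web` and `cache` are hypothetical; `true` and `false` stand in for real probes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run checks in the background, one result file per service.

# check_async RESULT_DIR NAME TYPE [CHECK_COMMAND...]
check_async() {
  local dir="$1" name="$2" type="$3"
  shift 3
  (
    if [[ "$type" == "none" ]]; then
      # No probe is run; record the skip so the display step sees a result.
      echo "skipped" > "$dir/$name"
    elif "$@" >/dev/null 2>&1; then
      echo "ok" > "$dir/$name"
    else
      echo "fail" > "$dir/$name"
    fi
  ) &
}

results="$(mktemp -d)"
check_async "$results" aider none
check_async "$results" web   http true    # stand-in for a passing probe
check_async "$results" cache tcp  false   # stand-in for a failing probe
wait                                       # block until every background check finishes

for f in "$results"/*; do
  printf '%-8s %s\n' "$(basename "$f")" "$(cat "$f")"
done
```

Without the `none` branch, a none-type service would either run a probe that cannot succeed or leave no result file at all, so the parallel display would disagree with the sequential one.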
Thanks for opening this. The direction is good. What we would need before this is mergeable:
Once those pieces are in place, this should be a good bounded PR: backward-compatible by default, explicit for TCP/CLI services, and consistent across CLI, doctor, dashboard, schema, and tests. One smaller note: the PR body currently says
Thanks for the detailed review @Lightheartdevs. I understand the concerns: the current PR only adds plumbing in 2 files, but health_type needs to be a repo-wide contract. I'm working on implementing all 7 requirements:
I'll update the PR body to say "Part of #666" until all pieces are in place. I'm working on this now and will push updates soon.
Implements all 7 review requirements from @Lightheartdevs:

1. Add health_type to manifest schema (enum: http|tcp|none, default http)
   - Both schema files updated with enum validation
   - validate-manifest-schema.sh rejects invalid values
   - audit-extensions.py validates constraints per type
2. Wire health_type through extension catalog / dashboard
   - generate-extensions-catalog.py includes health_type and health_timeout
   - dashboard-api/config.py passes health_type to service config
   - dashboard-api/user_extensions.py includes health_type for user extensions
3. Make all health surfaces agree
   - health-check.sh: container state check before skip, TCP timeout
   - dream-doctor.sh: handles http/tcp/none with proper timeouts
   - audit-extensions.py: validates health_type constraints
   - dashboard-api/helpers.py: TCP port check, none=skipped
   - dashboard-api/routers/extensions.py: status computation for all types
4. Tighten 'none' behavior
   - Container state checked BEFORE skipping for none type
   - none requires port=0 and startup_check=false (validated in schema/audit)
   - Docker services with none still report fail if container not running
5. Schema validation for health_timeout (1-300 seconds)
6. Update affected manifests
   - piper-audio: health_type=tcp, health_timeout=5
   - aider: health_type=none
7. TCP probes bounded by timeout
   - health-check.sh uses timeout command with health_timeout
   - dashboard-api uses asyncio.wait_for with health_timeout
   - dream-doctor.sh uses timeout command with health_timeout

Part of Light-Heart-Labs#666

Signed-off-by: Arunesh Dwivedi <arunesh.devops@gmail.com>
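The constraints in items 1, 4, and 5 can be summarized as a small validation routine. The sketch below is illustrative only: the function name `validate_health_fields`, its positional arguments, its messages, and its fallback defaults are assumptions, and the repository's actual checks live in the schema files and audit script named above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the constraints listed in the commit message;
# the function name, argument order, and messages are assumptions.

# validate_health_fields TYPE PORT STARTUP_CHECK TIMEOUT
# Prints one line per violation and returns non-zero if any were found.
validate_health_fields() {
  local type="${1:-http}" port="${2:-0}" startup_check="${3:-false}" secs="${4:-5}"
  local errors=0

  # health_type is an enum: http | tcp | none
  case "$type" in
    http|tcp|none) ;;
    *)
      echo "health_type must be http, tcp, or none (got '$type')"
      errors=$((errors + 1))
      ;;
  esac

  # none-type services must not declare a port or a startup check
  if [[ "$type" == "none" ]]; then
    if [[ "$port" != "0" ]]; then
      echo "health_type=none requires port=0"
      errors=$((errors + 1))
    fi
    if [[ "$startup_check" != "false" ]]; then
      echo "health_type=none requires startup_check=false"
      errors=$((errors + 1))
    fi
  fi

  # health_timeout must be an integer in the range 1..300 (seconds)
  if ! [[ "$secs" =~ ^[0-9]+$ ]] || (( 10#$secs < 1 || 10#$secs > 300 )); then
    echo "health_timeout must be an integer from 1 to 300 (got '$secs')"
    errors=$((errors + 1))
  fi

  (( errors == 0 ))
}
```

For example, `validate_health_fields none 8080 false 5` reports the port violation and returns non-zero, while `validate_health_fields tcp 10200 true 5` passes; the port numbers here are arbitrary test values.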
Tests for the health_type repo-wide contract:
- Schema validates http/tcp/none and rejects invalid values
- Library schema has matching health_type enum
- piper-audio manifest uses health_type=tcp
- aider manifest uses health_type=none

Part of Light-Heart-Labs#666

Signed-off-by: Arunesh Dwivedi <arunesh.devops@gmail.com>
Summary
Fixes #666. Currently, health-check.sh only supports HTTP health checks via curl. Services using TCP protocols (like piper-audio with Wyoming) or CLI tools (like aider) either fail or are always shown as unhealthy.
Changes
Service Registry ()
- Add `SERVICE_HEALTH_TYPES` associative array
- Parse optional `health_type` field from manifest (default: `http`)
  - `http`: current behavior, HTTP GET via curl
  - `tcp`: check if port accepts TCP connections
  - `none`: skip health check (CLI tools)

Health Check (scripts/health-check.sh)
- Update `test_service()` to handle all three health check types
- TCP check via `/dev/tcp/host/port`
- `none` type to skip health check entirely
- Update `check_service()` to skip CLI tools

Usage
Add `health_type` to manifest.yaml:

Testing
- `tcp` type uses bash built-in `/dev/tcp` for port checking (no external dependencies)
- `none` type sets result to 'skipped' and returns success
- `http` type maintains backward compatibility

Signed-off-by: Arunesh Dwivedi arunesh.devops@gmail.com