ExoStack - Production-Ready Distributed AI Orchestration Platform

ExoStack is a distributed AI orchestration platform for deploying, managing, and scaling AI models across multiple nodes. It targets production inference workloads and provides model-aware scheduling, monitoring with alerting, and multi-tenant isolation.

🚀 Key Features

Core Platform

🏗️ Distributed Architecture: Horizontally scalable AI inference across multiple nodes
🧠 Model Registry: Centralized model management with auto-loading from HuggingFace, local files, and remote URLs
⚡ GPU Agent Detection: Automatic detection and intelligent routing to GPU-enabled nodes
📡 Streaming Inference: Real-time inference via Server-Sent Events (SSE) and WebSocket
🔧 Fine-tuning Integration: Built-in support for model fine-tuning with popular frameworks

Production-Ready Features

💾 Persistent Task Storage: PostgreSQL/SQLite backend with comprehensive task tracking and metrics
🎯 Model-Aware Scheduling: Intelligent node selection based on model compatibility and performance history
📦 Advanced Model Packaging: Multi-source model support with caching, preloading, and validation
📊 Web Dashboard: Modern React-based dashboard with real-time monitoring and analytics
🔒 Worker Pool & Container Execution: Multi-tenant isolation via containers, processes, and thread pools
🚨 Alerts & Monitoring: Comprehensive alerting system with email, webhook, and Slack notifications
🌍 Global Deployment: Kubernetes-ready with multi-region support and automated deployment

Enterprise Features

📈 Real-time Monitoring: Live performance metrics, resource utilization, and health monitoring
🔄 Load Balancing: Intelligent request distribution with multiple scheduling strategies
🛡️ Security & Isolation: Container-based task isolation with resource limits and security controls
📱 Multi-Channel Alerts: Email, Slack, Discord, webhook, and SMS notification support
🎛️ Threshold Monitoring: CPU, memory, GPU, and custom metric threshold alerting
📋 Task Queue Management: Persistent task queues with priority scheduling and failure handling

🏗️ Architecture

ExoStack uses a hub-and-spoke architecture optimized for production deployments:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Web Dashboard │    │   ExoStack Hub   │    │  Alert Manager  │
│   (React UI)    │◄──►│  (Coordinator)   │◄──►│  (Monitoring)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                    ┌───────────┼───────────┐
                    │           │           │
            ┌───────▼──┐ ┌──────▼──┐ ┌──────▼──┐
            │ Agent    │ │ Agent   │ │ Agent   │
            │ (GPU)    │ │ (CPU)   │ │ (Edge)  │
            └──────────┘ └─────────┘ └─────────┘

Core Components

🎯 Hub: Central coordinator managing task distribution, node registration, and model registry
🤖 Agent: Worker nodes executing AI inference with support for CPU, GPU, and specialized hardware
📊 Dashboard: Real-time web interface for monitoring, management, and analytics
🚨 Alert Manager: Comprehensive monitoring and notification system
💾 Database: Persistent storage for tasks, metrics, and system state
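
To make the hub/agent relationship concrete, here is a minimal sketch of how a coordinator could track agent liveness from periodic heartbeats. It is an illustration only, not ExoStack's implementation: `NodeRegistry`, its methods, and the rule that a node counts as stale after three missed intervals are assumptions. Only the 30-second interval mirrors the `HEARTBEAT_INTERVAL` value in the Configuration section.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class NodeRegistry:
    """Hypothetical in-memory registry of agents keyed by node ID."""

    heartbeat_interval: float = 30.0   # seconds, mirrors HEARTBEAT_INTERVAL
    missed_beats_allowed: int = 3      # assumption: stale after 3 missed beats
    clock: Callable[[], float] = time.monotonic
    _last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, node_id: str) -> None:
        """Record that a node reported in just now."""
        self._last_seen[node_id] = self.clock()

    def healthy_nodes(self) -> List[str]:
        """Return IDs of nodes whose last heartbeat is recent enough."""
        cutoff = self.clock() - self.heartbeat_interval * self.missed_beats_allowed
        return sorted(n for n, seen in self._last_seen.items() if seen >= cutoff)
```

Injecting the clock keeps the sketch testable without sleeping. A real hub would also need to persist this state and handle re-registration.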

🚀 Quick Start

Option 1: One-Command Deployment (Recommended)

# Clone the repository
git clone https://github.com/yourusername/exostack.git
cd exostack

# Deploy with Docker Compose (easiest)
./deployment/install.sh --mode docker-compose

# Or deploy to Kubernetes
./deployment/install.sh --mode kubernetes --domain your-domain.com

# Or standalone deployment
./deployment/install.sh --mode standalone

Option 2: Manual Setup

Prerequisites

Python 3.8+
PostgreSQL 12+ (or SQLite for development)
Redis 6+ (for caching and pub/sub)
Docker (optional, for containerized deployment)
CUDA-compatible GPU (optional, for GPU acceleration)

Installation

Clone and setup:

git clone https://github.com/yourusername/exostack.git
cd exostack
pip install -r requirements.txt

Configure environment:

cp .env.example .env
# Edit .env with your settings

Initialize database:

# For PostgreSQL
export DATABASE_URL="postgresql://user:pass@localhost:5432/exostack"
python -m alembic upgrade head

# For SQLite (development)
export DATABASE_URL="sqlite:///./exostack.db"
python -m alembic upgrade head

Start services:

# Terminal 1: Start Hub
python -m exo_hub.main

# Terminal 2: Start Agent
python -m exo_agent.main

# Terminal 3: Start Dashboard (optional)
cd web_dashboard && npm install && npm start

📊 Web Dashboard

The React-based dashboard is served at http://localhost:3000 and provides:

📈 Live Metrics: Real-time task execution, node health, and performance charts
🎛️ Task Queue: Monitor pending, running, and completed tasks with filtering
🖥️ Node Health: Interactive heatmap of node resources and status
📋 Task History: Detailed execution history with performance analytics
🔧 Model Registry: Manage and monitor loaded models across nodes
⚙️ Settings: Configure thresholds, alerts, and system parameters

🔧 Configuration

Environment Variables

# Database Configuration
DATABASE_URL=postgresql://user:pass@localhost:5432/exostack
REDIS_URL=redis://localhost:6379/0

# Service Configuration
HUB_HOST=0.0.0.0
HUB_PORT=8000
AGENT_PORT=8001
DASHBOARD_PORT=3000

# Features
ENABLE_GPU=true
ENABLE_MONITORING=true
ENABLE_ALERTS=true
ENABLE_CONTAINERS=true

# Performance
MAX_CONCURRENT_TASKS=10
TASK_TIMEOUT=300
MODEL_CACHE_SIZE=10GB
HEARTBEAT_INTERVAL=30

# Security
JWT_SECRET=your-secret-key
ENABLE_AUTH=false

# Logging
LOG_LEVEL=INFO
LOG_FORMAT=json
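
As a rough sketch of how a service might read a few of these variables into typed settings, consider the snippet below. The `Settings` class, the helper names, and the choice to treat `GB` as a binary unit are assumptions made for illustration. The fallback values simply reuse the example values listed above and are not confirmed defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: sizes use binary units (1 GB = 1024**3 bytes).
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as '10GB' to a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: treat as a plain byte count


def parse_bool(text: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return text.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    hub_port: int
    enable_gpu: bool
    task_timeout: int
    model_cache_bytes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings, falling back to the example values shown above."""
    return Settings(
        hub_port=int(env.get("HUB_PORT", "8000")),
        enable_gpu=parse_bool(env.get("ENABLE_GPU", "true")),
        task_timeout=int(env.get("TASK_TIMEOUT", "300")),
        model_cache_bytes=parse_size(env.get("MODEL_CACHE_SIZE", "10GB")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.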

Alert Configuration

Configure monitoring thresholds in shared/config/alert_config.json:

{
  "threshold_rules": [
    {
      "rule_id": "cpu_usage_default",
      "name": "CPU Usage Monitor",
      "threshold_type": "cpu_usage",
      "warning_threshold": 80.0,
      "critical_threshold": 95.0,
      "notification_channels": ["email", "slack"]
    }
  ],
  "notification_configs": {
    "email": {
      "smtp_host": "smtp.gmail.com",
      "smtp_port": 587,
      "email_from": "alerts@yourcompany.com",
      "email_to": ["admin@yourcompany.com"]
    },
    "slack": {
      "slack_webhook_url": "https://hooks.slack.com/services/..."
    }
  }
}
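
The snippet below sketches how a rule of this shape could be evaluated against a metric reading. The `classify` function and the inclusive comparison (a reading equal to the threshold triggers the alert) are assumptions; the field names come from the example above.

```python
import json
from typing import Optional

RULE_JSON = """
{
  "rule_id": "cpu_usage_default",
  "threshold_type": "cpu_usage",
  "warning_threshold": 80.0,
  "critical_threshold": 95.0,
  "notification_channels": ["email", "slack"]
}
"""


def classify(rule: dict, value: float) -> Optional[str]:
    """Return 'critical', 'warning', or None for a metric reading."""
    if value >= rule["critical_threshold"]:
        return "critical"
    if value >= rule["warning_threshold"]:
        return "warning"
    return None


rule = json.loads(RULE_JSON)
for reading in (42.0, 85.5, 97.0):
    severity = classify(rule, reading)
    if severity:
        channels = ", ".join(rule["notification_channels"])
        print(f"{severity}: {rule['threshold_type']}={reading} -> notify {channels}")
```

Checking the critical threshold first ensures a reading above both thresholds is reported once, at the higher severity.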

📚 API Reference

Hub Endpoints

Task Management

# Submit inference task
POST /api/tasks
{
  "model_id": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
  "input_data": {"prompt": "Hello, world!"},
  "priority": 1
}

# Get task status
GET /api/tasks/{task_id}

# List tasks with filtering
GET /api/tasks?status=running&limit=10

# Cancel task
DELETE /api/tasks/{task_id}
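
A minimal Python client for the submit endpoint might look like the sketch below, using only the standard library. The function names are illustrative, and the response schema is not documented here, so the sketch returns the decoded JSON as-is. It also assumes authentication is disabled (`ENABLE_AUTH=false`).

```python
import json
import urllib.request


def build_task_request(base_url: str, model_id: str, prompt: str,
                       priority: int = 1) -> urllib.request.Request:
    """Build the POST /api/tasks request shown above."""
    payload = {
        "model_id": model_id,
        "input_data": {"prompt": prompt},
        "priority": priority,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(base_url: str, model_id: str, prompt: str) -> dict:
    """Send the request to a running hub and return the decoded JSON body."""
    request = build_task_request(base_url, model_id, prompt)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With a hub listening on port 8000, `submit_task("http://localhost:8000", "TinyLlama/TinyLlama-1.1B-Chat-v1.0", "Hello, world!")` would send the same body as the example above.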

Node Management

# List registered nodes
GET /api/nodes

# Get node details
GET /api/nodes/{node_id}

# Update node configuration
PUT /api/nodes/{node_id}
{
  "max_concurrent_tasks": 5,
  "enabled": true
}

Model Registry

# List available models
GET /api/models

# Load model on nodes
POST /api/models/load
{
  "model_id": "microsoft/DialoGPT-medium",
  "source": "huggingface",
  "target_nodes": ["node-gpu-01"]
}

# Unload model
DELETE /api/models/{model_id}/nodes/{node_id}

Monitoring & Metrics

# Get system health
GET /api/health

# Get dashboard statistics
GET /api/stats/dashboard

# Get node metrics
GET /api/metrics/nodes/{node_id}

# Get task metrics
GET /api/metrics/tasks

Agent Endpoints

Health & Status

# Agent health check
GET /health

# Agent capabilities
GET /capabilities

# Resource usage
GET /metrics

Model Operations

# List loaded models
GET /models

# Load model
POST /models/load
{
  "model_id": "gpt2",
  "source": "huggingface"
}

# Run inference
POST /inference
{
  "model_id": "gpt2",
  "input_data": {"prompt": "Hello"},
  "parameters": {"max_tokens": 100}
}

🚀 Deployment Options

Docker Compose (Recommended for Development)

# Quick start with Docker Compose
./deployment/install.sh --mode docker-compose

# Custom configuration
./deployment/install.sh --mode docker-compose --domain localhost --no-gpu

Kubernetes (Production)

# Deploy to existing cluster
./deployment/install.sh --mode kubernetes --domain your-domain.com

# With custom namespace
./deployment/install.sh --mode kubernetes --namespace exostack-prod --domain api.yourcompany.com

Standalone (Development)

# Local development setup
./deployment/install.sh --mode standalone --no-monitoring

🛠️ Development

Running Tests

# Run all tests
pytest tests/

# Run specific test suite
pytest tests/test_hub.py
pytest tests/test_agent.py
pytest tests/test_integration.py

# Run with coverage
pytest --cov=exo_hub --cov=exo_agent tests/

# Run dashboard tests
cd web_dashboard && npm test

Building for Production

# Build Docker images
docker build -t exostack/hub:latest -f docker/Dockerfile.hub .
docker build -t exostack/agent:latest -f docker/Dockerfile.agent .
docker build -t exostack/dashboard:latest -f docker/Dockerfile.dashboard ./web_dashboard

# Build and push to registry
docker-compose -f docker-compose.prod.yml build
docker-compose -f docker-compose.prod.yml push

Local Development

# Start development environment
docker-compose -f docker-compose.dev.yml up -d

# Watch for changes (hot reload)
python -m exo_hub.main --reload
python -m exo_agent.main --reload

# Dashboard development server
cd web_dashboard && npm run dev

🔧 Advanced Configuration

Worker Pool Configuration

# shared/config/worker_config.py
WORKER_POOL_CONFIG = {
    "max_thread_workers": 4,
    "max_process_workers": 2,
    "enable_containers": True,
    "container_image": "exostack/inference:latest",
    "resource_limits": {
        "memory_mb": 2048,
        "cpu_cores": 2,
        "timeout_seconds": 300
    }
}
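
To illustrate what the thread-pool and timeout settings control, here is a small sketch built on `concurrent.futures`. It is not ExoStack's worker pool: `run_inference` is a hypothetical stand-in for a model call, and only two of the keys above are used.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

WORKER_POOL_CONFIG = {
    "max_thread_workers": 4,
    "resource_limits": {"timeout_seconds": 300},
}


def run_inference(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return prompt.upper()


def run_with_limits(prompts, config=WORKER_POOL_CONFIG):
    """Run tasks on a bounded thread pool, waiting at most
    timeout_seconds for each result."""
    timeout = config["resource_limits"]["timeout_seconds"]
    results = []
    with ThreadPoolExecutor(max_workers=config["max_thread_workers"]) as pool:
        futures = [pool.submit(run_inference, p) for p in prompts]
        for prompt, future in zip(prompts, futures):
            try:
                results.append((prompt, future.result(timeout=timeout)))
            except FutureTimeout:
                results.append((prompt, None))  # no result within the budget
    return results
```

A timed-out thread cannot be stopped forcibly and keeps running in the background, which is one reason process or container execution is the safer choice for untrusted or long-running work.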

Model Registry Configuration

# shared/config/model_config.py
MODEL_REGISTRY_CONFIG = {
    "cache_dir": "/app/models_cache",
    "max_cache_size_gb": 50,
    "preload_models": [
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        "microsoft/DialoGPT-medium"
    ],
    "sources": {
        "huggingface": {"enabled": True},
        "local": {"enabled": True, "base_path": "/models"},
        "remote": {"enabled": True, "timeout": 300}
    }
}
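
How the registry enforces `max_cache_size_gb` is not described here. One common policy is least-recently-used eviction, sketched below. `ModelCache` and the model sizes are hypothetical.

```python
from collections import OrderedDict
from typing import List


class ModelCache:
    """Hypothetical size-bounded cache that evicts least recently used models."""

    def __init__(self, max_size_gb: float) -> None:
        self.max_size_gb = max_size_gb
        self._sizes: "OrderedDict[str, float]" = OrderedDict()

    def touch(self, model_id: str, size_gb: float) -> List[str]:
        """Mark a model as just used; return the IDs evicted to make room."""
        self._sizes[model_id] = size_gb
        self._sizes.move_to_end(model_id)  # most recently used goes last
        evicted = []
        # Keep at least the model that was just requested.
        while sum(self._sizes.values()) > self.max_size_gb and len(self._sizes) > 1:
            oldest, _ = self._sizes.popitem(last=False)
            evicted.append(oldest)
        return evicted


cache = ModelCache(max_size_gb=50)
cache.touch("TinyLlama/TinyLlama-1.1B-Chat-v1.0", 30)  # sizes are made up
cache.touch("microsoft/DialoGPT-medium", 15)
```

Models listed under `preload_models` would typically be pinned so they are never evicted; the sketch omits pinning for brevity.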

Scheduler Configuration

# shared/config/scheduler_config.py
SCHEDULER_CONFIG = {
    "strategy": "balanced",  # performance, resource, balanced
    "weights": {
        "model_compatibility": 0.4,
        "resource_availability": 0.3,
        "performance_history": 0.3
    },
    "cache_ttl_seconds": 300,
    "max_retries": 3
}
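
One plausible reading of the `weights` block is a weighted sum over per-node signals. The sketch below shows that idea under stated assumptions: each signal is normalised to the range 0 to 1, and the node names and signal values are invented. The actual scoring logic may differ.

```python
SCHEDULER_CONFIG = {
    "weights": {
        "model_compatibility": 0.4,
        "resource_availability": 0.3,
        "performance_history": 0.3,
    }
}


def score_node(signals: dict, weights: dict) -> float:
    """Weighted sum of per-node signals, each assumed to lie in [0, 1]."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def pick_node(candidates: dict, weights: dict) -> str:
    """Return the ID of the highest-scoring node."""
    return max(candidates, key=lambda node_id: score_node(candidates[node_id], weights))


# Invented example values for two nodes.
candidates = {
    "node-gpu-01": {"model_compatibility": 1.0, "resource_availability": 0.2, "performance_history": 0.9},
    "node-cpu-01": {"model_compatibility": 0.5, "resource_availability": 0.9, "performance_history": 0.6},
}
best = pick_node(candidates, SCHEDULER_CONFIG["weights"])
print(best)
```

Here the GPU node wins despite being busier, because model compatibility carries the largest weight.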

📊 Monitoring & Observability

Metrics Collection

ExoStack automatically collects comprehensive metrics:

Task Metrics: Execution time, success rate, queue length
Node Metrics: CPU, memory, GPU utilization, disk usage
Model Metrics: Load time, inference latency, memory usage
System Metrics: Request rate, error rate, response time

Grafana Dashboard

Import the provided Grafana dashboard for visualization:

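# Import dashboard. Grafana's HTTP API requires an authenticated request and
# expects the dashboard JSON under a top-level "dashboard" key.
# GRAFANA_TOKEN is a placeholder for your own Grafana API token.
curl -X POST http://grafana:3000/api/dashboards/db \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -H "Content-Type: application/json" \
  -d @monitoring/grafana/exostack-dashboard.json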

Prometheus Integration

ExoStack exposes metrics in Prometheus format:

# prometheus.yml
scrape_configs:
  - job_name: 'exostack-hub'
    static_configs:
      - targets: ['exostack-hub:8000']
    metrics_path: '/metrics'

  - job_name: 'exostack-agents'
    static_configs:
      - targets: ['exostack-agent:8001']
    metrics_path: '/metrics'
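
For readers unfamiliar with what a `/metrics` endpoint returns, the sketch below renders a gauge in the Prometheus text exposition format using plain Python. The metric name is hypothetical, and a real service would normally use a Prometheus client library, which also handles label escaping.

```python
def render_gauge(name: str, help_text: str, samples: dict) -> str:
    """Render one gauge in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the sample value.
    Label values are not escaped here, so keep them free of quotes,
    backslashes, and newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples.items():
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        suffix = f"{{{label_text}}}" if label_text else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


print(render_gauge(
    "exostack_node_cpu_percent",  # hypothetical metric name
    "CPU utilisation per node.",
    {(("node", "node-gpu-01"),): 37.5, (("node", "node-cpu-01"),): 81.0},
))
```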

🔒 Security

Authentication & Authorization

# Enable authentication
export ENABLE_AUTH=true
export JWT_SECRET=your-super-secret-key

# Create API key (the JSON body needs an explicit Content-Type header)
curl -X POST http://localhost:8000/api/auth/keys \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "production-key", "permissions": ["read", "write"]}'
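
The role of `JWT_SECRET` can be illustrated with a standard-library sketch of HS256 signing and verification. This assumes ExoStack issues HS256-signed tokens, which is not confirmed here, and the function names are illustrative. The sketch deliberately skips expiry and header checks, so a maintained JWT library is the better choice in production.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                    hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Return header.payload.signature for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_mac(signing_input, secret))}"


def verify_hs256(token: str, secret: str) -> dict:
    """Check the signature in constant time and return the claims."""
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(_b64url_decode(signature), _mac(signing_input, secret)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(signing_input.split(".")[1]))
```

Because anyone holding the secret can mint valid tokens, the placeholder value above should be replaced with a long random string kept out of version control.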

Container Security

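# Kubernetes security context (pod spec fragment)
# fsGroup is only valid at pod level; capabilities and
# readOnlyRootFilesystem are only valid at container level.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  containers:
    - name: exostack-agent  # illustrative container name
      securityContext:
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true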

🚀 Production Deployment

High Availability Setup

# Deploy with HA configuration
./deployment/install.sh --mode kubernetes \
  --domain api.yourcompany.com \
  --replicas 3 \
  --enable-monitoring \
  --enable-backup

Backup & Recovery

# Database backup
kubectl exec -n exostack postgres-0 -- pg_dump -U exostack exostack > backup.sql

# Model cache backup
kubectl exec -n exostack exostack-hub-0 -- tar -czf - /app/models_cache > models_backup.tar.gz

# Restore from backup
kubectl exec -i -n exostack postgres-0 -- psql -U exostack exostack < backup.sql

Scaling

# Scale agents
kubectl scale deployment exostack-agent-cpu --replicas=5 -n exostack

# Scale hub (if stateless)
kubectl scale deployment exostack-hub --replicas=3 -n exostack

# Auto-scaling with HPA
kubectl autoscale deployment exostack-agent-cpu --cpu-percent=70 --min=2 --max=10 -n exostack
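
The autoscaling command above relies on the Horizontal Pod Autoscaler's core rule: the desired replica count is the current count scaled by the ratio of observed to target utilisation, rounded up and clamped to the configured bounds. The sketch below reproduces that arithmetic with the values from the command. The function name is illustrative, and the real controller also applies stabilisation windows and per-pod readiness handling that are omitted here.

```python
import math


def desired_replicas(current: int, cpu_percent: float, target: float = 70.0,
                     minimum: int = 2, maximum: int = 10,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA calculation for a single CPU utilisation metric."""
    ratio = cpu_percent / target
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to the target: leave replicas unchanged
    return max(minimum, min(maximum, math.ceil(current * ratio)))


print(desired_replicas(current=4, cpu_percent=105.0))  # scale up
print(desired_replicas(current=4, cpu_percent=35.0))   # scale down to the minimum
```

With four agents averaging 105% of their requested CPU, the ratio is 1.5 and the autoscaler would target six replicas.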

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

Fork and clone:

git clone https://github.com/yourusername/exostack.git
cd exostack

Setup development environment:

python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install

Run tests:

pytest tests/
cd web_dashboard && npm test

Submit PR:

git checkout -b feature/amazing-feature
git commit -m 'Add amazing feature'
git push origin feature/amazing-feature

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

HuggingFace for the transformers library and model hub
FastAPI for the excellent web framework
React and Tailwind CSS for the modern UI
PostgreSQL and Redis for reliable data storage
Docker and Kubernetes for containerization and orchestration
The open-source community for inspiration and contributions

📞 Support & Community

📋 GitHub Issues: Report bugs and request features
📖 Documentation: Comprehensive docs and tutorials
💬 Discord Community: Join our community
📧 Email Support: support@exostack.dev

🔄 Changelog

See CHANGELOG.md for detailed version history and release notes.

Built with ❤️ for the AI community
