Version: 3.0.0 (Updated with full architecture audit) Date: 2026-04-30 Scope: All AI agents, human contributors, and automated processes working on HelixCode Authority: Derived from HelixAgent AGENTS.md with HelixCode-specific enhancements
Base agent rules:
constitution/AGENTS.md— READ IT FIRST. The base file is authoritative for any topic not covered here. Project-specific rules below extend them; they never weaken them.
The HelixConstitution submodule (./constitution/, added 2026-05-14)
provides the universal CLI-agent operating manual. Every CLI agent
working on this project — Claude Code, Codex, Cursor, Aider,
Continue, Gemini CLI, future LLMs — MUST read both
constitution/AGENTS.md AND this project AGENTS.md, treating the
project file as a tightening overlay (never a weakening).
Project-tightening note — HelixCode forbids GitFlic + GitVerse
Git remotes per CLAUDE.md §6.W; the universal AGENTS.md's broader
"multi-upstream" guidance is narrowed here. Other tightenings (e.g.
CONST-033 host power management hard-ban, CONST-035 anti-bluff
covenant, CONST-042 secret-leak policy, CONST-045 distribution-host
config policy, CONST-046 no-hardcoded-content, CONST-047 recursive
submodule application mandate) all SURVIVE this inheritance.
HelixCode is an enterprise-grade distributed AI development platform built in Go. It enables intelligent task division, work preservation, cross-platform development workflows, and multi-provider LLM integration through a unified REST API, CLI, Terminal UI, Desktop, and Mobile client architecture.
Current Status: The internal/ foundation is largely solid (auth, database, server, worker, task, workflow, tools, editor, notification, MCP, verifier are real implementations). Critical bluff and stub areas remain in select entry points and peripheral packages. All agents MUST prioritize zero-bluff implementation.
LLMsVerifier Integration Status: internal/verifier/ package is now implemented with REST API client, two-tier cache, circuit breaker health monitor, background poller, score adapter, and event publisher. BLUFF-002 (hardcoded CLI models) and BLUFF-004 (hardcoded external models) are FIXED. BLUFF-005 (scoring ignores verifier data) is FIXED in ModelManager.SelectOptimalModel().
Key Features:
- Distributed Computing: SSH-based worker pools with health monitoring, auto-installation, and consensus
- Multi-Provider LLM Integration: 15+ providers (OpenAI, Anthropic, Gemini, Ollama, Azure, Bedrock, Groq, Mistral, Cohere, xAI, DeepSeek, Qwen, OpenRouter, HuggingFace, Llama.cpp)
- Development Workflows: Automated planning, building, testing, refactoring with real shell execution
- Task Management: Intelligent task division with priorities, dependencies, checkpointing, and Redis caching
- MCP Protocol: Full Model Context Protocol server over WebSocket with tool dispatch
- Multi-Client Architecture: REST API (Gin), Cobra CLI, Terminal UI (tview), Desktop (Fyne), Mobile (gomobile), WebSocket
- Memory Systems: In-memory, filesystem, Redis, Memcached, Cognee, ChromaDB, Qdrant, Weaviate integrations
- Advanced Editor: Multi-format code editing (diff, whole-file, search/replace, line-based) with backups
- Tools Ecosystem: 40+ tools across filesystem, shell, web, browser, mapping, multiedit, confirmation, notebook, git
- Notifications: Multi-channel support (Slack, Email, Telegram, Discord, Yandex Messenger, Max)
Core Technologies:
- Language: Go 1.24.0 with toolchain go1.24.9
- Module:
dev.helix.code - HTTP Framework: Gin v1.11.0
- Authentication: JWT v4.5.2, bcrypt + argon2
- Database: PostgreSQL 15+ via pgx/v5 (optional)
- Cache: Redis 7+ via go-redis/v9 (optional)
- Configuration: Viper v1.21.0
- CLI Framework: Cobra v1.8.0
- Testing: Testify v1.11.1
UI Technologies:
- Desktop: Fyne v2.7.0
- Terminal UI: tview v0.42.0
- Mobile: gomobile bindings
External Integrations:
- Browser Automation: chromedp v0.14.2
- Web Scraping: goquery v1.10.3
- Tree-sitter: go-tree-sitter
- Identity: Azure SDK, AWS SDK v2
- Vector/Memory: Cognee, ChromaDB, Qdrant, Weaviate clients
- Container Orchestration: digital.vasic.containers (vasic-digital/Containers submodule)
CRITICAL: All build and test commands must be run from the helix_code/ subdirectory, not the repository root.
cd HelixCode| Command | Purpose |
|---|---|
make build |
Build server binary to bin/helixcode |
make test |
Run go test -v ./... |
make test-all |
Run tests + coverage + benchmarks + docs |
make test-coverage |
Generate coverage report |
make test-benchmark |
Run Go benchmarks |
make logo-assets |
Generate logo assets (required before first build) |
make setup-deps |
Run go mod tidy |
make fmt |
Run go fmt ./... |
make lint |
Run golangci-lint run ./... |
make clean |
Clean build artifacts |
make dev |
Start development server |
make prod |
Cross-platform production build |
make mobile |
Build iOS + Android targets |
make aurora-os |
Build Aurora OS target |
make harmony-os |
Build Harmony OS target |
| Command | Purpose |
|---|---|
make test-infra-up |
Start full Docker test infrastructure |
make test-infra-down |
Stop full Docker test infrastructure |
make test-full |
ALL tests with real infrastructure (zero skips) |
make test-unit-full |
Unit tests with real services |
make test-integration-full |
Integration tests with -tags=integration |
make test-e2e-full |
E2E challenge tests via runner |
make test-security-full |
Security test suite |
make test-load-full |
Load tests |
make test-complete |
Sequential run of all full test types |
make coverage-full |
Coverage with full infrastructure |
| Command | Purpose |
|---|---|
make container-builder-image |
Build the builder container image |
make container-build |
Build application inside container |
make container-test |
Run tests inside container |
make container-lint |
Run linter inside container |
make container-shell |
Interactive shell in builder container |
make container-dev-up |
Start containerized dev environment |
make container-dev-down |
Stop containerized dev environment |
make container-release |
Full release build in container |
./scripts/containers/build-in-container.sh |
Convenience wrapper script |
The builder container includes: Go 1.24, gcc, postgresql-client, redis, docker-cli, golangci-lint, and all build tools. The only host requirement is Docker/Podman.
| Script | Purpose |
|---|---|
./run_tests.sh --unit |
Unit tests |
./run_tests.sh --integration |
Integration tests |
./run_tests.sh --e2e |
E2E tests |
./run_tests.sh --coverage |
Coverage analysis |
./run_tests.sh --security |
Security tests |
./run_all_tests.sh |
Orchestrates ALL suites sequentially |
./run_integration_tests.sh |
DB integration tests with Docker |
go test -v -run TestName ./path/to/package
go test -v -tags=integration ./internal/database
cd tests/e2e/challenges && go run cmd/runner/main.go -challenge ascii-art-generator-001 -providers ollamahelix_code/
├── cmd/ # Application entry points
│ ├── server/main.go # HTTP server entry point
│ ├── cli/main.go # Legacy flag-based CLI client
│ ├── root.go # Cobra root command (`helix`)
│ ├── main_commands.go # `helix start`, `helix auto`
│ ├── other_commands.go # `helix server`, `helix version`, etc.
│ ├── local-llm.go # `helix local-llm` command tree
│ ├── local-llm-advanced.go # Advanced local-llm commands
│ ├── helix-config/main.go # Dedicated config management CLI
│ ├── security-test/main.go # Simulated security test runner
│ ├── security-fix/main.go # Security fix wrapper
│ ├── security-fix-standalone/main.go # Standalone security scanner
│ ├── performance-optimization/main.go # Performance optimizer
│ ├── performance-optimization-standalone/main.go # Standalone perf simulator
│ └── config-test/main.go # Config hot-reload test utility
│
├── internal/ # Internal packages (~40 packages)
│ ├── auth/ # JWT authentication, bcrypt/argon2, sessions
│ ├── llm/ # LLM provider implementations (15+ providers)
│ │ ├── providers/ # Per-provider HTTP clients
│ │ ├── compression/ # Context compression
│ │ └── vision/ # Vision/multimodal support
│ ├── provider/ # Provider abstractions
│ ├── providers/ # Provider management
│ ├── worker/ # SSH-based worker pool, health checks
│ ├── task/ # Task queues, dependencies, checkpoints
│ ├── server/ # Gin HTTP server, routes, middleware
│ ├── database/ # PostgreSQL pgx pool, schema initialization
│ ├── redis/ # go-redis wrapper with graceful degradation
│ ├── tools/ # 40+ tool ecosystem registry
│ │ ├── filesystem/ # fs_read, fs_write, fs_edit, glob, grep
│ │ ├── shell/ # shell, shell_background with sandbox
│ │ ├── web/ # web_fetch, web_search
│ │ ├── browser/ # browser_launch, browser_navigate, browser_screenshot
│ │ ├── multiedit/ # Transactional multi-file editing
│ │ └── git/ # Git automation
│ ├── editor/ # Multi-format code editing with backups
│ ├── memory/ # Memory providers (in-mem, filesystem, Redis, etc.)
│ ├── cognee/ # Cognee.ai memory integration
│ ├── context/ # Hierarchical context management with TTL
│ ├── notification/ # Multi-channel notification engine
│ ├── mcp/ # Model Context Protocol WebSocket server
│ ├── workflow/ # Development workflow execution
│ ├── config/ # Viper-based configuration management
│ ├── event/ # Pub/sub event bus
│ ├── logging/ # Structured logging wrapper
│ ├── monitoring/ # Metric collection framework
│ ├── security/ # Security scanning (stubbed)
│ ├── session/ # Development session management
│ ├── agent/ # Agent orchestration
│ ├── project/ # Project management
│ ├── rules/ # Rules engine
│ ├── hooks/ # Hook system
│ ├── focus/ # Focus chain management
│ ├── template/ # Template system
│ ├── persistence/ # State persistence
│ ├── deployment/ # Deployment management
│ ├── discovery/ # Service/model discovery
│ ├── hardware/ # Hardware abstraction
│ ├── repomap/ # Repository mapping
│ ├── version/ # Version management
│ ├── fix/ # Security fix engine
│ ├── performance/ # Performance optimization
│ ├── testutil/ # Test utilities
│ └── mocks/ # Shared mocks
│
├── applications/ # Platform-specific applications
│ ├── desktop/ # Fyne desktop app
│ ├── terminal-ui/ # tview terminal UI
│ ├── android/ # Android app
│ ├── ios/ # iOS app
│ ├── aurora-os/ # Aurora OS client
│ └── harmony-os/ # Harmony OS client
│
├── api/ # OpenAPI specification
│ └── openapi.yaml # Full REST API spec (OpenAPI 3.0.3)
│
├── config/ # Configuration files
│ ├── config.yaml # Primary application config
│ ├── production-config.yaml # Enterprise production config
│ ├── minimal-config.yaml # Minimal test config (DB/Redis disabled)
│ ├── test-config.yaml # Test-specific config
│ ├── working-config.yaml # Working variant
│ ├── azure_example.yaml # Azure-specific example
│ └── model-aliases.example.yaml# Model alias examples
│
├── tests/ # New test framework
│ ├── e2e/challenges/ # Challenge-based E2E tests
│ └── automation/ # Hardware automation tests
│
├── test/ # Legacy/parallel test suites
│ ├── integration/ # Integration tests
│ ├── e2e/ # Legacy E2E tests
│ ├── automation/ # Provider automation tests
│ └── load/ # Load tests
│
├── benchmarks/ # Performance benchmarks
├── security/ # Security tests
├── standalone_tests/ # Standalone CLI tests
├── docker/ # Docker assets and extended compose
├── scripts/ # Build and deployment scripts
└── assets/ # Logo and image assets
File: internal/auth/auth.go (~470 lines)
Assessment: Production-ready
- User registration with validation
- Password hashing with bcrypt + argon2 fallback
- JWT token generation and verification (JWT v4)
- Session management with crypto-random tokens
- Constant-time comparison for timing attack prevention
- Full test coverage in
internal/auth/auth_test.go(~777 lines)
File: internal/database/database.go
Assessment: Production-ready
- PostgreSQL connection pool via pgx/v5
- Full schema initialization (users, workers, tasks, projects, sessions, LLM providers, MCP servers, notifications, audit logs)
DatabaseInterfacefor testability- Graceful degradation when host is empty
File: internal/server/server.go
Assessment: Production-ready
- Gin-based server with 50+ routes across
/api/v1/ - JWT auth middleware, CORS, security headers
- WebSocket endpoint for MCP
- Health check with DB + Redis validation
- Graceful shutdown (30s timeout)
File: internal/llm/ (~5000+ lines across providers)
Assessment: Real HTTP clients
AnthropicProvider(~752 lines): Full SSE streaming, prompt caching, extended thinking, tool callsOpenAIProvider(~431+ lines): Full HTTP API clientModelManager: Multi-provider orchestration, selection strategy, fallback chain- 16 provider subdirectories with real HTTP implementations
- Note: The
internal/llm/package is genuine. Bluff areas are atcmd/cli/main.goonly.
File: internal/worker/ (~800+ lines)
Assessment: Real distributed worker management
WorkerManager: Register, heartbeat, assign tasks, complete tasks- SSH config parsing, capability matching, resource tracking
- Health checks with TTL
File: internal/task/ (~1000+ lines)
Assessment: Real task lifecycle
- Priority queues, dependency validation, checkpointing
- Redis caching with graceful degradation
- Retry logic and cleanup
File: internal/workflow/ (~1100+ lines)
Assessment: Real shell execution
Executordispatches to realexec.CommandContext()calls- Security filtering via
isDangerousCommand()(rm, dd, mkfs, fork bombs, etc.) - LLM integration with real
LLMRequest - Supports Go, Node, Python, Rust project types
File: internal/tools/ (~2000+ lines)
Assessment: Real tool registry
- 8 categories: filesystem, shell, web, browser, mapping, multiedit, confirmation, notebook
- Real chromedp browser automation
- Transactional multi-file editing
File: internal/editor/ (~600+ lines)
Assessment: Real file I/O
- Diff, whole-file, search/replace, line-based editors
- Automatic file backup with
io.Copy EditApplier/EditValidatorinterfaces
File: internal/notification/ (~800+ lines)
Assessment: Real HTTP/SMTP calls
- Slack (webhook HTTP POST), Email (SMTP via
net/smtp), Telegram (Bot API), Discord (webhook) - Yandex Messenger (OAuth API), Max (enterprise API)
- Rate limiting, retry, queue, metrics
File: internal/mcp/ (~400+ lines)
Assessment: Real WebSocket server
- gorilla/websocket concurrent session handling
- JSON-RPC-like message format
- Tool execution dispatch
File: internal/config/ (~1700+ lines)
Assessment: Full Viper integration
- Environment variable binding (
HELIX_*) - Config file search (
.,$HOME/.helixcode,/etc/helixcode) - Validation rules, default config creation
ConfigManagerfor load/save/merge
Files: internal/helixqa/, internal/server/qa_handlers.go, applications/terminal_ui/main.go
Assessment: Full embedded QA engine with real session lifecycle
Enginestruct manages QA sessions with map + sync.RWMutexStartSession(),CancelSession(),GetSession(),ListSessions()with real state tracking- REST API:
POST /api/v1/qa/session,GET /api/v1/qa/session/:id/status,GET /api/v1/qa/session/:id/report,GET /api/v1/qa/session/:id/screenshot/:name,DELETE /api/v1/qa/session/:id - CLI flags:
--qa-run,--qa-list,--qa-report,--qa-screenshot,--qa-cancel - TUI dashboard with session table, stats panel, refresh/cancel actions
- Screenshot pipeline: 8 platform engines (Linux, Web, iOS, Android, CLI, TUI, macOS, Windows)
- Tests:
internal/helixqa/wrapper_test.go,internal/server/qa_handlers_test.go,pkg/screenshot/*_test.go
File: cmd/cli/main.go lines ~236-284
Evidence: Previously returned fmt.Sprintf("Generated response for: %s...", prompt) without calling any provider.
Fix: handleGenerate() now constructs a real llm.LLMRequest with user messages and calls provider.Generate() / provider.GenerateStream(). Errors are propagated to the user if the provider is unavailable.
Verification: go build -tags nogui ./cmd/cli/ compiles; provider call is real (returns error if Ollama/etc. is not running).
Fix Priority: P0 — RESOLVED
File: cmd/cli/main.go lines ~101-128
Evidence: Previously only 3 hardcoded models. No dynamic discovery.
Fix: Replaced with verifier-aware handleListModels() that queries LLMsVerifier adapter first, falls back to provider discovery, then to constitutional FallbackModels (7 models with scores and verification status).
Verification: go test -v ./internal/verifier/... passes; go build ./cmd/cli/... compiles.
Fix Priority: P0 — RESOLVED
File: cmd/cli/main.go lines ~310-324
Evidence: Previously printed the command and slept for 1 second without executing anything.
Fix: handleCommand() uses exec.CommandContext(ctx, "sh", "-c", command) with real os.Stdout/os.Stderr redirection. Exit codes are reported.
Verification: go build -tags nogui ./cmd/cli/ compiles.
Fix Priority: P0 — RESOLVED
File: internal/security/security.go (~132 lines)
Evidence: Previously ScanFeature() contained explicit "Simulate security scanning logic" comment and returned Success=true, Score=95 with empty issues.
Fix: Replaced with real Scanner interface, SonarQube/Snyk clients. ScanFeature() now dispatches real security scanners. Score is no longer hardcoded.
Verification: 17/17 tests PASS, zero "simulated" in production code.
Fix Priority: P1 — RESOLVED (commit 33ddf6a)
File: internal/memory/ (~1800+ lines)
Evidence: Previously RedisMemoryProvider and MemcachedMemoryProvider stored data in local maps.
Fix: internal/redis/redis.go provides real go-redis/v9 client. Dedicated memory stubs no longer exist.
Fix Priority: P2 — RESOLVED
File: cmd/security_test/main.go
Evidence: Previously had 12 simulated test results via simulateSecurityScan().
Fix: Rewired to real internal/security scanner dispatch via ScanFeature().
Fix Priority: P2 — RESOLVED (commit 33ddf6a)
File: cmd/other_commands.go
Evidence: Previously server, generate, test, worker, notify commands were stubbed.
Fix: Wired to real server.New(), llm.ModelManager, notification engine, and go test dispatch.
Fix Priority: P2 — RESOLVED (commit 33ddf6a)
File: cmd/helix_config/main.go
Evidence: Many template/history/schema subcommands print placeholder messages.
Fix Priority: P3
File Pattern: internal/verifier/*.go
Evidence: Previously returned hardcoded 8.5 scores, empty discovery results.
Fix: REST API client, two-tier cache, circuit breaker, background poller implemented. Fallback models are constitutionally permitted (CONST-036).
Verification: Fallback models list only at fallback_models.go (permitted). All other verifier code uses real client.
Fix Priority: P0 — RESOLVED
File Pattern: internal/verifier/startup.go or provider adapter files
Evidence: Previously hardcoded env var names like "OPENAI_API_KEY".
Fix: Env vars sourced from config struct fields, not hardcoded strings.
Verification: grep sweep clean.
Fix Priority: P1 — RESOLVED
File Pattern: internal/llm/*.go
Evidence: Previously SupportsToolUse: true as struct literal.
Fix: No hardcoded capabilities found in grep sweep.
Fix Priority: P1 — RESOLVED
File Pattern: *_test.go files with testify/mock or testMode: true
Evidence: Previously non-unit tests used mocked verifier.
Fix: No mocks in non-unit verifier tests.
Verification: grep sweep clean.
Fix Priority: P0 — RESOLVED
File Pattern: configs/verifier.yaml or internal/verifier/config.go
Evidence: Previously scoring weights misconfigured.
Fix: Weights configurable via ScoringWeights struct in adapter.go. No hardcoded weight issues found.
Fix Priority: P2 — RESOLVED
File: internal/server/handlers.go lines ~834-855
Evidence: All dynamic metrics (goroutines, memory, database connections) were hardcoded to 0.
Fix: getMetrics() now calls runtime.ReadMemStats(), runtime.NumGoroutine(), and s.db.Pool.Stat() to return real values.
Fix Priority: P0 — RESOLVED
File: internal/tools/multiedit/transaction.go lines ~352-369
Evidence: detectFileConflict() always returned nil, nil with comment "For now, we'll assume no conflicts."
Fix: Implemented real conflict detection — reads the file from disk, computes SHA-256, and compares against the Checksum field. Returns ConflictModified or ConflictDeleted when appropriate.
Fix Priority: P1 — RESOLVED
Main config at config/config.yaml:
server:
address: "0.0.0.0"
port: 8080
read_timeout: 30
write_timeout: 30
idle_timeout: 300
shutdown_timeout: 30
database:
host: "" # Empty string disables PostgreSQL
port: 5432
user: "helix"
password: "${HELIX_DATABASE_PASSWORD}"
dbname: "helixcode_prod"
sslmode: "disable"
redis:
host: "redis"
port: 6379
password: "${HELIX_REDIS_PASSWORD}"
db: 0
enabled: true
auth:
jwt_secret: "${HELIX_AUTH_JWT_SECRET}"
token_expiry: 86400
session_expiry: 604800
bcrypt_cost: 12
workers:
health_check_interval: 30
health_ttl: 120
max_concurrent_tasks: 10
tasks:
max_retries: 3
checkpoint_interval: 300
cleanup_interval: 3600
llm:
default_provider: "local"
max_tokens: 4096
temperature: 0.7
timeout: 30
max_retries: 3
providers:
<name>:
type: <provider-type>
endpoint: <url>
enabled: true
parameters:
timeout: 30.0
max_retries: 3
streaming_support: true
api_key: ""
selection:
strategy: "performance"
fallback_enabled: true
health_check_interval: 30
logging:
level: "info"
format: "text"
output: "stdout"
notifications:
enabled: true
rules:
- name: "..."
condition: "type==error"
channels: ["slack", "email"]
priority: urgent
enabled: true
channels:
slack: { enabled, webhook_url, channel, username, timeout }
telegram: { enabled, bot_token, chat_id, timeout }
email: { enabled, smtp: { server, port, username, password, tls }, recipients, timeout }
discord: { enabled, webhook_url, timeout }Required for Production:
HELIX_DATABASE_PASSWORDHELIX_AUTH_JWT_SECRETHELIX_REDIS_PASSWORD
LLM Provider Keys (as needed):
OPENAI_API_KEY,ANTHROPIC_API_KEY,GEMINI_API_KEY,XAI_API_KEY,DEEPSEEK_API_KEY,GROQ_API_KEY,MISTRAL_API_KEY,COHERE_API_KEY,AZURE_OPENAI_API_KEY,AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY
Notification Integrations:
HELIX_SLACK_WEBHOOK_URLHELIX_TELEGRAM_BOT_TOKEN,HELIX_TELEGRAM_CHAT_IDHELIX_EMAIL_SMTP_SERVER,HELIX_EMAIL_USERNAME,HELIX_EMAIL_PASSWORDHELIX_DISCORD_WEBHOOK_URL
- Unit tests: Mocks allowed,
*_test.go,-shortflag - Contract tests: Real API schemas, no mocks
- Component tests: Real subsystems wired together
- Integration tests: Full app with real dependencies (
-tags=integration) - E2E challenges: Complete user workflows against real LLM APIs
- Security tests: OWASP compliance
- Performance tests: Benchmarks
- Automation tests: Provider/hardware automation (
-tags=automation) - Load tests: Stress testing
- Unit tests: Mocks OK
- ALL other tests: Real infrastructure ONLY
- Every PASS guarantees Quality + Completion + Usability
- Challenges fail on simulated/stubbed behavior
- No bare
t.Skip()withoutSKIP-OK: #<ticket>marker
docker-compose.test.yml: PostgreSQL 16, Redis 7, Memcached, Cognee, ChromaDB, Qdrant, Ollama, Prometheus, Grafanadocker-compose.full-test.yml: Complete stack with mock-LLM server, Selenium, ChromeDP, SSH server + 3 workers, Cognee, Weaviate, mock-Slack, multicast router
The most rigorous test system validates HelixCode by having it generate real projects and testing them:
- Challenge Definitions: JSON specs (ASCII art generator, CLI task manager, JSON validator, notes API, tic-tac-toe TUI, URL shortener)
- Execution Flow: Load spec → Call real LLM API → Parse generated code → Compile → Test → Runtime validation
- Validation Layers: Directory structure, code quality, compilation, testing, functionality, runtime validation with diverse data
- Test Matrix: Supports CLI, TUI, REST, WebSocket interfaces across 15+ providers and worker pool distributions
# Basic
cd HelixCode && make test
# Full infrastructure (recommended for validation)
make test-infra-up
make test-complete
make test-infra-down
# Individual categories
make test-unit-full
make test-integration-full
make test-e2e-full
make test-security-full
make test-load-full
# Legacy scripts
./run_tests.sh --all
./run_all_tests.sh
./run_integration_tests.shServices: helixcode-server (8080, 2222), postgres:15, redis:7, nginx (80, 443), prometheus (9090), grafana (3000)
cd HelixCode
cp .env.example .env
# Edit .env with secure passwords
docker compose up -d
docker compose ps
curl http://localhost/health| File | Purpose |
|---|---|
docker-compose-simple.yml |
Minimal dev (postgres + redis only) |
docker-compose.test.yml |
Integration/E2E testing stack |
docker-compose.full-test.yml |
Zero-skip full test infrastructure |
docker-compose.aurora-os.yml |
Security-focused Aurora OS platform |
docker-compose.harmony-os.yml |
Distributed Harmony OS platform |
docker-compose.specialized-platforms.yml |
Combined Aurora + Harmony |
docker/docker-compose.yml |
Extended full-stack with Milvus, Elasticsearch, MLflow, Jaeger, Jupyter, Portainer |
- Healthchecks on every service
- Docker profiles:
monitoring,distributed,with-redis,production,dev,server - Isolated bridge networks per deployment
- Named persistent volumes for all stateful services
.envfile for secrets
- Standard Go formatting:
go fmt ./... - Linting:
golangci-lint run ./...(timeout 10m in CI) - Vet:
go vet ./... - Table-driven tests with
t.Run()subtests - Build tags for integration/automation tests:
//go:build integration
- Always work from
helix_code/subdirectory - Generate logo assets before first build:
make logo-assets - Database/Redis optional: Disable by setting
database.host: "" - Environment variables override config file
- Use
internal/for all core packages; nopkg/directory in active use - Error handling: explicit, no silent failures
- Concurrent access: use
sync.RWMutexor channel patterns
- REST API documented in
api/openapi.yaml(OpenAPI 3.0.3) - Base path:
/api/v1 - Authentication: Bearer JWT via
Authorizationheader - Health endpoint:
GET /health(no auth required)
- Password hashing: bcrypt (cost 12) with argon2 fallback
- JWT with constant-time comparison
- CORS middleware, security headers (X-Frame-Options, CSP, HSTS)
- Rate limiting support in production config
- Session timeout, concurrent session limits, IP binding options
- Workflow
isDangerousCommand()filter blocks rm, dd, mkfs, fork bombs, etc. - Input validation in auth and server packages
security/security_test.go: OWASP Top 10, SAST, DAST, credential scanning, TLS enforcement, input validation (path traversal, XSS, SQL injection, command injection, SSRF)- File permission checks (0600 for configs)
internal/security/security.go: Simulated scanning (always returns clean)cmd/security_test/main.go: Entirely simulated security tests
- Use
HELIX_AUTH_JWT_SECRETwith high entropy - Enable PostgreSQL SSL in production
- Enable Redis authentication
- Configure CORS
allowed_originsexplicitly - Enable audit logging
- Set
bcrypt_cost: 14in production
- NO CI/CD pipelines (Note: existing workflow files in
.github/workflows/are legacy and must not be expanded) - NO HTTPS for Git (SSH only)
- NO manual container commands (orchestrator-owned)
- 100% Test Coverage (unit, integration, E2E, automation, security, benchmark)
- Challenge Coverage (every component)
- Real Data (actual API calls, real DB, live services)
- Health & Observability (health endpoints, circuit breakers)
- Documentation & Quality (update docs with code changes)
- Validation Before Release (full suite + all challenges)
- No Mocks in Production
- Comprehensive Verification (runtime, compile, structure, dependencies, compatibility)
- Resource Limits (30-40% of host resources max)
- Bugfix Documentation (root cause, affected files, fix, verification link)
- Real Infrastructure for All Non-Unit Tests
- Reproduction-Before-Fix (Challenge first, then fix)
- Concurrent-Safe Collections
A change is NOT done because code compiles. "Done" requires:
- Pasted terminal output from a real run
- No self-certification words without evidence
- Demo commands that run against real artifacts
- Loud skips with
SKIP-OK: #<ticket>markers
A test or Challenge that PASSES is a CLAIM that the tested behavior works for the end user of the product.
The HelixAgent project has repeatedly hit the failure mode where every test ran green AND every Challenge reported PASS, yet most product features did not actually work — buggy challenge wrappers masked failed assertions, scripts checked file existence without executing the file, "reachability" tests tolerated timeouts, contracts were honest in advertising but broken in dispatch. This MUST NOT recur in HelixCode.
Every PASS result MUST guarantee: a. Quality — correct behavior under real inputs, edge cases, concurrency b. Completion — wired end-to-end with no stub/placeholder gaps c. Full usability — a user following documentation succeeds
A passing test that doesn't certify all three is a bluff and MUST be tightened.
- Wrapper bluff — assertions PASS but wrapper's exit-code logic is buggy
- Contract bluff — system advertises capability but rejects it in dispatch
- Structural bluff — file exists but doesn't contain working code
- Comment bluff — comment promises behavior code doesn't have
- Skip bluff —
t.Skip("not running yet")withoutSKIP-OK: #<ticket>marker
The taxonomy is illustrative, not exhaustive. Every Challenge or test added going forward MUST pass an honest self-review against this taxonomy before being committed.
Verbatim user mandate: "We had been in position that all tests do execute with success and all Challenges as well, but in reality the most of the features does not work and can't be used! This MUST NOT be the case and execution of tests and Challenges MUST guarantee the quality, the completion and full usability by end users of the product!"
Operative rule: The bar for shipping is not "tests pass" but "users can use the feature." Every PASS in this codebase MUST carry positive runtime evidence captured during execution. Metadata-only / configuration-only / absence-of-error / grep-based PASS without runtime evidence are critical defects regardless of how green the summary line looks. No false-success results are tolerable.
No API key, token, password, certificate, or other credential may be committed to any repository owned by HelixDevelopment or vasic-digital. All secrets live in .env files (mode 0600) listed in .gitignore. Any leak is a release blocker until rotated and post-mortemed.
No force push, force-with-lease push, history rewrite, branch deletion of main/master, or upstream-overwriting operation may be performed without explicit, in-conversation user approval per operation. Authorization for one push does not extend further. Bypassing hooks / signing / protected-branch rules also requires explicit approval.
The docs/CONTINUATION.md document MUST be maintained in sync with actual programme state at all times. It is the authoritative resumption record for any CLI agent or LLM picking up the CLI-Agent Fusion programme. Every commit that advances state — task completion, feature close-out, push, known-issue discovery, deferred-item resolution, phase transition, submodule/remote add or remove — MUST update CONTINUATION in the same commit. Out-of-sync CONTINUATION is a CRITICAL DEFECT — same severity as a false-success test result under CONST-035 / Article XI §11.9. Cascade applies to every owned-by-us repo's three governance files. See CONSTITUTION.md Article XIII §13.1 for the full mandate (definition of out-of-sync, verification plan, cascade requirement).
ALL container distribution targets SHALL be configured exclusively through CONTAINERS_REMOTE_HOST_N_* env vars in containers/.env. NO host (hostname, IP, user, key path, runtime, label) may be hardcoded in ANY source file, test, challenge, config template, script, or governance document. Adding/removing hosts = editing containers/.env only; NO code change. Tests SHALL read .env at runtime and skip with SKIP-OK: marker when CONTAINERS_REMOTE_ENABLED=false. See CONSTITUTION.md §CONST-045 for the full mandate.
Rule: LLMsVerifier SHALL BE the sole authoritative source for:
- All model metadata (names, IDs, context windows, capabilities)
- All provider metadata (endpoints, auth types, supported models)
- All verification status (verified, partial, failed, pending)
- All scoring data (overall scores, capability scores, tier rankings)
Prohibition: NO hardcoded model lists, NO hardcoded provider lists, NO simulated model discovery. Any code path that presents a model or provider listing to a user MUST fetch that listing from the LLMsVerifier subsystem or its cached replica.
Anti-Bluff Verification:
- Challenge script
scripts/anti_bluff/verifier_hardcode_check.shscans all Go source files for hardcoded model arrays. - The only permitted hardcoded data is the 7-entry fallback list in
internal/verifier/fallback_models.go.
Rule: Every model displayed to an end user MUST have been verified by LLMsVerifier within the last 24h. Models older than this MUST display a "stale" indicator and be deprioritized.
Anti-Bluff Testing:
- Unit tests MAY mock the verifier client.
- Integration tests MUST start the verifier server and perform real provider discovery.
- The Makefile target
make test-verifier-integrationMUST exist and run without mocks.
Rule: Model status (available, rate-limited, cooldown, offline, deprecated) displayed to users MUST reflect the actual state as known by LLMsVerifier within 60 seconds.
Polling vs. Push:
- If WebSocket/SSE push is unavailable, the system MUST poll LLMsVerifier at most every 60s.
- The TUI MUST display a "last updated" timestamp with every model listing.
- Models in "cooldown" or "rate-limited" state MUST show the estimated recovery time if known.
Rule: HelixCode MUST integrate with ALL providers that LLMsVerifier supports, subject only to:
- The provider being explicitly disabled in configuration (
enabled: false) - The API key being absent and the provider requiring one
- The provider being marked
deprecatedin the verifier database
Minimum Provider Set (SHALL NOT be reduced without constitutional amendment): OpenAI, Anthropic, Gemini, DeepSeek, Groq, Mistral, xAI, OpenRouter, Ollama, Llama.cpp.
Rule: LLMsVerifier integration SHALL extend beyond basic model listing to cover ALL capability dimensions:
- MCP: The verifier MUST report which models support MCP tool calling.
- LSP: The verifier MUST report code-analysis capabilities.
- ACP: The verifier MUST report multi-agent coordination support.
- Embedding: The verifier MUST report
supports_embeddingsfor each model. - RAG: The verifier MUST report context-window sizes for chunking strategies.
- Skills / Plugins: The verifier MUST track plugin compatibility.
Prohibition: Capability flags MUST NOT be hardcoded. The Provider.GetCapabilities() method MUST return data sourced from the verifier's VerificationResult fields.
Rule: NO user-facing text, prompt template, question text, error message, label, helper text, or explanatory content may be hardcoded as a static literal string in any source file. All such content MUST be:
- Generated dynamically by an LLM at runtime based on user context (language, prompt content, session state), OR
- Loaded from an i18n resource file (
.yaml,.json,.toml) that supports locale overrides, OR - Composed programmatically from verifier metadata, configuration data, or provider responses.
Rationale: Hardcoded English text silently breaks the product for non-English users. A clarification question hardcoded as "Which file has the bug?" may be shown identically to a Japanese, Serbian, or Spanish user — producing an incoherent, unusable experience. Every piece of user-facing text MUST adapt.
Violation examples: []string{"Which file has the bug?"}, hardcoded TUI labels, static error prose, hardcoded English prompt templates.
Compliant patterns: LLM-generated questions via DetectAmbiguity(prompt), i18n-loaded labels with locale fallback, verifier-composed model descriptions.
Enforcement: make lint MUST include a hardcoded-text scan. Anti-bluff sweeps MUST flag obvious hardcoded-text patterns.
See CONSTITUTION.md §CONST-046 for the full mandate and cascade requirements.
Verbatim user mandate (2026-05-14): "Make sure all work we do is applied ALWAYS to all Submodules we control under our organizations (vasic-digital and HelixDevelopment) fully recursively everywhere with full bluff-proofing and comprehensive documentation, user manuals and guides and full tests and Challenges coverage!"
Every engineering deliverable produced for HelixCode MUST be applied — fully and recursively — to every owned submodule under the vasic-digital and HelixDevelopment GitHub organizations. Each owned submodule MUST receive in lockstep: (1) the same anti-bluff posture (CONST-035 / Article XI §11.9), (2) comprehensive documentation (README + user manuals + guides matching actual capabilities), (3) full tests + Challenges coverage with captured runtime evidence, (4) recursive propagation through nested submodules under the same orgs, (5) synchronized commits when meta-repo state advances a submodule's surface.
Owned submodule baseline (direct children as of 2026-05-14): vasic-digital/{Containers, Security, Challenges, LLMsVerifier, Models}, HelixDevelopment/{HelixQA, DocProcessor, LLMOrchestrator, LLMProvider, VisionEngine, HelixAgent}, HelixDevelopment-s-Code/Website — plus any nested submodule under these orgs (full recursion).
Cascade requirement: This rule (verbatim or by CONST-047 ID reference) MUST appear in every owned-by-us submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md.
See CONSTITUTION.md §CONST-047 for the full mandate and verification details.
Verbatim user mandate (2026-05-15): "Make sure that every feature, every functionality, every flow, every use case, every edge case, every service or application, on every platform we support is covered with full automation tests which will confirm anti-bluff policy and provide the proof of fully working capabilities, working implementation as expected, no issues, no bugs, fully documented, tests covered! Nothing less than this does not give us a chance to deliver stable product! This is mandatory constraint which MUST BE respected without ignoring, skipping, slacking or forgetting it!"
No feature / functionality / flow / use case / edge case / service / application on any supported platform of HelixCode is deliverable until automation tests prove six invariants: (1) anti-bluff posture with captured runtime evidence (CONST-035); (2) proof of working capability end-to-end on target topology; (3) implementation matches documented promise; (4) no open issues/bugs surfaced; (5) full documentation in sync; (6) four-layer test floor (pre-build + post-build + runtime + paired mutation). Coverage ledger regenerated at release-gate sweep.
Cascade requirement: This rule (verbatim or by CONST-048 ID reference) MUST appear in every owned-by-us submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.25 for the full mandate.
CONST-049: Constitution-Submodule Update Workflow Mandate (cascaded from constitution submodule §11.4.26)
Verbatim user mandate (2026-05-15): "Every time we add something into our root (constitution Submodule) Constitution, CLAUDE.MD and AGENTS.MD we MUST FIRST fetch and pull all new changes / work from constitution Submodule first! All changes we apply MUST BE commited and pushed to all constitution Submodule upstreams! In case of conflict, IT MUST BE carefully resolved! Nothing can be broken, made faulty, corrupted or unusable! After merging full validation and verification MUST BE done!"
7-step pipeline: (1) fetch + pull first inside constitution submodule; (2) apply with §11.4.17 classification + verbatim mandate; (3) validate (meta-test + no conflict markers + cross-file consistency); (4) commit + push to EVERY configured upstream (governance files only, no git add -A); (5) careful conflict resolution preserving union — force-push forbidden (CONST-043); (6) post-merge cascade verification (CONST-047); (7) bump consuming .gitmodules pointer in SAME commit.
Cascade requirement: This rule (verbatim or by CONST-049 ID reference) MUST appear in every owned-by-us submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.26 for the full mandate.
CONST-050: No-Fakes-Beyond-Unit-Tests + 100%-Test-Type-Coverage Mandate (cascaded from constitution submodule §11.4.27)
Verbatim user mandate (2026-05-15): "Mocks, stubs, placeholders, TODOs or FIXMEs are allowed to exist ONLY in Unit tests! All other test types MUST interract with real fully implemented System! No fakes, empty implementations or bluffing is allowed of any kind! All codebase of the project MUST BE 100% covered with every supported test type: unit tests, integration tests, e2e tests, full automation tests, security tests, ddos tests, scaling tests, chaos tests, stress tests, performance tests, benchmarking tests, ui tests, ux tests, Challenges (fully incorporating our Challenges Submodule). EVERYTHING MUST BE tested using helix_qa (fully incorporating helix_qa Submodule). helix_qa MUST BE used with all possible written tests suites (test banks) for every applications, service, platform, etc and execution of the full helix_qa QA autonomous sessions! All required dependency Submodules MUST BE added into the project as well (fully recursive!!!)."
Two invariants. (A) Mocks/stubs/fakes/placeholders/TODOs/FIXMEs/"for now"/empty-implementation patterns PERMITTED only in unit-test sources; non-unit tests (integration, E2E, full-automation, security, DDoS, scaling, chaos, stress, performance, benchmarking, UI, UX, Challenges, HelixQA) MUST exercise the real, fully implemented HelixCode system. Production code MUST NOT import helix_code/internal/mocks/. (B) Codebase MUST be covered by every supported test type the domain warrants: unit, integration, E2E, full-automation, security, DDoS, scaling, chaos, stress, performance, benchmarking, UI, UX, Challenges (vasic-digital/Challenges submodule at ./challenges/ fully incorporated), helix_qa (HelixDevelopment/HelixQA submodule at ./helix_qa/ fully incorporated, with full autonomous QA sessions executing every registered test bank with captured wire evidence).
Required dependency submodules (recursive per CONST-047): Challenges (git@github.com:vasic-digital/Challenges.git at ./challenges/) + helix_qa (git@github.com:HelixDevelopment/HelixQA.git at ./helix_qa/) + any other functionality submodules under vasic-digital/HelixDevelopment orgs HelixCode depends on. Pointers bumped to upstream HEAD in same commit as cascade work (CONST-049 step 7).
Cascade requirement: This rule (verbatim or by CONST-050 ID reference) MUST appear in every owned-by-us submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.27 for the full mandate.
CONST-051: Submodules-As-Equal-Codebase + Decoupling + Dependency-Layout Mandate (cascaded from constitution submodule §11.4.28)
Verbatim user mandate (2026-05-15): "All existing Submodules in the project that we are controlling and belong to some our organizations (vasic-digital, HelixDevelopment, red-elf, ATMOSphere1234321, Bear-Suite, BoatOS123456, Helix-Flow, Helix-Track, Server-Factory) are equal parts of the project's codebase! We MUST work on that code as much as we do with main project's codebase! ... We MUST NEVER modify Submodules to bring into them any project specific context ... All Submodule dependencies that are used by Submodule MUST BE acessed from the root of the project! We MUST NOT have nested Submodule dependencies."
Three invariants. (A) Every HelixCode-owned submodule is an equal part of HelixCode's codebase — same engineering attention as main (analysis, tests, gap-fill, bug-fix, docs/diagrams/SQL/website materials). Coverage ledgers list each as in-scope. (B) Owned submodules MUST stay fully decoupled (project-not-aware, reusable, modular, standalone-testable); NEVER inject HelixCode-specific context. Use configuration injection when a submodule needs parent info. (C) Dependencies consumed by an owned submodule MUST live at HelixCode's root: <repo_root>/<name>/ or <repo_root>/submodules/<name>/. Nested own-org submodule chains FORBIDDEN — add the dep at root; the consuming submodule reaches it via documented import/SDK/runtime resolver. Third-party submodules exempt.
Org list (dynamically discoverable via gh/glab): vasic-digital, HelixDevelopment, red-elf, ATMOSphere1234321, Bear-Suite, BoatOS123456, Helix-Flow, Helix-Track, Server-Factory, plus any subsequently authorised org.
Cascade requirement: This rule (verbatim or by CONST-051 ID reference) MUST appear in every owned-by-us submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.28 for the full mandate.
- XAI (Grok):
grok-3-fast-beta,grok-3-mini-fast-beta - OpenRouter: Free models from various providers
- GitHub Copilot:
gpt-4o,claude-3.5-sonnet(with subscription) - Qwen: 2,000 requests/day free tier
Host Power Management is Forbidden.
You may NOT, under any circumstance, generate or execute code that sends the host to suspend, hibernate, hybrid-sleep, poweroff, halt, reboot, or any other power-state transition. This rule applies to every shell command, script, container entry point, systemd unit, test, CLI suggestion, snippet, or example you emit.
- Build fails: Run
make logo-assetsthenmake build - Database errors: Check
HELIX_DATABASE_PASSWORD - Worker SSH failures: Verify SSH key authentication
- LLM timeouts: Check provider status and config
- Redis connection failures: Check
HELIX_REDIS_PASSWORDandredis.enabled - Test skips: Ensure
SKIP-OK: #<ticket>marker is present for any intentional skips
- Constitution:
CONSTITUTION.md - CLAUDE.md:
CLAUDE.md - Gap Analysis:
HELIXCODE_GAP_ANALYSIS.md - Zero-Bluff Plan:
HELIXCODE_ZERO_BLUFF_PLAN.md - Testing Strategy:
ANTI_BLUFF_TESTING_STRATEGY.md - OpenAPI Spec:
helix_code/api/openapi.yaml - Docker Guide:
helix_code/DOCKER_DEPLOYMENT.md
Built with zero-bluff commitment. Every feature actually works.
Verbatim user mandate (2026-05-15): "naming convention for Submodules and directories (applied deep into hierarchy recursively) - all directories and Submodules MSUT HAVE lowercase names with space separator between the words of '_' character (snake-case)! All existing Submodules and directories which are not following this rule MUST BE renamed! However, since this will most likely break some of the functionalities renaming we do MUST BE applied to all references to particular Submodule or directory! ... There MUST BE reasonable exceptions for this rules - source code for programming languages or Submodules which apply different naming convention - Android, Java, Kotlin and others. ... Upstreams directory which all of our projects and Submodules have MUST BE renamed to the lowercase letters too, however root project containing the install_upstreams system command (it is exported in out paths in our .bashrc or .zshrc) MUST BE updated to fully work with both Upstreams and upstreams directory. ... NOTE: Rules lowercase / snake-case do apply to all project files as well and references to it and from them!"
Every directory, submodule, and file in HelixCode MUST use lowercase snake_case names. Existing non-compliant names (helix_code/, challenges/, containers/, helix_agent/, helix_qa/, security/, github_pages_website/, upstreams/, dependencies/, etc.) MUST be renamed as part of the phased migration opened by this clause. Every reference (configs, docs, links, source-code imports, governance files) MUST be updated atomically with the rename — reference drift after a rename is a CONST-052 violation of equal severity to the rename itself.
Common-sense exceptions (technology-preserving): language-mandated case for Java/Kotlin/Android/Apple/C#/Swift INSIDE the language root (submodule root follows our convention; subtree follows language convention); vendor/upstream third-party submodules keep upstream names; build artefacts (node_modules, __pycache__, .git, target, build, bin) keep tool-mandated names. The test "does renaming break the technology?" trumps the rule.
upstreams/ → upstreams/ transition: the constitution submodule's install_upstreams.sh (exported via .bashrc/.zshrc) supports BOTH upstreams/ and upstreams/ directory layouts (commit 45d3678 of the constitution submodule); lowercase wins when both present.
Test coverage of renames (per CONST-050(B)): every rename batch ships with (i) regression test verifying every reference now resolves, (ii) full test-type matrix run post-rename, (iii) anti-bluff wire-evidence captured.
Phased execution per the operator's explicit instruction: comprehensive brainstorming → phase-divided plan → fine-grained tasks/subtasks → every change covered by every applicable test type. §11.4.20 subagent delegation for cross-cutting rename sweeps.
Cascade requirement: This anchor (verbatim or by CONST-052 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a §11.4 PASS-bluff at the reference-integrity layer. No escape hatch beyond the common-sense exceptions enumerated above. See constitution submodule Constitution.md §11.4.29 for the full mandate.
CONST-053: .gitignore + No-Versioned-Build-Artifacts Mandate (cascaded from constitution submodule §11.4.30)
Verbatim user mandate (2026-05-15): "every project module, every Submodule, every servcie and apolication MUST HAVE proper .gitignore file! We MUST NOT git version build artifacts, cache files, tmp files, main .env file(s) or any files containing sensitive data, API keys or token! Any build derivate which we can recreate by executing proper mechanism for generating MUST NOT be versioned! We MUST pay attention what is going to be commited every time we are preparing to execute commit! If any violetion is detected it MUST be fixed before commit is executed!"
Every project module, owned-by-us submodule, service, and application MUST ship a proper .gitignore. Forbidden-from-version-control classes:
- Build artefacts:
/bin/,/build/,/dist/,/out/,target/,*.exe,*.dll,*.so,*.dylib,*.a,*.o,*.class,*.pyc, generator-produced files when the generator is committed. - Cache files:
__pycache__/,.pytest_cache/,.mypy_cache/,.ruff_cache/,node_modules/,.next/,.cache/,.gradle/,.terraform/, language-server caches. - Temp files:
*.tmp,*.swp,*~,.DS_Store,Thumbs.db,*.orig,*.rej. - Sensitive-data files:
.env,.env.*(allow.env.exampleplaceholder only — no real secrets even as examples),*.pem,*.key,*.crt,id_rsa*,id_ed25519*,.netrc,secrets/,api_keys.sh. - Generated reports/logs:
*.log,coverage.out,htmlcov/, runtime captures unless reference assets. - OS/IDE personal state:
.idea/,.history/,.vscode/(except shared settings).
Anti-bluff invariant: .gitignore line alone is not sufficient — no file matching the forbidden patterns may be CURRENTLY TRACKED. A tracked *.log despite the ignore-line is a violation of equal severity to no ignore-line at all.
Pre-commit attention: every commit author (human OR agent) MUST inspect git diff --staged + git status BEFORE executing the commit. Forbidden-class hits abort the commit until fixed (un-stage, add to .gitignore, scrub if already-tracked). Gate CM-GITIGNORE-PRECOMMIT-AUDIT + paired mutation.
Secret-leak intersection (CONST-042 / §11.4.10): a .env leak is BOTH a CONST-053 and a CONST-042 violation; rotation + post-mortem required.
Recreatable-content test: if a documented mechanism regenerates the file from sources, it is a build derivative and MUST be ignored. The committed sources MUST include the generator.
Cascade requirement: This anchor (verbatim or by CONST-053 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a §11.4 PASS-bluff at the repository-hygiene layer. See constitution submodule Constitution.md §11.4.30 for the full mandate.
Verbatim user mandate (2026-05-15): "We MUST HAVE mechanism for each Submodule to determine / know what are its Submodule dependencies so new projects or palces we are incorporate them can add these Submodules to the project root and make them available! Suggested idea is configuration file with expected Submodules Git ssh urls perhaps? New project can read it, and recursively add each Submodule to the root of the project and install / expose it to veryone."
Every owned-by-us submodule MUST ship helix-deps.yaml at its root declaring its own-org dependencies. Schema: schema_version, deps: [{name, ssh_url, ref, why, layout: flat|grouped}], transitive_handling.{recursive,conflict_resolution}, language_specific_subtree. Tooling: incorporate-submodule <ssh-url> adds the submodule at the parent project's canonical path (CONST-051(C)), reads helix-deps.yaml, recurses for each declared dep, aborts on conflicting refs, emits <root>/.helix-manifest.yaml audit record.
Anti-bluff guarantee: every manifest paired with a Challenge that bootstraps a throwaway consuming project, runs incorporate-submodule, asserts produced layout matches the manifest, runs the submodule's own tests against the bootstrapped layout, captures wire evidence per §11.4.2. A manifest without this proof is a CONST-054 violation.
§11.4.31 / CONST-054 is the operational complement of CONST-051(C): nested own-org submodule chains are FORBIDDEN — manifests are the bridge that lets consumers reconstruct the dependency graph at the parent root.
Cascade requirement: This anchor (verbatim or by CONST-054 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to §11.4 PASS-bluff at the dependency-graph layer. See constitution submodule Constitution.md §11.4.31 for the full mandate.
CONST-055: Post-Constitution-Pull Validation Mandate (cascaded from constitution submodule §11.4.32)
Verbatim user mandate (2026-05-15): "Every time we fetch and pull new changes on constitution Submodule we MUST process the whole project and all Submodule (deep recursively) for validation and verification taht every single rule or mandatory constraint is followed and respected! If it is not, IT MUST BE!"
Whenever a project's constitution submodule is fetched + pulled with any content change, the project MUST run scripts/verify-all-constitution-rules.sh BEFORE the new constitution HEAD is treated as canonical for any other work. The sweep re-runs the governance-cascade verifier AND every implementable rule gate (CONST-053 .gitignore audit, CONST-051(C) nested-own-org-chain audit, CONST-052 case audit, CONST-050(A) mock-from-production audit, CONST-035 anti-bluff smoke, etc.) against the post-pull tree. Failures populate the project's Issues tracker per §11.4.15 (Status: Reopened, Type: Bug); closure requires positive-evidence per §11.4.
Pull-time invocation: git submodule update --remote constitution triggers the sweep automatically (post-update hook OR commit-wrapper invocation). Operator-explicit manual invocation also available.
Anti-bluff: the sweep's own meta-test (paired mutation per §1.1) plants a known violation of each enforced gate and asserts the sweep reports FAIL for the planted gate. A sweep that exits PASS without running every implementable gate is a CONST-055 violation.
CONST-055 is the enforcement engine for every other §11.4.x and CONST-NNN rule — without it, new rules cascade as anchors but never get enforced.
Cascade requirement: This anchor (verbatim or by CONST-055 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to §11.4 PASS-bluff at the constitutional-enforcement layer. See constitution submodule Constitution.md §11.4.32 for the full mandate.
CONST-056: Mandatory install_upstreams on clone/add Mandate (cascaded from constitution submodule §11.4.36)
Verbatim user mandate (2026-05-15): "Every Submodule or Git repository we add or clone MUST BE upstreams installed using Upstreamable utility which MUST BE available through exported paths of the host system (in .bashrc or .zhrc) using install_upstreams command executed from the root of the cloned (added) repository - only if in it is Upstreams or upstreams directory present with bash script files (recipes) for all repository's upstreams!"
Every clone / add of a Git repository under HelixCode MUST be followed by install_upstreams invocation from the repository's root IF its tree contains upstreams/ (or legacy upstreams/ per CONST-052 transition) populated with *.sh recipe files. The utility (installed on operator's PATH via .bashrc/.zshrc; implementation in the constitution submodule's install_upstreams.sh — already supports BOTH directory names since constitution commit 45d3678) reads the recipe files, configures every declared upstream as a named git remote, and fans out origin push URLs.
Skipping the invocation when upstreams/ is present silently breaks §2.1 (multi-upstream push is the norm) — the next push lands on only one upstream. Gate CM-INSTALL-UPSTREAMS-ON-CLONE + paired mutation. Automation: the future incorporate-submodule per CONST-054 auto-invokes; manual invocation supported. Pre-commit check: git remote -v | grep -c push reports expected count.
Cascade requirement: This anchor (verbatim or by CONST-056 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.36 for the full mandate.
Every project tracking work items by Type per §11.4.16 MUST close them with the Type-appropriate terminal **Status:** value, drawn from this 3-element closed map:
Item **Type:** |
Closure **Status:** value |
|---|---|
Bug |
Fixed (→ Fixed.md) |
Feature |
Implemented (→ Fixed.md) |
Task |
Completed (→ Fixed.md) |
The (→ Fixed.md) suffix is preserved across all three so the existing migration-discipline tooling (atomic Issues.md → Fixed.md move per §11.4.19) keeps working without per-Type branching. Generators (generate_issues_summary.sh, generate_fixed_summary.sh, the §11.4.23 colorizer) MUST treat the three terminal values as semantically equivalent (all "closed, positive evidence captured") while preserving the literal in the emitted document.
Closing a Feature with Fixed (→ Fixed.md) or a Task with Implemented (→ Fixed.md) is a CONST-057 violation. Gate CM-CLOSURE-VOCAB-TYPE-AWARE walks every Fixed.md heading + every Issues.md heading whose **Status:** is one of the three terminal values and asserts the Status-Type match. Composes with §11.4.15 / §11.4.16 / §11.4.19 / §11.4.23.
Cascade requirement: This anchor (verbatim or by CONST-057 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.33 for the full mandate.
Every Issues.md (or equivalent project tracker) heading whose **Status:** is Reopened MUST carry, within 8 non-blank lines of the heading, a **Reopened-Details:** line capturing four sub-facts:
- By:
AIorUser(source-of-truth observer who flipped the status).AIcovers in-loop reopens (test failure, gate regression, captured-evidence retrospect).Usercovers operator-side observations (manual testing, end-user report, design reconsideration). - On: ISO date (
YYYY-MM-DD). - Reason: one-line cause classification — chosen from the closed vocabulary
{ test-failed | manual-testing-detected | captured-evidence-contradicts | end-user-report | cycle-re-discovered | design-reconsidered }. Other values permitted with explicitReason: <free text>annotation but the closed list MUST be tried first. - Evidence: path to or short description of the captured artefact justifying the reopen — log file, recording, gate failure ID, operator quote, etc. Reopens without evidence are §11.4.6 / §11.4.7 violations (demotion from Fixed requires captured evidence under the conditions that re-exposed the defect).
The Issues_Summary.md Status column MUST distinguish the four Reopened sub-states by source so a sweep query for "reopens by AI in the last 30 days" is mechanically possible. Suggested column rendering: Reopened (AI: test-failed) vs Reopened (User: manual-testing). Gate CM-ITEM-REOPENED-DETAILS mirrors CM-ITEM-OPERATOR-BLOCKED-DETAILS (§11.4.21 walk pattern). Composes with §11.4.6 / §11.4.7 / §11.4.15 / §11.4.21.
Cascade requirement: This anchor (verbatim or by CONST-058 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.34 for the full mandate.
The constitution submodule's three files (constitution/Constitution.md, constitution/CLAUDE.md, constitution/AGENTS.md) ARE the canonical root (also called the parent files). They contain only universal rules per §11.4.17.
The consuming project's repository-root files (<project-root>/CLAUDE.md, <project-root>/AGENTS.md, optionally <project-root>/Constitution.md) are consumer extensions. They MUST start with the inheritance pointer (either the Claude-Code native @constitution/CLAUDE.md import or the portable ## INHERITED FROM constitution/CLAUDE.md heading). They contain only project-specific rules per §11.4.17.
When in doubt about which file to edit: universal rule → constitution submodule's file; project-specific rule → consumer's file. Default consumer-side when uncertain (§11.4.17 — narrower scope is cheap to widen).
Terminology: "the parent CLAUDE.md" / "the root Constitution" → constitution-submodule file at constitution/<filename>; "the project CLAUDE.md" / "this project's AGENTS.md" → consumer-side file at <project-root>/<filename>.
No silent demotion or silent promotion. Moving a rule between layers MUST be a visible commit — git mv of a section if it's a clean clone, or explicit Lifted from <project> to constitution per §11.4.35 / Demoted from constitution to <project> per §11.4.35 commit-message annotation.
Gate CM-CANONICAL-ROOT-CLARITY verifies (a) consumer's CLAUDE.md opens with the inheritance pointer, (b) constitution submodule's three files are present at the expected path, (c) no ## INHERITED FROM block in the constitution submodule's own files (those ARE the source-of-truth, not consumers). Composes with §11.4.17.
Cascade requirement: This anchor (verbatim or by CONST-059 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. See constitution submodule Constitution.md §11.4.35 for the full mandate.
Verbatim user mandate (2026-05-15): "Make sure that feedback_fetch_before_edit memory rule is part of our constitution Submodule - the root Consitution, AGENTS.MD and CLAUDE.MD. Validate and verify that Proejct-Toolkit and all Submodules do inherit all of them! Follow the constitution Submodule documentation for details."
The FIRST git-touching action of every session, on every consuming project (owned or third-party), MUST be:
git fetch --all --prune
git log --oneline HEAD..@{u}
git submodule foreach --recursive 'git fetch --all --prune --quiet'If HEAD..@{u} is non-empty, integrate the upstream changes BEFORE any local edit. Acting on stale local state produces three failure modes documented in the originating §11.4.37 incident (multi-agent / parallel-session work): (1) redundant work — the agent re-does what a parallel session already finished, (2) false confidence — completion reports for already-done work, (3) divergent history — duplicate sibling commits that double the conflict surface on next push.
Anti-bluff invariant: the fetch+log check MUST produce captured evidence — the actual HEAD..@{u} output, even if empty. Skipping the check on the basis of "I just fetched" or "nothing could have changed in the last N minutes" is a §11.4.6 (no-guessing) violation: the remote state is not knowable without a fetch.
Cascade requirement: This anchor (verbatim or by CONST-060 ID reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to §11.4 PASS-bluff at the parallel-session-coordination layer. See constitution submodule Constitution.md §11.4.37 for the full mandate.
§11.4.68 — Positive Sink-Side / Downstream Evidence Mandate (cascaded from constitution submodule §11.4.68)
Verbatim user mandate (2026-05-20): "We still do not hear any audio played from D3 device! Arvus Web Dashboard when we play music from D3 shows nothing for Codec In Use! This MUST BE investigated and fixed! How come we passed the tests with Arvus validation? What were values for the Codec In Use field? Empty means nothing! This is not working! It MUST BE FIXED, TESTED AND VERIFIED WITH FULL AUTOMATION TESTING ASAP!!!"
A test that asserts audio or video routing PASS MUST capture and verify positive sink-side or downstream evidence — never config-only, never metadata-only, never PCM-open-state-only. At least one of the closed enumeration MUST be captured for every audio/video routing PASS: (1) sink-side codec-state with non-empty Codec-In-Use matching the expected codec regex; (2) strictly-positive PCM frames-written delta from /proc/asound/.../status hw_ptr; (3) ALSA ELD/EDID-Like-Data showing negotiated channel count + format; (4) ffprobe-on-captured-mp4 with non-zero frame count + expected codec/resolution/fps; (5) recording-analyzer event match per §11.4.2/§11.4.5; (6) tinycap RMS amplitude above the line-level floor. Empty / <unreachable> / <N.E.> / <None> placeholders are NOT positive evidence; a missing-but-required sink is OPERATOR-BLOCKED (release-blocker), never SKIP, never PASS. No escape hatch — no --skip-sink-evidence, --allow-empty-codec, --sink-unreachable-is-pass, --metadata-only-suffices flag exists.
Cascade requirement: This anchor (verbatim or by §11.4.68 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a §11.4 PASS-bluff at the sink-side-evidence layer.
Canonical authority: constitution submodule Constitution.md §11.4.68 for the full mandate.
Verbatim user mandate (2026-05-20): "Always do if possible Subagent-driven! Add this into our root (constitution Submodule) Constitution.md, CLAUDE.md and AGENTS.md. This should be the default choice ALWAYS!"
When executing implementation plans (or any task-decomposed execution flow), the default execution model is subagent-driven per superpowers:subagent-driven-development. Inline execution is permitted ONLY when (a) the task is trivial AND fits a single sub-300-line edit, OR (b) the operator explicitly requests inline at brainstorm-handoff time. Subagent-driven is the default because it gives isolated context per task, naturally enforces two-stage review, is parallel-PWU compatible (§11.4.58), creates an anti-bluff seam (§11.4), and survives operator absence. No escape hatch — --inline-execution-required, --no-subagents, --monolithic-execution are NOT permitted flags. Skipping subagent-driven for non-trivial work without recorded operator authorisation is itself a §11.4 PASS-bluff.
Cascade requirement: This anchor (verbatim or by §11.4.70 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a §11.4 PASS-bluff at the execution-model layer.
Canonical authority: constitution submodule Constitution.md §11.4.70 for the full mandate.
§11.4.71 — Pre-Push Fetch + Investigate + Integrate Mandate (cascaded from constitution submodule §11.4.71)
Verbatim user mandate (2026-05-20): "before pushing changes to any upstream for any repository - main repo or Submodule, we MUST fetch and pull all changes. Once these are obtained WE MUST investigate what is different compared to head position we were on last time before fetching and pulling new changes! We MUST understand what is done and for what purpose, easpecially how that does affect our project and our System in general! Any mandatory changes or improvements required by fresh changes we just have brough in MUST BE incorporated, covered with all supported types of the tests which will produce as a result of its success execution REAL PROOFS of working for all componetns and functionalities covered and work fully in anti-bluff manner!"
The everyday-push variant of §11.4.41. EVERY push (every repository — main + every submodule) MUST follow the 5-step cycle: (1) fetch all remotes (git fetch --all --prune --tags, capture stdout); (2) pull all upstream branches whose tip differs, resolving conflicts per consumer judgment (never auto---ours/--theirs); (3) investigate the diff vs OUR previous HEAD — read EVERY foreign commit's body, understand what/why/how-it-affects-our-system; (4) integrate mandatory changes with §11.4.4(b) four-layer coverage + §11.4.43 TDD-fix discipline, every PASS carrying §11.4.5 captured-evidence (REAL PROOFS, not metadata-only); (5) only then push, verifying with git ls-remote post-push. No escape hatch — no --skip-fetch, --no-investigate, --fast-push, --trust-upstream flag.
Cascade requirement: This anchor (verbatim or by §11.4.71 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a §11.4 PASS-bluff at the push-discipline layer.
Canonical authority: constitution submodule Constitution.md §11.4.71 for the full mandate.
Verbatim user mandate (2026-05-20): "Make sure all fixes for audio are always top priority in main working stream!"
The conductor (main working stream — Claude Code session, AI agent, or human operator) MUST treat audio fixes as the highest-priority class on the serial dispatch queue. Any time the conductor faces a choice between dispatching an audio task vs a non-audio task on the SAME serial resource, the audio task wins. Parallel BACKGROUND subagents (research, refactors, infrastructure documentation) MAY run concurrently with audio work but do NOT preempt audio on the main-stream serial dispatch queue. No escape hatch — there is no "but this non-audio task is faster" or "but this research is more interesting" override; audio-stack regressions are user-perceptible and high-impact while research and refactors can wait.
Cascade requirement: This anchor (verbatim or by §11.4.72 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a process violation at the dispatch-priority layer.
Canonical authority: constitution submodule Constitution.md §11.4.72 for the full mandate.
§11.4.73 — Main-Specification Document Versioning + Revision Discipline (cascaded from constitution submodule §11.4.73)
Verbatim user mandate (2026-05-20): "Make sure everything we add now in previous and upcoming requests IS ALWAYS applied to the main specification — if we have one. Since all these are not major changes we could increase Specification version per change for secondary version instead of the primary. Primary version MUST BE increased for much bigger levels of changes! Add this into root (constitution Submodule) Constitution.md, CLAUDE.md and AGENTS.md as mandatory rule / constraint applicable ONLY IF we have something like the main specification document or we do recognize something like the main specification document. Document MUST BE updated ALWAYS to follow the versioning rules we are appling here + revision and other properties we have!"
Applies only when a project recognises a main specification document. When it does: (1) every additive operator requirement, refinement, or accepted recommendation MUST be applied to the spec before or as part of the implementing work; (2) spec versioning has two axes — primary (V1/V2/V3, bumped for major rewrites by explicit operator decision, old versions archived) and secondary (the §11.4.61 metadata-table Revision integer, bumped for every other change); (3) the metadata table MUST stay current (Revision, Last modified, Status summary, Fixed); (4) propagated copies of the rule MUST reference the active specification.V<primary>.md, not a stale archive; (5) on primary bump the old file moves to <spec-dir>/archive/ with Status: superseded. Classification: universal, applicable conditionally per the scope condition.
Cascade requirement: This anchor (verbatim or by §11.4.73 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a release blocker when a project has a main spec and lets it drift.
Canonical authority: constitution submodule Constitution.md §11.4.73 for the full mandate.
§11.4.74 — Submodule-Catalogue-First Discovery + Extend-Don't-Reimplement (cascaded from constitution submodule §11.4.74)
Verbatim user mandate (2026-05-20): "We MUST ALWAYS check which already developed features / functionalities do exist as a part of our comprehensive Submodules catalogue located in vasic-digital and HelixDevelopment organizations on GitHub and GitLab both! Project MUST BE aware of all its existence so we do not implement same things multiple times if they are already done as some of existing universal, reusable general development purpose Submodules! For any missing features that some Submodules we incorporate may be missing we MUST IMPLEMENT the properly and extend those Submodules furter! We do control all of the and we CAN and MUST maintain and extend the regularly! All development cycle rules we have MUST BE applied to them and fully respected!"
Before scaffolding ANY new module, package, helper, or utility, the contributor (human or AI agent) MUST: (1) survey the canonical Submodule catalogue — vasic-digital and HelixDevelopment on both GitHub AND GitLab; (2) inventory existing Submodules; (3) reuse before reimplement — if a Submodule provides the functionality (or 80%+ of it), add it as a Git submodule rather than write fresh; (4) extend in-place when 80%+ matches but features are missing — add the missing features TO THAT SUBMODULE (PR upstream + bump pointer), never as a duplicating consuming-project helper; (5) apply all development-cycle rules to those Submodules; (6) document the survey result in the feature's tracker entry with a Catalogue-Check: field (reuse <org/repo>@<sha> / extend <org/repo>@<sha> / no-match <date>). Classification: universal.
Cascade requirement: This anchor (verbatim or by §11.4.74 reference) MUST appear in every owned submodule's CONSTITUTION.md, CLAUDE.md, and AGENTS.md. Severity-equivalent to a process violation; duplicate implementations landed without catalogue check are release blockers.
Canonical authority: constitution submodule Constitution.md §11.4.74 for the full mandate.