Personal / operator reference: stem separation + in-browser mixer + export, shipped as a SPA on React (Vite) with a Node (Express) API, Python FastAPI inference services (stem, speech, MIDI), optional S3 stem delivery, Clerk auth, and Stripe subscriptions / usage tokens.
End users of the public site do not read this repo; this file is for direction and deploy consistency (especially EC2 + Docker Compose builds).

| Layer | Role |
|---|---|
| `frontend/` | Upload, plan gating, polling, waveforms, mixer (Web Audio), export (WAV, MP3, ZIP of job stems; optional server master when env flags allow). Audio-to-MIDI conversion UI with piano roll, batch conversion, and interactive note editor. Clerk + Stripe.js. |
| `backend/` | Auth/usage, proxy to stem/speech/midi services, `/api/stems/file`, presigned S3 redirects, billing webhooks, malware scan hooks, rate limits, optional `POST /api/stems/server-export`. |
| `stem_service/` | FastAPI (port 5000): 2-stem default, expand to 4, quality modes, SCNet / hybrid Demucs paths (see `docs/stem-pipeline.md`, the single source of truth for routing). Optional S3 upload after job. |
| `speech_service/` | FastAPI (port 5001): LavaSR-based speech enhancement/denoising. Single-worker async queue, CPU inference (PyTorch). Requires model weights in `speech_models/`. |
| `midi_service/` | FastAPI (port 5002): Audio-to-MIDI transcription via Spotify Basic Pitch. Single-worker async queue, CPU inference (TensorFlow/ONNX). Model bundled in pip package; no external weights needed. |
| `docker-compose.yml` | Local / EC2: frontend (nginx → host 5173), backend, stem_service, speech_service, midi_service, shared `tmp/stems`, `tmp/speech`, `tmp/midi`, plus bind mounts `./models` → `/repo/models`, `./server_models` → `/repo/server_models`, and `./speech_models` → `/repo/speech_models`. |
| `models/` vs `server_models/` | Weights are not baked into Docker images (see `.dockerignore`; both dirs are omitted from the build context). Inference reads whichever subtree `STEM_MODELS_DIR` names; see § Models layout below + `docs/MODELS-INVENTORY.md`. |

Not in the Compose path: stem_api/ (Rust) — archived experiment; stem_api/README.md · docs/archive/IMPLEMENTATION-HYBRID.md.
Satellite Vite apps (not bundled in main Compose):

- `burnt-beats-pricing-structure/`: standalone pricing/transparency (subscriptions, pay-as-you-go, packs) for users who bounce before signup; see its `README.md`.
- `gamer_tag/`: casual block-dropping mini-game (Tetris-like; rename before broader marketing); see `README.md`.


Models layout:

- `models/`: canonical workstation tree. Filled only where you dev, via `scripts/copy-models.sh` from your huge upstream stem-models bank. That upstream tree can reach roughly 100 GiB; `copy-models.sh` copies a reduced set into `./models`, yet `models/` can still become very large. Never rsync/sync the upstream bank (or an entire bloated `./models`) to EC2 "just because it exists"; ship a curated tree instead (next bullet).
- `server_models/`: curated runtime tree used on Ubuntu. Build on the workstation: `python scripts/export_server_models.py`. That script always resolves exports from `./models` (see its docstring; it pins `STEM_MODELS_DIR=models` internally), then emits `server_models/` with exactly what inference needs. Typical layout on disk: `D:\burntbeats-aws\server_models` (Windows dev) mirrored to `/home/ubuntu/burntbeats-aws/server_models` on the instance. `server_models/` is gitignored; you maintain it per host.
- Container selection: `stem_service/config.py` resolves weights under `REPO_ROOT / $STEM_MODELS_DIR` (POSIX path inside Compose: `/repo/models` or `/repo/server_models`). Compose defaults `STEM_MODELS_DIR=models` (`docker-compose.yml`); recommended for AWS: set `STEM_MODELS_DIR=server_models` in root `.env` so prod never depends on workstation-only bulk.
- Reference docs:
  - `docs/MODEL-LAYOUT.md`: directory / script / Docker layout
  - `docs/MODEL-PATH-AND-SELECTION-INVESTIGATION-2026-04-15.md`: runtime path audit
  - `docs/DEPLOY-DOCKER-EC2.md`: EC2 Compose (§ Models)

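
The container-selection rule (weights resolved under `REPO_ROOT / $STEM_MODELS_DIR`, defaulting to `models`) can be sketched roughly as follows. This is a minimal illustration, not the contents of `stem_service/config.py`: the function name `resolve_models_dir`, the hard-coded `/repo` root, and the fallback for an empty value are assumptions made for the example.

```python
import os
from pathlib import PurePosixPath

# Assumed repo root inside the Compose containers (the bind mounts target /repo/...).
REPO_ROOT = PurePosixPath("/repo")


def resolve_models_dir(env=None, repo_root=REPO_ROOT):
    """Return repo_root / $STEM_MODELS_DIR, falling back to "models".

    Hypothetical helper: the real config module may validate or normalise
    the value differently.
    """
    env = os.environ if env is None else env
    name = (env.get("STEM_MODELS_DIR") or "").strip() or "models"
    return repo_root / name


# Example: resolve_models_dir({"STEM_MODELS_DIR": "server_models"})
# gives PurePosixPath("/repo/server_models").
```

Passing the environment as a plain mapping keeps the helper easy to exercise without touching the real process environment.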

Stem separation:

- Browser → `POST /api/stems/split` (Node: auth, metering, upload verify) → `stem_service`.
- Stem service returns 202 + `job_id`; work runs asynchronously (queued concurrency configurable).
- Browser polls `GET /api/stems/status/:job_id`.
- Stems load via `GET /api/stems/file/:job_id/:stemId.wav` (disk stream or S3 proxy when `progress.json` has `s3` metadata). See `docs/STEM-S3-AND-CPU.md`.
- Mix / export in browser; see `docs/ARCHITECTURE-FLOW.md` for client vs optional server export.

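
The submit, poll, fetch pattern in the stem flow above could be sketched like this. It is an illustration under stated assumptions: the response field `status` and the terminal values `completed` / `failed` are guesses at the status payload, and `fetch_status` is a hypothetical callable standing in for an authenticated HTTP GET that returns parsed JSON.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal states; the real status payload may use other names.
TERMINAL_STATES = ("completed", "failed")


def stem_file_path(job_id: str, stem_id: str) -> str:
    """Build the stem download path documented above."""
    return f"/api/stems/file/{job_id}/{stem_id}.wav"


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the status route until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout_s
    path = f"/api/stems/status/{job_id}"
    while True:
        status = fetch_status(path)
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP client and lets it be exercised without a running backend.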

Speech enhancement:

- Browser → `POST /api/speech/enhance` (Node: auth, metering, upload verify) → `speech_service`.
- Speech service returns 202 + `job_id`; LavaSR inference runs asynchronously.
- Browser polls `GET /api/speech/status/:job_id`.
- Enhanced audio via `GET /api/speech/file/:job_id/enhanced.wav`.


Audio-to-MIDI:

- Browser → `POST /api/midi/convert` (Node: auth, metering, upload or stem reference) → `midi_service`.
- MIDI service returns 202 + `job_id`; Basic Pitch inference runs asynchronously (typically 2-8 s).
- Browser polls `GET /api/midi/status/:job_id` (includes piano roll note data on completion).
- MIDI file via `GET /api/midi/file/:job_id/output.mid`.
- Optional: `POST /api/midi/merge` combines multiple completed jobs into a multi-track MIDI Type 1 file.
- Optional export orchestration: `POST /api/midi/export` with `mode`, `selected_stems`, and `source_jobs`.
  - Returns 202 + `export_id` and `export_token`.
  - Poll `GET /api/midi/export/status/:export_id`.
  - Download archive via `GET /api/midi/export/file/:export_id/stems.zip` (v1 `mode=stems`).

Generated MIDI job artifacts under tmp/midi/<job_id>/ also include metadata.json with conversion settings, note analysis, and an additive midi_file_analysis subtree derived from the emitted output.mid file.
From repo root:

```bash
docker compose build
docker compose up -d
docker compose ps
```

Health checks:

```bash
curl -fsS http://127.0.0.1:5173/api/health
curl -fsS http://127.0.0.1:5000/health
curl -fsS http://127.0.0.1:5001/health
curl -fsS http://127.0.0.1:5002/health
```

- Frontend (nginx): `127.0.0.1:5173`; same-origin `/api/*` is reverse-proxied to the backend container.
- Backend (Express): `127.0.0.1:3001` (published in default `docker-compose.yml` for localhost debugging; production often hides this behind the edge proxy only).
- Stem service: `127.0.0.1:5000`
- Speech service: `127.0.0.1:5001` (requires `speech_models/` volume mount)
- MIDI service: `127.0.0.1:5002` (model bundled in package; no external weights)

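
For scripted checks after `docker compose up -d`, the same four health URLs can be probed from Python. This is a small sketch rather than a script that ships in the repo: the names `HEALTH_URLS`, `http_ok`, and `check_all` are made up for the example, and any 2xx response is treated as healthy without inspecting the body.

```python
import urllib.request

# Same endpoints as the curl checks above.
HEALTH_URLS = {
    "frontend -> backend": "http://127.0.0.1:5173/api/health",
    "stem_service": "http://127.0.0.1:5000/health",
    "speech_service": "http://127.0.0.1:5001/health",
    "midi_service": "http://127.0.0.1:5002/health",
}


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (non-2xx) and socket timeouts all derive from OSError.
        return False


def check_all(urls=HEALTH_URLS, probe=http_ok):
    """Probe every URL and return a name -> healthy mapping."""
    return {name: probe(url) for name, url in urls.items()}
```

Calling `check_all()` on the host returns a dict such as `{"stem_service": True, ...}`; a wrapper could exit non-zero when any value is `False`.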
Scripts under scripts/ (run from repo root, bash):

- `bash scripts/run-all-local.sh` (stem + speech + midi + backend + frontend)
- `bash scripts/run-stem-service.sh`
- `bash scripts/run-speech-service.sh` (port 5001)
- `bash scripts/run-midi-service.sh` (port 5002)
- `bash scripts/run-backend.sh`
- `bash scripts/run-frontend.sh`

Helpers:

- `bash scripts/check-models.sh`
- `bash scripts/check-segments.sh`
- `bash scripts/test-stem-splits.sh`

Primary file for Compose: root .env (see each service’s .env.example where present).
| Area | Examples |
|---|---|
| Frontend build (VITE_*) | VITE_CLERK_PUBLISHABLE_KEY, VITE_STRIPE_PUBLISHABLE_KEY, optional VITE_API_BASE_URL, VITE_GA_MEASUREMENT_ID |
| Auth | CLERK_SECRET_KEY, Clerk webhook signing secret |
| Billing | STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET, STRIPE_PRICE_ID_* |
| Metering | USAGE_TOKENS_ENABLED |
| Job hardening | JOB_TOKEN_SECRET (per-job x-job-token), optional API_KEY |
| Speech service | SPEECH_SERVICE_API_TOKEN (service-to-service auth), SPEECH_MAX_UPLOAD_MB |
| MIDI service | MIDI_SERVICE_API_TOKEN (service-to-service auth), MIDI_MAX_QUEUE_DEPTH, MIDI_TOKEN_COST |
| Optional server master export | SERVER_EXPORT_ENABLED=1 (backend) · VITE_SERVER_EXPORT_ENABLED=1 (frontend build) — docs/ARCHITECTURE-FLOW.md. Default Compose does not enable this. |
| S3 | S3_ENABLED, bucket/region/keys, S3_DELETE_LOCAL_AFTER_UPLOAD; bucket CORS if browsers fetch presigned URLs |

Important behaviors

- Split/expand require Clerk when `USAGE_TOKENS_ENABLED=1` (see `backend/server.js` startup checks in production).
- `JOB_TOKEN_SECRET` binds status/file reads to signed tokens when set.
- `API_KEY`, if set, gates administrative routes / gateway auth per server config.

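
To make the `JOB_TOKEN_SECRET` behaviour concrete, here is one common way a per-job token can be bound to a job ID. It is a sketch only: this README does not describe the backend's actual token format, so the HMAC-SHA256 scheme and the helper names `sign_job_token` / `verify_job_token` are assumptions, and the real implementation lives in the Node backend rather than in Python.

```python
import hashlib
import hmac


def sign_job_token(secret: str, job_id: str) -> str:
    """Derive a hex token for job_id from the shared secret (assumed scheme)."""
    return hmac.new(
        secret.encode("utf-8"), job_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify_job_token(secret: str, job_id: str, token: str) -> bool:
    """Check a presented token (e.g. from an x-job-token header) for this job."""
    expected = sign_job_token(secret, job_id)
    # Constant-time comparison; compare bytes so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

With a scheme like this, a token issued for one job cannot be replayed against another job's status or file routes, and rotating the secret invalidates every outstanding token.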
Typical loop (Ubuntu + Docker):

```bash
git pull --ff-only origin main
docker compose build --no-cache backend frontend stem_service speech_service midi_service
docker compose up -d
docker compose ps
```

Details:

- `docs/DEPLOY-DOCKER-EC2.md`
- `docs/DEPLOY-SERVER-BUNDLE.md`
- `docs/DEPLOY-MARKETING-SITE.md` (separate pricing site package)

Pre-flight: docs/PRODUCTION-READINESS-CHECKLIST.md · docs/SANITY-CHECKS.md.

- Never commit real `.env` secrets.
- Rotate keys if they appear in logs.
- Keep stem service and temp dirs on trusted networks; enforce TLS at the edge for production.

Canonical “what runs”

- This `README.md`: stack + deploy overview
- `docs/ARCHITECTURE-FLOW.md`: request path, export, billing hooks, ops
- `docs/stem-pipeline.md`: model routing & quality modes

Index of everything else: docs/README.md
Env & integrations: docs/ENVIRONMENT-MATRIX.md · Legal / policy routing: docs/LEGAL-LAYOUT.md
Plans & backlog (not source of truth for behavior): docs/roadmap/
Benchmark tables (human-maintained): docs/benchmarks/
Archived investigations: docs/archive/
If a document contradicts stem-pipeline.md, ARCHITECTURE-FLOW.md, or this README for current runtime behavior, treat it as stale unless it lives under docs/roadmap/ or docs/research/ by design.