| project | riffstack |
|---|---|
| type | design-doc |
| status | vision |
| date | 2025-11-23 |
| author | Scott Senften |
| keywords | |
| beth_topics | |
A composable, declarative language for harmonic movement, voice-leading, and musical structure
This document defines the vision for a Harmony DSL that integrates with RiffStack and MLIR to create a complete creative compiler stack. Instead of writing chord symbols, musicians describe harmonic motion, color, and voice-leading constraints in a playful, composable language.
The Big Idea: Harmony is movement, not labels. The DSL captures intent; the compiler solves the voice-leading SAT problem.
Integration:
- Harmony DSL → musical structure (what notes, what chords)
- RiffStack → audio structure (what timbres, what effects)
- MLIR → multi-level compilation (harmony IR → pitch IR → audio IR → DSP IR → machine code)
This creates a layered ecosystem unlike anything available today.
A great harmony DSL balances musical intuition, expressive power, and algorithmic clarity.
Principle: Harmony is about motion, not symbols.
Instead of:
Cmin11 → F7 → Bbmaj9
Write what actually matters:
```
start Cmin11.lush
+5
-2
```
Why this matters:
- Captures the feeling of harmony (tension, release, gravity)
- Generalizes across keys, modes, tuning systems
- Focuses on musical intent, not notation
Motion Operators:
- `+N` / `-N` - semitone motion
- `T` - tritone substitution
- `5` - fifth up
- `=` - stay on root (reharmonize)
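As an illustrative sketch (the `apply_motion` helper and MIDI-number arithmetic are assumptions, not part of the DSL spec), the motion operators could resolve to concrete roots like this:

```python
# Sketch: resolving motion operators against a root pitch (MIDI numbers).
# The operator set mirrors the DSL; this resolver is a hypothetical helper.

def apply_motion(root: int, op: str) -> int:
    """Return the new root (MIDI note) after a motion operator."""
    if op == "T":                  # tritone substitution: root moves 6 semitones
        return root + 6
    if op == "5":                  # fifth up: +7 semitones
        return root + 7
    if op == "=":                  # stay on root (reharmonize)
        return root
    if op.startswith(("+", "-")):  # +N / -N: semitone motion
        return root + int(op)
    raise ValueError(f"unknown motion operator: {op}")

# 'start Cmin11 ... +5 ... -2' from the example above:
roots = [60]                       # C4 = MIDI 60
for op in ["+5", "-2"]:
    roots.append(apply_motion(roots[-1], op))
print(roots)  # [60, 65, 63] -> C, F, Eb
```

Because motion is relative, the same progression transposes freely: start from any root and the gestures are preserved.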
Principle: Describe what you want, not how to construct it.
```
min11.lush.smooth
maj9.dark.spread
7#11.bright.common
```
This tells the engine:
- Chord flavor (`min11`, `maj9`, `7#11`)
- Emotional mood (`lush`, `dark`, `bright`)
- Voicing constraints (`smooth`, `spread`, `common`)
The engine picks exact voicings that satisfy constraints.
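One plausible way to parse a descriptor like `min11.lush.smooth` is to split on dots and classify tokens against known vocabularies; the `ChordSpec` shape and token sets below are illustrative assumptions, not a fixed grammar:

```python
# Sketch: parsing 'quality.mood.constraint' descriptors into a structured spec.
# Token vocabularies are assumptions for illustration; the real grammar is TBD.
from dataclasses import dataclass, field

QUALITIES   = {"maj7", "maj9", "min11", "7#11", "sus4"}
MOODS       = {"lush", "dark", "bright", "empty", "rich"}
CONSTRAINTS = {"smooth", "spread", "common", "tight", "contrary"}

@dataclass
class ChordSpec:
    quality: str
    moods: list = field(default_factory=list)
    constraints: list = field(default_factory=list)

def parse_descriptor(text: str) -> ChordSpec:
    head, *rest = text.split(".")
    spec = ChordSpec(quality=head)
    for tok in rest:
        if tok in MOODS:
            spec.moods.append(tok)
        elif tok in CONSTRAINTS:
            spec.constraints.append(tok)
        else:
            raise ValueError(f"unknown trait: {tok}")
    return spec

print(parse_descriptor("min11.lush.smooth"))
# ChordSpec(quality='min11', moods=['lush'], constraints=['smooth'])
```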
This is the harmonic equivalent of RiffStack's:
```
pluck |> reverb |> lowpass 1200
```
Both are declarative pipelines describing intent.
Principle: The DSL knows how voices want to move.
Constraints:
- `smooth` - minimize motion (≤ whole step per voice)
- `common` - maximize common tones
- `contrary` - voices oppose bass motion
- `spread` - wide voicings
- `tight` - close voicings
- `rotated` - rotate highest voice to bottom
Why this is powerful:
- Composer focuses on intent
- DSL solves the voice-leading SAT problem
- Mirrors RiffStack philosophy: "Describe intent, compiler figures out execution"
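As a rough sketch of what "solving" could mean, the `smooth` constraint can be approximated by brute-force search over octave placements, minimizing total semitone motion. This is a heuristic stand-in for the SAT formulation, and the helper names and pitch ranges are hypothetical:

```python
# Sketch: greedy 'smooth' voice-leading by minimal total semitone motion.
# A real solver would add range, crossing, and 'common'/'contrary' constraints.
from itertools import product

def voicings(pitch_classes, low=48, high=84):
    """All assignments of one octave-placed note per pitch class."""
    options = [[p for p in range(low, high) if p % 12 == pc]
               for pc in pitch_classes]
    return product(*options)

def smooth_next(prev_voicing, next_pitch_classes):
    """Pick the next voicing minimizing summed per-voice motion."""
    def cost(v):
        return sum(abs(a - b) for a, b in zip(prev_voicing, sorted(v)))
    return sorted(min(voicings(next_pitch_classes), key=cost))

cmin = [48, 55, 58, 63]                # C, G, Bb, Eb - a Cmin7 voicing
f7 = smooth_next(cmin, [5, 9, 0, 3])   # F7 pitch classes: F, A, C, Eb
print(f7)
```

The search space here is small (a few octave choices per voice), so exhaustive enumeration is fine for single chords; longer progressions would motivate the SAT/SMT framing.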
Principle: Harmony is nothing without rhythm.
Modifiers:
```
sync(1/8)         // sync to grid
stab              // short, accented
hold              // sustain
anticipate(1/16)  // early by 1/16
roll              // arpeggiated entry
arp(up)           // ascending arpeggio
```
Example:
```
Cmin11.lush(smooth).sync(1/8)
+1.bright(anticipate).stab
T.dark(spread).hold
-3.lush(arp(up))
```
This transforms the DSL from "chord generator" to performance language.
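One way to model rhythm modifiers is as transforms on `(onset, duration)` event pairs measured in beats; the event model and the exact semantics below are assumptions for illustration:

```python
# Sketch: rhythm modifiers as transforms on (onset_beats, duration_beats).
# The event model and modifier semantics here are illustrative assumptions.

def anticipate(event, amount=1/4):   # early by 'amount' beats (1/16 note = 1/4 beat)
    onset, dur = event
    return (onset - amount, dur)

def stab(event):                     # short, accented: clip duration
    onset, dur = event
    return (onset, min(dur, 1/4))

def hold(event, until=4.0):          # sustain to a target beat
    onset, _ = event
    return (onset, until - onset)

def sync(event, grid=1/2):           # snap onset to a grid (1/8 note = 1/2 beat)
    onset, dur = event
    return (round(onset / grid) * grid, dur)

chord_event = (1.1, 2.0)             # slightly late, two beats long
print(sync(stab(chord_event)))       # -> (1.0, 0.25)
```

Since each modifier returns a new event, they compose exactly like the dot-chained syntax in the example above.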
Principle: Small pieces combine freely.
Building blocks:
- Motion primitives (`+1`, `-3`, `T`)
- Voicing traits (`smooth`, `spread`)
- Chord colors (`min11`, `maj9`, `sus4`)
- Rhythmic transforms (`sync`, `anticipate`, `arp`)
- Generative rules (`choose`, `repeat`, `invert`)
Example composition:
```
Cmin11.lush(smooth)
+1.bright(anticipate)
T.dark(spread)
-3.lush(arp(up))
```
Every element is a tiny unit you can stack.
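The generative `choose` rule from the building blocks above amounts to weighted sampling; a minimal sketch (the Python `choose` wrapper and seeding are assumptions):

```python
# Sketch: the generative 'choose' rule as weighted random sampling.
import random

def choose(options, weight, rng=random):
    """Pick one motion operator according to the given weights."""
    return rng.choices(options, weights=weight, k=1)[0]

rng = random.Random(42)  # seeded for reproducible performances
progression = [choose(["+5", "-2"], weight=[0.7, 0.3], rng=rng)
               for _ in range(4)]
print(progression)
```

Seeding the generator matters for live use: the same seed replays the same "random" progression.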
Principle: Work at every level of musical structure.
Scales:
- Chord → single harmony
- Phrase → 2-8 chord progression
- Progression → full section (verse, chorus)
- Section → song component
- Song → complete composition
Just like:
- Hydra works on multiple time scales (feedback → layer → composition)
- MLIR works on multiple IR levels (high → mid → low → machine)
Example:
```yaml
verse:
  phrase:
    - Cmin11.lush
    - +5.bright
    - -2.dark
  repeat: 2

chorus:
  phrase:
    - Fmaj9.spread
    - T.lush
    - =.bright(arp)
  repeat: 4
```

| Category | Examples | Purpose |
|---|---|---|
| Root Motion | `+1`, `-3`, `T`, `5`, `=` | Harmonic movement |
| Chord Quality | `maj7`, `min11`, `7#11`, `sus4`, `ø` | Chord flavor |
| Voice-Leading | `smooth`, `common`, `contrary`, `spread`, `tight` | Voice behavior |
| Color/Mood | `lush`, `bright`, `dark`, `empty`, `rich` | Emotional quality |
| Rhythm | `sync(8th)`, `arp(up)`, `stab`, `roll`, `hold` | Temporal feel |
| Structure | `loop 4`, `repeat 2`, `invert`, `reharmonize` | Compositional control |
| Generative | `choose([+5, -2], weight=[0.7, 0.3])` | Controlled randomness |
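The Chord → Phrase → Section → Song hierarchy maps naturally onto nested data types; a simplified sketch (the Progression level is omitted, and the class shapes are assumptions):

```python
# Sketch: the structural hierarchy as nested dataclasses.
from dataclasses import dataclass, field

@dataclass
class Chord:
    descriptor: str                  # e.g. "Cmin11.lush" or "+5.bright"

@dataclass
class Phrase:
    chords: list                     # 2-8 Chords
    repeat: int = 1

@dataclass
class Section:
    name: str                        # "verse", "chorus", ...
    phrases: list = field(default_factory=list)

@dataclass
class Song:
    sections: list = field(default_factory=list)

verse = Section("verse", [Phrase([Chord("Cmin11.lush"),
                                  Chord("+5.bright"),
                                  Chord("-2.dark")], repeat=2)])
song = Song([verse])

# Expanding repeats yields the flat chord stream handed to the next layer:
flat = [c.descriptor for p in verse.phrases for _ in range(p.repeat)
        for c in p.chords]
print(flat)
```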
RiffStack handles audio structure. Harmony DSL handles musical structure.
They slot together like this:
Harmony DSL → note streams → RiffStack synths/patches → audio
| Harmony DSL | RiffStack |
|---|---|
| "What notes and chords?" | "What timbres and effects?" |
| Movement, tension, voicing | Oscillators, filters, loops |
| Rhythm modifiers | Envelopes, loops, patterns |
| High-level musical semantics | Sample/block-level DSP |
They're two sides of the same creative language stack.
- Harmony DSL says what to play
- RiffStack says how it sounds
Both are:
- ✅ Declarative
- ✅ Composable
- ✅ Playful
- ✅ LEGO-like
MLIR makes this dual-language system feasible and high-performance.
Define ops like:
```mlir
%root = theory.root "C4"
%next = theory.move %root, motion = "+1"
%chord = theory.color %next, quality = "min11", mood = "lush"
%voiced = theory.voicelead %chord, policy = "smooth"
%pattern = theory.rhythm %voiced, mode = "sync(1/8)"
```

This is a semantic graph of musical intent.
```mlir
%osc = audio.sine %pitch
%env = audio.adsr %osc
%fx = audio.reverb %env, time = 0.3
%out = audio.mix %fx
```

```
┌─────────────────────────────────────┐
│  Harmony IR (musical intent)        │
│  theory.move, theory.voicelead      │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│  Pitch/Note IR (generated notes)    │
│  note.pitch, note.duration          │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│  Audio IR (synthesis ops)           │
│  audio.sine, audio.reverb           │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│  DSP IR (vector ops)                │
│  vector.fma, vector.shuffle         │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│  LLVM / SPIR-V (machine code)       │
│  CPU/GPU target code                │
└─────────────────┬───────────────────┘
                  │
                  ↓
┌─────────────────────────────────────┐
│  Real-time audio runtime            │
└─────────────────────────────────────┘
```
This is EXACTLY what MLIR is built for:
- ✅ Multi-level
- ✅ Cross-device
- ✅ Declarative
- ✅ Multi-dialect
- ✅ Pass-driven lowering
You get a full creative compiler.
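Conceptually, each lowering pass is a function from one IR level to the next, and the pipeline is their composition; the toy IR dicts and pass names below are illustrative, not MLIR's actual API:

```python
# Sketch: the lowering pipeline as composed passes over toy IR dicts.
# Pass names mirror the diagram; the IR shapes are illustrative.
from functools import reduce

def lower_harmony_to_pitch(ir):
    return {"level": "pitch", "notes": ir["chords"]}     # voicing solved here

def lower_pitch_to_audio(ir):
    return {"level": "audio", "ops": [("sine", n) for n in ir["notes"]]}

def lower_audio_to_dsp(ir):
    return {"level": "dsp", "kernels": [("fma_block", op) for op in ir["ops"]]}

PIPELINE = [lower_harmony_to_pitch, lower_pitch_to_audio, lower_audio_to_dsp]

def run_pipeline(harmony_ir):
    return reduce(lambda ir, pass_fn: pass_fn(ir), PIPELINE, harmony_ir)

out = run_pipeline({"level": "harmony", "chords": ["Cmin11", "F7"]})
print(out["level"])  # dsp
```

The point of the pass structure is that each level can be inspected, optimized, or swapped independently, which is exactly what MLIR's dialect/pass machinery provides for real.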
No other tool allows deep harmony exploration with composable primitives.
Existing tools:
- Ableton - great for loops, weak for harmony
- Max/MSP - visual patching, not harmonic language
- SuperCollider - powerful synthesis, clunky harmony
- Hooktheory - educational, not performative
- This DSL - a playful, composable, performative harmony language
RiffStack handles the sound while Harmony DSL handles the music.
Together they form a complete creative stack.
MLIR makes it run everywhere:
- CPU (native, fast)
- GPU (parallel, massive)
- WebGPU (browser, accessible)
- Mobile (iOS/Android)
One language, multiple backends.
This combination gives you:
| Feature | Source |
|---|---|
| Songwriting power | Ableton |
| Playfulness | Hydra |
| Synthesis depth | SuperCollider |
| Performance | LLVM-level compilation |
| Composability | Max/MSP |
| Plus: Readable, shareable, fun musical language | Unique |
```yaml
verse:
  start: Cmin11.lush(smooth)
  progression:
    - +5.bright(anticipate)
    - -2.dark(spread)
    - T.lush(arp(up))
  repeat: 2

chorus:
  start: Fmaj9.spread(common)
  progression:
    - +1.bright(stab)
    - =.rich(hold)
    - -3.lush(sync(1/8))
  repeat: 4
```

```yaml
instruments:
  - id: pad
    type: synth
    expr: "sine $pitch 0.6 chorus 2.0 0.5 reverb 0.8 play"
  - id: bass
    type: synth
    expr: "sine $pitch 0.8 lowpass 400 0.7 play"

tracks:
  - instrument: pad
    source: harmony.verse
  - instrument: bass
    source: harmony.verse.bass_notes
```

```mlir
// Harmony layer
%root = theory.root "C4"
%chord1 = theory.color %root, quality = "min11", mood = "lush"
%voiced1 = theory.voicelead %chord1, policy = "smooth"

// Pitch layer
%pitches = note.extract %voiced1
%bass = note.bass %pitches
%pad = note.all %pitches

// Audio layer
%osc_bass = audio.sine %bass, amp = 0.8
%filt_bass = audio.lowpass %osc_bass, cutoff = 400
%osc_pad = audio.sine %pad, amp = 0.6
%fx_pad = audio.reverb %osc_pad, time = 0.8
%mix = audio.mix [%filt_bass, %fx_pad]
audio.out %mix
```

Press play → hear actual music with:
- Smooth voice-leading
- Rich pad textures
- Tight bass lines
- CPU/GPU optimized DSP
Deliverables:
- Motion operator syntax (`+1`, `-3`, `T`)
- Chord quality macros (`min11`, `maj9`)
min11,maj9) - Basic voice-leading solver (smooth, common)
- YAML parser for progressions
- MIDI output (proof-of-concept)
Outcome: Can write progressions, hear MIDI playback
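For the Phase 1 MIDI proof-of-concept, pitch names must map to MIDI note numbers; a minimal sketch (the `C4`/`Eb3` name format and the `to_midi` helper are assumptions):

```python
# Sketch: pitch name -> MIDI note number for the MIDI proof-of-concept.
# Assumes names like 'C4', 'Eb3', 'F#5'; MIDI 60 = C4 (middle C).
import re

_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def to_midi(name: str) -> int:
    m = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if not m:
        raise ValueError(f"bad pitch name: {name}")
    letter, accidental, octave = m.groups()
    pc = _PC[letter] + {"#": 1, "b": -1, "": 0}[accidental]
    return 12 * (int(octave) + 1) + pc

print([to_midi(n) for n in ("C4", "Eb3", "F#5")])  # [60, 51, 78]
```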
Deliverables:
- Harmony → RiffStack bridge
- Parameter binding (`$pitch` from harmony)
- Multi-track rendering
- Example compositions (verse/chorus/bridge)
Outcome: Full audio synthesis from harmony DSL
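The Phase 2 `$pitch` binding can be sketched as template substitution into a RiffStack expression string; the substitution mechanism shown is an assumption, not RiffStack's defined API:

```python
# Sketch: binding a harmony-layer pitch into a RiffStack instrument expression.
# '$pitch' placeholder substitution is an assumed mechanism, not a fixed API.
from string import Template

pad_expr = Template("sine $pitch 0.6 chorus 2.0 0.5 reverb 0.8 play")

def bind(expr: Template, pitch_hz: float) -> str:
    return expr.substitute(pitch=f"{pitch_hz:.2f}")

print(bind(pad_expr, 261.63))  # sine 261.63 0.6 chorus 2.0 0.5 reverb 0.8 play
```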
Deliverables:
- Define `theory` MLIR dialect
- Lowering passes (harmony → pitch → audio)
- LLVM backend targeting
- Performance optimization
Outcome: Compiled, optimized audio engine
Deliverables:
- Rhythm modifiers (arp, stab, roll)
- Generative rules (choose, randomize)
- Alternative tunings (microtonal, just intonation)
- Live performance mode (MIDI control, looping)
Outcome: Production-ready creative tool
- David Cope - EMI algorithmic composition
- William Schottstaedt - Common Music
- Miller Puckette - Max/MSP, Pure Data
- Ableton - Live looping, clip launching
- Hooktheory - Theory-aware composition
- Captain Plugins - Chord progression tools
- TidalCycles - Pattern-based livecoding
- Sonic Pi - Educational music language
- SuperCollider - Synthesis language
- No tool combines: harmony theory + audio synthesis + compiler optimization
- No language is: composable, playful, AND harmonically intelligent
- No system targets: multiple backends (CPU/GPU/WebGPU) with one DSL
- Syntax design - Finalize harmony DSL grammar
- Voice-leading solver - Implement constraint satisfaction
- MLIR dialect - Define `theory` ops and types
- Example compositions - Demonstrate expressiveness
- Should root motion be relative or absolute?
- How to handle alternative tunings (12-TET vs microtonal)?
- YAML vs custom syntax vs Python DSL?
- Constraint solver: SAT/SMT vs heuristic?
- Can we voice-lead ii-V-I smoothly?
- Can we generate jazz reharmonization?
- Can we compile to real-time audio at <10ms latency?
- Can we run on WebGPU in browser?
Technical:
- Harmony DSL → MIDI in <100ms
- Voice-leading quality matches human arrangers
- Compiled audio latency <10ms
- Runs on CPU/GPU/WebGPU
Creative:
- Musicians create full compositions in DSL
- Livecoding performances using harmony language
- Educational adoption (music theory + programming)
Business:
- 1000+ users experiment with DSL
- 100+ paying users (SaaS/education tier)
- Integration requests from DAW vendors
- RiffStack README - Audio synthesis system
- RiffStack Architecture Patterns - Design philosophy
- MLIR Architecture - Compiler design (to be created)
- Execution Brief - Business analysis
Last Updated: 2025-11-23
Status: Vision document
Next: Syntax specification + voice-leading solver prototype