project: riffstack
type: design-doc
status: vision
date: 2025-11-23
author: Scott Senften
keywords:
  - harmony-dsl
  - music-theory
  - voice-leading
  - mlir
  - compiler-design
beth_topics:
  - riffstack-harmony
  - music-dsl
  - mlir-audio

🎼 Harmony DSL Vision: Musical Intent as Code

A composable, declarative language for harmonic movement, voice-leading, and musical structure


Executive Summary

This document defines the vision for a Harmony DSL that integrates with RiffStack and MLIR to create a complete creative compiler stack. Instead of writing chord symbols, musicians describe harmonic motion, color, and voice-leading constraints in a playful, composable language.

The Big Idea: Harmony is movement, not labels. The DSL captures intent; the compiler solves the voice-leading SAT problem.

Integration:

  • Harmony DSL → musical structure (what notes, what chords)
  • RiffStack → audio structure (what timbres, what effects)
  • MLIR → multi-level compilation (harmony IR → pitch IR → audio IR → DSP IR → machine code)

This creates a layered ecosystem unlike anything available today.


I. Core Principles

A great harmony DSL balances musical intuition, expressive power, and algorithmic clarity.

1. Root-Motion–First Thinking (Movement > Labels)

Principle: Harmony is about motion, not symbols.

Instead of:

Cmin11 → F7 → Bbmaj9

Write what actually matters:

start Cmin11.lush
+5
+5

Why this matters:

  • Captures the feeling of harmony (tension, release, gravity)
  • Generalizes across keys, modes, tuning systems
  • Focuses on musical intent, not notation

Motion Operators:

  • +N / -N - semitone motion
  • T - tritone substitution
  • 5 - fifth up
  • = - stay on root (reharmonize)
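These operators read naturally as pure functions on a root pitch. A minimal Python sketch, assuming MIDI note numbers and a hypothetical apply_motion helper (the operator set is the one listed above):

```python
# Hypothetical sketch: motion operators as transformations on a MIDI root.
# "T" is tritone substitution (+6 semitones), "=" keeps the root, "5" is
# a fifth up, and "+N"/"-N" are literal semitone motion.

def apply_motion(root: int, op: str) -> int:
    """Return the new root (MIDI number) after one motion operator."""
    if op == "T":    # tritone substitution
        return root + 6
    if op == "=":    # stay on root (reharmonize)
        return root
    if op == "5":    # fifth up
        return root + 7
    if op.startswith(("+", "-")):
        return root + int(op)  # semitone motion, e.g. "+5", "-2"
    raise ValueError(f"unknown motion operator: {op}")

# ii-V-I style root motion from C (MIDI 60): C -> F -> Bb
roots = [60]
for op in ["+5", "+5"]:
    roots.append(apply_motion(roots[-1], op))
```

Because the program never names F or Bb, the same three lines of DSL transpose to any key for free.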

2. Declarative Chord Color + Voicing Rules

Principle: Describe what you want, not how to construct it.

min11.lush.smooth
maj9.dark.spread
7#11.bright.common

This tells the engine:

  • Chord flavor (min11, maj9, 7#11)
  • Emotional mood (lush, dark, bright)
  • Voicing constraints (smooth, spread, common)

The engine picks exact voicings that satisfy constraints.
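A spec like min11.lush.smooth splits cleanly into those three layers. A hedged Python sketch of the parse step — the trait vocabularies below are illustrative assumptions, not a finalized grammar:

```python
# Illustrative trait vocabularies (assumed, not final): the first dotted
# segment is the chord quality; remaining segments are classified as
# moods or voicing constraints.

MOODS = {"lush", "dark", "bright", "empty", "rich"}
CONSTRAINTS = {"smooth", "common", "contrary", "spread", "tight", "rotated"}

def parse_chord_spec(spec: str) -> dict:
    quality, *traits = spec.split(".")
    return {
        "quality": quality,
        "mood": [t for t in traits if t in MOODS],
        "constraints": [t for t in traits if t in CONSTRAINTS],
    }

parse_chord_spec("min11.lush.smooth")
# {'quality': 'min11', 'mood': ['lush'], 'constraints': ['smooth']}
```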

This is the harmonic equivalent of RiffStack's:

pluck |> reverb |> lowpass 1200

Both are declarative pipelines describing intent.


3. Automatic Voice-Leading Constraints (Built-in Intelligence)

Principle: The DSL knows how voices want to move.

Constraints:

  • smooth - minimize motion (≤ whole step per voice)
  • common - maximize common tones
  • contrary - voices oppose bass motion
  • spread - wide voicings
  • tight - close voicings
  • rotated - rotate highest voice to bottom

Why this is powerful:

  • Composer focuses on intent
  • DSL solves the voice-leading SAT problem
  • Mirrors RiffStack philosophy: "Describe intent, compiler figures out execution"
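The smooth policy, for example, can be sketched as a greedy search: among candidate voicings for the next chord, pick the one minimizing total semitone motion per voice. This is a simplification — a real solver would encode the other constraints (common tones, contrary motion) as weighted terms or hard clauses:

```python
# Sketch of the "smooth" voice-leading policy as a greedy minimum-motion
# search over candidate voicings (lists of MIDI numbers, sorted low-to-high).

def motion_cost(prev: list[int], cand: list[int]) -> int:
    """Total semitone motion between two voicings, voice by voice."""
    return sum(abs(a - b) for a, b in zip(sorted(prev), sorted(cand)))

def smooth_voicelead(prev: list[int], candidates: list[list[int]]) -> list[int]:
    return min(candidates, key=lambda c: motion_cost(prev, c))

prev = [60, 63, 67, 70]        # Cmin7, close voicing
candidates = [
    [65, 69, 72, 75],          # F7 root position, up high: lots of motion
    [60, 63, 65, 69],          # F7 inversion keeping tones nearby
]
best = smooth_voicelead(prev, candidates)
```

Here the solver prefers the inversion: two voices hold as common tones and the others move by step.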

4. Rhythm + Motion Modifiers

Principle: Harmony is nothing without rhythm.

Modifiers:

sync(1/8)      // sync to grid
stab           // short, accented
hold           // sustain
anticipate(1/16) // early by 1/16
roll           // arpeggiated entry
arp(up)        // ascending arpeggio

Example:

Cmin11.lush(smooth).sync(1/8)
+1.bright(anticipate).stab
T.dark(spread).hold
-3.lush(arp(up))

This transforms the DSL from "chord generator" to performance language.
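The timing modifiers above reduce to small transforms on onset times. A hedged sketch, assuming onsets are measured in quarter-note beats and only the two time-shifting modifiers are handled (stab/hold/roll would adjust duration and articulation instead):

```python
# Illustrative fraction table: note values expressed in quarter-note beats.
FRACTION = {"1/8": 0.5, "1/16": 0.25}

def apply_rhythm(onset: float, mods: list[str]) -> float:
    """Apply sync(...) and anticipate(...) modifiers to an onset time."""
    t = onset
    for m in mods:
        if m.startswith("sync("):
            grid = FRACTION[m[5:-1]]
            t = round(t / grid) * grid   # quantize to the grid
        elif m.startswith("anticipate("):
            t -= FRACTION[m[11:-1]]      # pull the onset earlier
    return t

apply_rhythm(1.3, ["sync(1/8)"])          # snaps a loose onset to the grid
apply_rhythm(2.0, ["anticipate(1/16)"])   # pushes the chord ahead of the beat
```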


5. Composable Mini-Languages (LEGO-Style Design)

Principle: Small pieces combine freely.

Building blocks:

  • Motion primitives (+1, -3, T)
  • Voicing traits (smooth, spread)
  • Chord colors (min11, maj9, sus4)
  • Rhythmic transforms (sync, anticipate, arp)
  • Generative rules (choose, repeat, invert)

Example composition:

Cmin11.lush(smooth)
+1.bright(anticipate)
T.dark(spread)
-3.lush(arp(up))

Every element is a tiny unit you can stack.


6. Multi-Scale: From Chords → Phrases → Sections → Songs

Principle: Work at every level of musical structure.

Scales:

Chord       → single harmony
Phrase      → 2-8 chords
Progression → full section (verse, chorus)
Section     → song component
Song        → complete composition

Just like:

  • Hydra works on multiple time scales (feedback → layer → composition)
  • MLIR works on multiple IR levels (high → mid → low → machine)

Example:

verse:
  phrase:
    - Cmin11.lush
    - +5.bright
    - -2.dark
  repeat: 2

chorus:
  phrase:
    - Fmaj9.spread
    - T.lush
    - =.bright(arp)
  repeat: 4
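Expanding a section like the one above into a flat event stream is a one-liner once the structure is parsed. A minimal sketch, assuming a hypothetical expand_section helper over the parsed YAML:

```python
# Sketch: flatten a {phrase: [...], repeat: n} section into a flat
# list of chord-spec events, ready for the lower layers to consume.

def expand_section(section: dict) -> list[str]:
    return section["phrase"] * section.get("repeat", 1)

verse = {"phrase": ["Cmin11.lush", "+5.bright", "-2.dark"], "repeat": 2}
events = expand_section(verse)   # the 3-chord phrase, played twice
```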

II. Key Features Specification

Essential Operators

| Category | Examples | Purpose |
| --- | --- | --- |
| Root Motion | +1, -3, T, 5, = | Harmonic movement |
| Chord Quality | maj7, min11, 7#11, sus4, ø | Chord flavor |
| Voice-Leading | smooth, common, contrary, spread, tight | Voice behavior |
| Color/Mood | lush, bright, dark, empty, rich | Emotional quality |
| Rhythm | sync(1/8), arp(up), stab, roll, hold | Temporal feel |
| Structure | loop 4, repeat 2, invert, reharmonize | Compositional control |
| Generative | choose([+5, -2], weight=[0.7, 0.3]) | Controlled randomness |
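The generative row is just weighted random selection over motion options. A sketch using the standard library, with the same illustrative 70/30 split as the table:

```python
# Sketch of the generative "choose" operator: weighted random selection
# among motion primitives. random.choices is Python stdlib.
import random

def choose(options: list[str], weight: list[float]) -> str:
    return random.choices(options, weights=weight, k=1)[0]

next_motion = choose(["+5", "-2"], weight=[0.7, 0.3])
```

Seeding the generator makes a "random" progression reproducible, which matters for sharing and live performance.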

III. Integration with RiffStack

RiffStack handles audio structure. Harmony DSL handles musical structure.

They slot together like this:

Harmony DSL → note streams → RiffStack synths/patches → audio

The Connection

| Harmony DSL | RiffStack |
| --- | --- |
| "What notes and chords?" | "What timbres and effects?" |
| Movement, tension, voicing | Oscillators, filters, loops |
| Rhythm modifiers | Envelopes, loops, patterns |
| High-level musical semantics | Sample/block-level DSP |

They're two sides of the same creative language stack.

  • Harmony DSL says what to play
  • RiffStack says how it sounds

Both are:

  • ✅ Declarative
  • ✅ Composable
  • ✅ Playful
  • ✅ LEGO-like

IV. MLIR Architecture: The Unified Creative Compiler

MLIR makes this dual-language system feasible and high-performance.

1. Harmony DSL as MLIR Dialect

Define ops like:

%root = theory.root "C4"
%next = theory.move %root, motion = "+1"
%chord = theory.color %next, quality = "min11", mood = "lush"
%voiced = theory.voicelead %chord, policy = "smooth"
%pattern = theory.rhythm %voiced, mode = "sync(1/8)"

This is a semantic graph of musical intent.


2. RiffStack as Audio Dialect

%osc = audio.sine %pitch
%env = audio.adsr %osc
%fx = audio.reverb %env, time = 0.3
%out = audio.mix %fx

3. Lowering Pipeline: Multi-Level Compilation

┌─────────────────────────────────────┐
│ Harmony IR (musical intent)         │
│   theory.move, theory.voicelead     │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ Pitch/Note IR (generated notes)     │
│   note.pitch, note.duration         │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ Audio IR (synthesis ops)            │
│   audio.sine, audio.reverb          │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ DSP IR (vector ops)                 │
│   vector.fma, vector.shuffle        │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ LLVM / SPIR-V (machine code)        │
│   CPU/GPU target code               │
└─────────────────┬───────────────────┘
                  │
                  ↓
┌─────────────────────────────────────┐
│ Real-time audio runtime             │
└─────────────────────────────────────┘

This is EXACTLY what MLIR is built for:

  • ✅ Multi-level
  • ✅ Cross-device
  • ✅ Declarative
  • ✅ Multi-dialect
  • ✅ Pass-driven lowering

You get a full creative compiler.
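Conceptually, each lowering pass is a small rewrite from one event vocabulary to the next. A hedged Python sketch of the first hop (Harmony IR → Pitch/Note IR), with illustrative interval tables standing in for the real theory engine:

```python
# Illustrative quality tables (semitone offsets above the root); a real
# lowering pass would also consult mood and voicing constraints.

QUALITY = {
    "min11": [0, 3, 7, 10, 14, 17],   # 1 b3 5 b7 9 11
    "maj9":  [0, 4, 7, 11, 14],       # 1 3 5 7 9
}

def lower_to_notes(root: int, quality: str, duration: float) -> list[dict]:
    """Lower one harmony-level event to note-level events."""
    return [{"pitch": root + i, "duration": duration} for i in QUALITY[quality]]

notes = lower_to_notes(60, "maj9", 2.0)   # Cmaj9, two beats
```

The next passes would map these note events to audio ops (oscillator per voice), then to vectorized DSP, exactly as the diagram shows.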


V. Why This Is Unique and Worth Building

🎶 Gives Musicians a Playground for Harmony

No other tool allows deep harmony exploration with composable primitives.

Existing tools:

  • Ableton - great for loops, weak for harmony
  • Max/MSP - visual patching, not harmonic language
  • SuperCollider - powerful synthesis, clunky harmony
  • Hooktheory - educational, not performative

This DSL - playful, composable, performative harmony language.


🎛️ Gives Sound Designers a Patch Language That Slots In

RiffStack handles the sound while Harmony DSL handles the music.

Together they form a complete creative stack.


🧮 Gives Developers a Portable, Multi-Device IR

MLIR makes it run everywhere:

  • CPU (native, fast)
  • GPU (parallel, massive)
  • WebGPU (browser, accessible)
  • Mobile (iOS/Android)

One language, multiple backends.


🔥 Creates a Musical System Like Nothing Else Today

This combination gives you:

| Feature | Source |
| --- | --- |
| Songwriting power | Ableton |
| Playfulness | Hydra |
| Synthesis depth | SuperCollider |
| Performance | LLVM-level compilation |
| Composability | Max/MSP |
| Plus: readable, shareable, fun musical language | Unique |

VI. Example: Full Stack in Action

Harmony DSL (Musical Intent)

verse:
  start: Cmin11.lush(smooth)
  progression:
    - +5.bright(anticipate)
    - -2.dark(spread)
    - T.lush(arp(up))
  repeat: 2

chorus:
  start: Fmaj9.spread(common)
  progression:
    - +1.bright(stab)
    - =.rich(hold)
    - -3.lush(sync(1/8))
  repeat: 4

RiffStack Patch (Audio Structure)

instruments:
  - id: pad
    type: synth
    expr: "sine $pitch 0.6 chorus 2.0 0.5 reverb 0.8 play"

  - id: bass
    type: synth
    expr: "sine $pitch 0.8 lowpass 400 0.7 play"

tracks:
  - instrument: pad
    source: harmony.verse

  - instrument: bass
    source: harmony.verse.bass_notes

Generated MLIR (Compiler IR)

// Harmony layer
%root = theory.root "C4"
%chord1 = theory.color %root, quality = "min11", mood = "lush"
%voiced1 = theory.voicelead %chord1, policy = "smooth"

// Pitch layer
%pitches = note.extract %voiced1
%bass = note.bass %pitches
%pad = note.all %pitches

// Audio layer
%osc_bass = audio.sine %bass, amp = 0.8
%filt_bass = audio.lowpass %osc_bass, cutoff = 400

%osc_pad = audio.sine %pad, amp = 0.6
%fx_pad = audio.reverb %osc_pad, time = 0.8

%mix = audio.mix [%filt_bass, %fx_pad]
audio.out %mix

Runtime (Real Audio)

Press play → hear actual music with:

  • Smooth voice-leading
  • Rich pad textures
  • Tight bass lines
  • CPU/GPU optimized DSP

VII. Roadmap

Phase 1: Harmony DSL Prototype (Months 1-3)

Deliverables:

  • Motion operator syntax (+1, -3, T)
  • Chord quality macros (min11, maj9)
  • Basic voice-leading solver (smooth, common)
  • YAML parser for progressions
  • MIDI output (proof-of-concept)

Outcome: Can write progressions, hear MIDI playback


Phase 2: RiffStack Integration (Months 3-6)

Deliverables:

  • Harmony → RiffStack bridge
  • Parameter binding ($pitch from harmony)
  • Multi-track rendering
  • Example compositions (verse/chorus/bridge)

Outcome: Full audio synthesis from harmony DSL
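At its simplest, the parameter binding deliverable is textual substitution of $name placeholders into the patch expression. A minimal sketch with a hypothetical bind_params helper (a real bridge would bind per-note and per-voice, not once per patch):

```python
# Hypothetical helper: substitute $name placeholders in a RiffStack-style
# patch expression with values coming from the harmony layer.

def bind_params(expr: str, env: dict) -> str:
    for name, value in env.items():
        expr = expr.replace(f"${name}", str(value))
    return expr

patch = bind_params("sine $pitch 0.8 lowpass 400 0.7 play", {"pitch": 48})
# "sine 48 0.8 lowpass 400 0.7 play"
```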


Phase 3: MLIR Compilation (Months 6-12)

Deliverables:

  • Define theory MLIR dialect
  • Lowering passes (harmony → pitch → audio)
  • LLVM backend targeting
  • Performance optimization

Outcome: Compiled, optimized audio engine


Phase 4: Advanced Features (Months 12-18)

Deliverables:

  • Rhythm modifiers (arp, stab, roll)
  • Generative rules (choose, randomize)
  • Alternative tunings (microtonal, just intonation)
  • Live performance mode (MIDI control, looping)

Outcome: Production-ready creative tool


VIII. Related Work & Inspiration

Academic

  • David Cope - EMI algorithmic composition
  • William Schottstaedt - Common Music
  • Miller Puckette - Max/MSP, Pure Data

Commercial

  • Ableton - Live looping, clip launching
  • Hooktheory - Theory-aware composition
  • Captain Plugins - Chord progression tools

Open Source

  • TidalCycles - Pattern-based livecoding
  • Sonic Pi - Educational music language
  • SuperCollider - Synthesis language

What's Missing (Our Opportunity)

  • No tool combines: harmony theory + audio synthesis + compiler optimization
  • No language is: composable, playful, AND harmonically intelligent
  • No system targets: multiple backends (CPU/GPU/WebGPU) with one DSL

IX. Next Steps

Immediate Actions

  1. Syntax design - Finalize harmony DSL grammar
  2. Voice-leading solver - Implement constraint satisfaction
  3. MLIR dialect - Define theory ops and types
  4. Example compositions - Demonstrate expressiveness

Questions to Answer

  • Should root motion be relative or absolute?
  • How to handle alternative tunings (12-TET vs microtonal)?
  • YAML vs custom syntax vs Python DSL?
  • Constraint solver: SAT/SMT vs heuristic?

Validation Experiments

  • Can we voice-lead ii-V-I smoothly?
  • Can we generate jazz reharmonization?
  • Can we compile to real-time audio at <10ms latency?
  • Can we run on WebGPU in browser?

X. Success Metrics

Technical:

  • Harmony DSL → MIDI in <100ms
  • Voice-leading quality matches human arrangers
  • Compiled audio latency <10ms
  • Runs on CPU/GPU/WebGPU

Creative:

  • Musicians create full compositions in DSL
  • Livecoding performances using harmony language
  • Educational adoption (music theory + programming)

Business:

  • 1000+ users experiment with DSL
  • 100+ paying users (SaaS/education tier)
  • Integration requests from DAW vendors


Last Updated: 2025-11-23 · Status: Vision document · Next: Syntax specification + voice-leading solver prototype