project: riffstack
type: design-doc
status: vision
date: 2025-11-23
author: Scott Senften
keywords:
  - harmony-dsl
  - music-theory
  - voice-leading
  - mlir
  - compiler-design
beth_topics:
  - riffstack-harmony
  - music-dsl
  - mlir-audio

🎼 Harmony DSL Vision: Musical Intent as Code

A composable, declarative language for harmonic movement, voice-leading, and musical structure


Executive Summary

This document defines the vision for a Harmony DSL that integrates with RiffStack and MLIR to create a complete creative compiler stack. Instead of writing chord symbols, musicians describe harmonic motion, color, and voice-leading constraints in a playful, composable language.

The Big Idea: Harmony is movement, not labels. The DSL captures intent; the compiler solves the voice-leading SAT problem.

Integration:

  • Harmony DSL → musical structure (what notes, what chords)
  • RiffStack → audio structure (what timbres, what effects)
  • MLIR → multi-level compilation (harmony IR → pitch IR → audio IR → DSP IR → machine code)

This creates a layered ecosystem unlike anything available today.


I. Core Principles

A great harmony DSL balances musical intuition, expressive power, and algorithmic clarity.

1. Root-Motion–First Thinking (Movement > Labels)

Principle: Harmony is about motion, not symbols.

Instead of:

Cmin11 → F7 → Bbmaj9

Write what actually matters:

start Cmin11.lush
+5
+5

Why this matters:

  • Captures the feeling of harmony (tension, release, gravity)
  • Generalizes across keys, modes, tuning systems
  • Focuses on musical intent, not notation

Motion Operators:

  • +N / -N - semitone motion
  • T - tritone substitution
  • 5 - fifth up
  • = - stay on root (reharmonize)
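These operators read naturally as pure functions on a root pitch. A minimal Python sketch, assuming MIDI note numbers and a hypothetical apply_motion helper (the operator set is the one listed above):

```python
# Hypothetical sketch: motion operators as transformations on a MIDI root.
# "T" is tritone substitution (+6 semitones), "=" keeps the root, "5" is
# a fifth up, and "+N"/"-N" are literal semitone motion.

def apply_motion(root: int, op: str) -> int:
    """Return the new root (MIDI number) after one motion operator."""
    if op == "T":    # tritone substitution
        return root + 6
    if op == "=":    # stay on root (reharmonize)
        return root
    if op == "5":    # fifth up
        return root + 7
    if op.startswith(("+", "-")):
        return root + int(op)  # semitone motion, e.g. "+5", "-2"
    raise ValueError(f"unknown motion operator: {op}")

# ii-V-I style root motion from C (MIDI 60): C -> F -> Bb
roots = [60]
for op in ["+5", "+5"]:
    roots.append(apply_motion(roots[-1], op))
```

Because the program never names F or Bb, the same three lines of DSL transpose to any key for free.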

2. Declarative Chord Color + Voicing Rules

Principle: Describe what you want, not how to construct it.

min11.lush.smooth
maj9.dark.spread
7#11.bright.common

This tells the engine:

  • Chord flavor (min11, maj9, 7#11)
  • Emotional mood (lush, dark, bright)
  • Voicing constraints (smooth, spread, common)

The engine picks exact voicings that satisfy constraints.
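A spec like min11.lush.smooth splits cleanly into those three layers. A hedged Python sketch of the parse step — the trait vocabularies below are illustrative assumptions, not a finalized grammar:

```python
# Illustrative trait vocabularies (assumed, not final): the first dotted
# segment is the chord quality; remaining segments are classified as
# moods or voicing constraints.

MOODS = {"lush", "dark", "bright", "empty", "rich"}
CONSTRAINTS = {"smooth", "common", "contrary", "spread", "tight", "rotated"}

def parse_chord_spec(spec: str) -> dict:
    quality, *traits = spec.split(".")
    return {
        "quality": quality,
        "mood": [t for t in traits if t in MOODS],
        "constraints": [t for t in traits if t in CONSTRAINTS],
    }

parse_chord_spec("min11.lush.smooth")
# {'quality': 'min11', 'mood': ['lush'], 'constraints': ['smooth']}
```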

This is the harmonic equivalent of RiffStack's:

pluck |> reverb |> lowpass 1200

Both are declarative pipelines describing intent.


3. Automatic Voice-Leading Constraints (Built-in Intelligence)

Principle: The DSL knows how voices want to move.

Constraints:

  • smooth - minimize motion (≤ whole step per voice)
  • common - maximize common tones
  • contrary - voices oppose bass motion
  • spread - wide voicings
  • tight - close voicings
  • rotated - rotate highest voice to bottom

Why this is powerful:

  • Composer focuses on intent
  • DSL solves the voice-leading SAT problem
  • Mirrors RiffStack philosophy: "Describe intent, compiler figures out execution"
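The smooth policy, for example, can be sketched as a greedy search: among candidate voicings for the next chord, pick the one minimizing total semitone motion per voice. This is a simplification — a real solver would encode the other constraints (common tones, contrary motion) as weighted terms or hard clauses:

```python
# Sketch of the "smooth" voice-leading policy as a greedy minimum-motion
# search over candidate voicings (lists of MIDI numbers, sorted low-to-high).

def motion_cost(prev: list[int], cand: list[int]) -> int:
    """Total semitone motion between two voicings, voice by voice."""
    return sum(abs(a - b) for a, b in zip(sorted(prev), sorted(cand)))

def smooth_voicelead(prev: list[int], candidates: list[list[int]]) -> list[int]:
    return min(candidates, key=lambda c: motion_cost(prev, c))

prev = [60, 63, 67, 70]        # Cmin7, close voicing
candidates = [
    [65, 69, 72, 75],          # F7 root position, up high: lots of motion
    [60, 63, 65, 69],          # F7 inversion keeping tones nearby
]
best = smooth_voicelead(prev, candidates)
```

Here the solver prefers the inversion: two voices hold as common tones and the others move by step.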

4. Rhythm + Motion Modifiers

Principle: Harmony is nothing without rhythm.

Modifiers:

sync(1/8)      // sync to grid
stab           // short, accented
hold           // sustain
anticipate(1/16) // early by 1/16
roll           // arpeggiated entry
arp(up)        // ascending arpeggio

Example:

Cmin11.lush(smooth).sync(1/8)
+1.bright(anticipate).stab
T.dark(spread).hold
-3.lush(arp(up))

This transforms the DSL from "chord generator" to performance language.
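The timing modifiers above reduce to small transforms on onset times. A hedged sketch, assuming onsets are measured in quarter-note beats and only the two time-shifting modifiers are handled (stab/hold/roll would adjust duration and articulation instead):

```python
# Illustrative fraction table: note values expressed in quarter-note beats.
FRACTION = {"1/8": 0.5, "1/16": 0.25}

def apply_rhythm(onset: float, mods: list[str]) -> float:
    """Apply sync(...) and anticipate(...) modifiers to an onset time."""
    t = onset
    for m in mods:
        if m.startswith("sync("):
            grid = FRACTION[m[5:-1]]
            t = round(t / grid) * grid   # quantize to the grid
        elif m.startswith("anticipate("):
            t -= FRACTION[m[11:-1]]      # pull the onset earlier
    return t

apply_rhythm(1.3, ["sync(1/8)"])          # snaps a loose onset to the grid
apply_rhythm(2.0, ["anticipate(1/16)"])   # pushes the chord ahead of the beat
```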


5. Composable Mini-Languages (LEGO-Style Design)

Principle: Small pieces combine freely.

Building blocks:

  • Motion primitives (+1, -3, T)
  • Voicing traits (smooth, spread)
  • Chord colors (min11, maj9, sus4)
  • Rhythmic transforms (sync, anticipate, arp)
  • Generative rules (choose, repeat, invert)

Example composition:

Cmin11.lush(smooth)
+1.bright(anticipate)
T.dark(spread)
-3.lush(arp(up))

Every element is a tiny unit you can stack.


6. Multi-Scale: From Chords → Phrases → Sections → Songs

Principle: Work at every level of musical structure.

Scales:

Chord       → single harmony
Phrase      → 2-8 chords
Progression → full section (verse, chorus)
Section     → song component
Song        → complete composition

Just like:

  • Hydra works on multiple time scales (feedback → layer → composition)
  • MLIR works on multiple IR levels (high → mid → low → machine)

Example:

verse:
  phrase:
    - Cmin11.lush
    - +5.bright
    - -2.dark
  repeat: 2

chorus:
  phrase:
    - Fmaj9.spread
    - T.lush
    - =.bright(arp)
  repeat: 4
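Expanding a section like the one above into a flat event stream is a one-liner once the structure is parsed. A minimal sketch, assuming a hypothetical expand_section helper over the parsed YAML:

```python
# Sketch: flatten a {phrase: [...], repeat: n} section into a flat
# list of chord-spec events, ready for the lower layers to consume.

def expand_section(section: dict) -> list[str]:
    return section["phrase"] * section.get("repeat", 1)

verse = {"phrase": ["Cmin11.lush", "+5.bright", "-2.dark"], "repeat": 2}
events = expand_section(verse)   # the 3-chord phrase, played twice
```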

II. Key Features Specification

Essential Operators

| Category | Examples | Purpose |
| --- | --- | --- |
| Root Motion | +1, -3, T, 5, = | Harmonic movement |
| Chord Quality | maj7, min11, 7#11, sus4, ø | Chord flavor |
| Voice-Leading | smooth, common, contrary, spread, tight | Voice behavior |
| Color/Mood | lush, bright, dark, empty, rich | Emotional quality |
| Rhythm | sync(1/8), arp(up), stab, roll, hold | Temporal feel |
| Structure | loop 4, repeat 2, invert, reharmonize | Compositional control |
| Generative | choose([+5, -2], weight=[0.7, 0.3]) | Controlled randomness |
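The generative row is just weighted random selection over motion options. A sketch using the standard library, with the same illustrative 70/30 split as the table:

```python
# Sketch of the generative "choose" operator: weighted random selection
# among motion primitives. random.choices is Python stdlib.
import random

def choose(options: list[str], weight: list[float]) -> str:
    return random.choices(options, weights=weight, k=1)[0]

next_motion = choose(["+5", "-2"], weight=[0.7, 0.3])
```

Seeding the generator makes a "random" progression reproducible, which matters for sharing and live performance.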

III. Integration with RiffStack

RiffStack handles audio structure. Harmony DSL handles musical structure.

They slot together like this:

Harmony DSL → note streams → RiffStack synths/patches → audio

The Connection

| Harmony DSL | RiffStack |
| --- | --- |
| "What notes and chords?" | "What timbres and effects?" |
| Movement, tension, voicing | Oscillators, filters, loops |
| Rhythm modifiers | Envelopes, loops, patterns |
| High-level musical semantics | Sample/block-level DSP |

They're two sides of the same creative language stack.

  • Harmony DSL says what to play
  • RiffStack says how it sounds

Both are:

  • ✅ Declarative
  • ✅ Composable
  • ✅ Playful
  • ✅ LEGO-like

IV. MLIR Architecture: The Unified Creative Compiler

MLIR makes this dual-language system feasible and high-performance.

1. Harmony DSL as MLIR Dialect

Define ops like:

%root = theory.root "C4"
%next = theory.move %root, motion = "+1"
%chord = theory.color %next, quality = "min11", mood = "lush"
%voiced = theory.voicelead %chord, policy = "smooth"
%pattern = theory.rhythm %voiced, mode = "sync(1/8)"

This is a semantic graph of musical intent.


2. RiffStack as Audio Dialect

%osc = audio.sine %pitch
%env = audio.adsr %osc
%fx = audio.reverb %env, time = 0.3
%out = audio.mix %fx

3. Lowering Pipeline: Multi-Level Compilation

┌─────────────────────────────────────┐
│ Harmony IR (musical intent)         │
│   theory.move, theory.voicelead     │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ Pitch/Note IR (generated notes)     │
│   note.pitch, note.duration         │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ Audio IR (synthesis ops)            │
│   audio.sine, audio.reverb          │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ DSP IR (vector ops)                 │
│   vector.fma, vector.shuffle        │
└─────────────────┬───────────────────┘
                  │ lowering pass
                  ↓
┌─────────────────────────────────────┐
│ LLVM / SPIR-V (machine code)        │
│   CPU/GPU target code               │
└─────────────────┬───────────────────┘
                  │
                  ↓
┌─────────────────────────────────────┐
│ Real-time audio runtime             │
└─────────────────────────────────────┘

This is EXACTLY what MLIR is built for:

  • ✅ Multi-level
  • ✅ Cross-device
  • ✅ Declarative
  • ✅ Multi-dialect
  • ✅ Pass-driven lowering

You get a full creative compiler.
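Conceptually, each lowering pass is a small rewrite from one event vocabulary to the next. A hedged Python sketch of the first hop (Harmony IR → Pitch/Note IR), with illustrative interval tables standing in for the real theory engine:

```python
# Illustrative quality tables (semitone offsets above the root); a real
# lowering pass would also consult mood and voicing constraints.

QUALITY = {
    "min11": [0, 3, 7, 10, 14, 17],   # 1 b3 5 b7 9 11
    "maj9":  [0, 4, 7, 11, 14],       # 1 3 5 7 9
}

def lower_to_notes(root: int, quality: str, duration: float) -> list[dict]:
    """Lower one harmony-level event to note-level events."""
    return [{"pitch": root + i, "duration": duration} for i in QUALITY[quality]]

notes = lower_to_notes(60, "maj9", 2.0)   # Cmaj9, two beats
```

The next passes would map these note events to audio ops (oscillator per voice), then to vectorized DSP, exactly as the diagram shows.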


V. Why This Is Unique and Worth Building

🎶 Gives Musicians a Playground for Harmony

No other tool allows deep harmony exploration with composable primitives.

Existing tools:

  • Ableton - great for loops, weak for harmony
  • Max/MSP - visual patching, not harmonic language
  • SuperCollider - powerful synthesis, clunky harmony
  • Hooktheory - educational, not performative

This DSL - playful, composable, performative harmony language.


🎛️ Gives Sound Designers a Patch Language That Slots In

RiffStack handles the sound while Harmony DSL handles the music.

Together they form a complete creative stack.


🧮 Gives Developers a Portable, Multi-Device IR

MLIR makes it run everywhere:

  • CPU (native, fast)
  • GPU (parallel, massive)
  • WebGPU (browser, accessible)
  • Mobile (iOS/Android)

One language, multiple backends.


🔥 Creates a Musical System Like Nothing Else Today

This combination gives you:

| Feature | Source |
| --- | --- |
| Songwriting power | Ableton |
| Playfulness | Hydra |
| Synthesis depth | SuperCollider |
| Performance | LLVM-level compilation |
| Composability | Max/MSP |
| Plus: readable, shareable, fun musical language | Unique |

VI. Example: Full Stack in Action

Harmony DSL (Musical Intent)

verse:
  start: Cmin11.lush(smooth)
  progression:
    - +5.bright(anticipate)
    - -2.dark(spread)
    - T.lush(arp(up))
  repeat: 2

chorus:
  start: Fmaj9.spread(common)
  progression:
    - +1.bright(stab)
    - =.rich(hold)
    - -3.lush(sync(1/8))
  repeat: 4

RiffStack Patch (Audio Structure)

instruments:
  - id: pad
    type: synth
    expr: "sine $pitch 0.6 chorus 2.0 0.5 reverb 0.8 play"

  - id: bass
    type: synth
    expr: "sine $pitch 0.8 lowpass 400 0.7 play"

tracks:
  - instrument: pad
    source: harmony.verse

  - instrument: bass
    source: harmony.verse.bass_notes

Generated MLIR (Compiler IR)

// Harmony layer
%root = theory.root "C4"
%chord1 = theory.color %root, quality = "min11", mood = "lush"
%voiced1 = theory.voicelead %chord1, policy = "smooth"

// Pitch layer
%pitches = note.extract %voiced1
%bass = note.bass %pitches
%pad = note.all %pitches

// Audio layer
%osc_bass = audio.sine %bass, amp = 0.8
%filt_bass = audio.lowpass %osc_bass, cutoff = 400

%osc_pad = audio.sine %pad, amp = 0.6
%fx_pad = audio.reverb %osc_pad, time = 0.8

%mix = audio.mix [%filt_bass, %fx_pad]
audio.out %mix

Runtime (Real Audio)

Press play → hear actual music with:

  • Smooth voice-leading
  • Rich pad textures
  • Tight bass lines
  • CPU/GPU optimized DSP

VII. Roadmap

Phase 1: Harmony DSL Prototype (Months 1-3)

Deliverables:

  • Motion operator syntax (+1, -3, T)
  • Chord quality macros (min11, maj9)
  • Basic voice-leading solver (smooth, common)
  • YAML parser for progressions
  • MIDI output (proof-of-concept)

Outcome: Can write progressions, hear MIDI playback


Phase 2: RiffStack Integration (Months 3-6)

Deliverables:

  • Harmony → RiffStack bridge
  • Parameter binding ($pitch from harmony)
  • Multi-track rendering
  • Example compositions (verse/chorus/bridge)

Outcome: Full audio synthesis from harmony DSL
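At its simplest, the parameter binding deliverable is textual substitution of $name placeholders into the patch expression. A minimal sketch with a hypothetical bind_params helper (a real bridge would bind per-note and per-voice, not once per patch):

```python
# Hypothetical helper: substitute $name placeholders in a RiffStack-style
# patch expression with values coming from the harmony layer.

def bind_params(expr: str, env: dict) -> str:
    for name, value in env.items():
        expr = expr.replace(f"${name}", str(value))
    return expr

patch = bind_params("sine $pitch 0.8 lowpass 400 0.7 play", {"pitch": 48})
# "sine 48 0.8 lowpass 400 0.7 play"
```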


Phase 3: MLIR Compilation (Months 6-12)

Deliverables:

  • Define theory MLIR dialect
  • Lowering passes (harmony → pitch → audio)
  • LLVM backend targeting
  • Performance optimization

Outcome: Compiled, optimized audio engine


Phase 4: Advanced Features (Months 12-18)

Deliverables:

  • Rhythm modifiers (arp, stab, roll)
  • Generative rules (choose, randomize)
  • Alternative tunings (microtonal, just intonation)
  • Live performance mode (MIDI control, looping)

Outcome: Production-ready creative tool


VIII. Related Work & Inspiration

Academic

  • David Cope - EMI algorithmic composition
  • William Schottstaedt - Common Music
  • Miller Puckette - Max/MSP, Pure Data

Commercial

  • Ableton - Live looping, clip launching
  • Hooktheory - Theory-aware composition
  • Captain Plugins - Chord progression tools

Open Source

  • TidalCycles - Pattern-based livecoding
  • Sonic Pi - Educational music language
  • SuperCollider - Synthesis language

What's Missing (Our Opportunity)

  • No tool combines: harmony theory + audio synthesis + compiler optimization
  • No language is: composable, playful, AND harmonically intelligent
  • No system targets: multiple backends (CPU/GPU/WebGPU) with one DSL

IX. Next Steps

Immediate Actions

  1. Syntax design - Finalize harmony DSL grammar
  2. Voice-leading solver - Implement constraint satisfaction
  3. MLIR dialect - Define theory ops and types
  4. Example compositions - Demonstrate expressiveness

Questions to Answer

  • Should root motion be relative or absolute?
  • How to handle alternative tunings (12-TET vs microtonal)?
  • YAML vs custom syntax vs Python DSL?
  • Constraint solver: SAT/SMT vs heuristic?

Validation Experiments

  • Can we voice-lead ii-V-I smoothly?
  • Can we generate jazz reharmonization?
  • Can we compile to real-time audio at <10ms latency?
  • Can we run on WebGPU in browser?

X. Success Metrics

Technical:

  • Harmony DSL → MIDI in <100ms
  • Voice-leading quality matches human arrangers
  • Compiled audio latency <10ms
  • Runs on CPU/GPU/WebGPU

Creative:

  • Musicians create full compositions in DSL
  • Livecoding performances using harmony language
  • Educational adoption (music theory + programming)

Business:

  • 1000+ users experiment with DSL
  • 100+ paying users (SaaS/education tier)
  • Integration requests from DAW vendors


Last Updated: 2025-11-23 · Status: Vision document · Next: Syntax specification + voice-leading solver prototype