
Chunking

mzuelch edited this page Jan 24, 2026 · 2 revisions

Chunking enables model inference on long audio by splitting it into overlapping windows, running the model on each window, and stitching the per-chunk outputs back into one signal.

Source pointers

  • patchbay_backend/chunking.py – chunk plan + overlap-add
  • patchbay_backend/pipeline.py – uses the plan, runs per-chunk inference

When chunking is enabled

PATCHBAY’s current behavior is explicit:

  • If max_len_s is not set (None): the full file is processed in one pass.
  • If max_len_s is set: chunking is enabled.
    • overlap_s is optional; if omitted, a conservative default is applied.
  • If overlap_s is provided without max_len_s: overlap is ignored and a note is written to the optional log file.

Chunking is never switched on automatically based on file duration, which avoids surprising “auto chunking”.
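
The rules above can be sketched as a small resolver. This is an illustration only, not the code in patchbay_backend: the function name resolve_chunking, the DEFAULT_OVERLAP_S value, and the returned notes list are assumptions made for the example (the page does not state the actual default overlap).

```python
from typing import List, Optional, Tuple

# Hypothetical placeholder: the real default overlap is not stated on this page.
DEFAULT_OVERLAP_S = 1.0


def resolve_chunking(
    max_len_s: Optional[float],
    overlap_s: Optional[float],
) -> Tuple[Optional[float], Optional[float], List[str]]:
    """Return (max_len_s, overlap_s, notes) following the rules listed above.

    A max_len_s of None in the result means "process the full file in one pass".
    """
    notes: List[str] = []

    if max_len_s is None:
        # No chunking. A lone overlap_s has no effect, so record a note for the log.
        if overlap_s is not None:
            notes.append("overlap_s ignored because max_len_s is not set")
        return None, None, notes

    # Chunking is enabled; fall back to a default overlap if none was given.
    if overlap_s is None:
        overlap_s = DEFAULT_OVERLAP_S
    return max_len_s, overlap_s, notes
```

Returning the notes instead of writing them directly keeps the sketch free of any logging setup; the caller can append them to the optional log file.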

Parameters

  • Max chunk length (seconds)
    Window size passed into the model (converted to samples via the model sample rate).

  • Overlap (seconds)
    Shared region between consecutive chunks. Overlap helps reduce boundary artifacts.
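
Both parameters are given in seconds and converted to samples before planning. A minimal sketch of that conversion, assuming round-to-nearest (the rounding rule actually used is not stated here) and a hypothetical helper name:

```python
def seconds_to_samples(seconds: float, sample_rate: int) -> int:
    """Convert a duration in seconds to a whole number of samples.

    Rounding to the nearest sample is an assumption made for this sketch.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return int(round(seconds * sample_rate))


# Example with an assumed 48 kHz model sample rate.
chunk_len = seconds_to_samples(20.0, 48_000)  # 960000 samples
overlap = seconds_to_samples(2.0, 48_000)     # 96000 samples
```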

Planning: samples and windows

Given:

  • chunk_len (samples)
  • overlap (samples)
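  • step = chunk_len - overlap (the hop between consecutive chunk starts; overlap must be smaller than chunk_len so that step is positive)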

Chunk windows are:

  • chunk 1: [0, chunk_len)
  • chunk 2: [step, step + chunk_len)
  • chunk k: [(k - 1) * step, (k - 1) * step + chunk_len)

The last chunk is clipped to the end of the file.
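
The plan described above can be sketched as follows. The function name plan_chunks and the choice to stop as soon as a window reaches the end of the file are assumptions for illustration; the actual planner lives in patchbay_backend/chunking.py and may handle edge cases differently.

```python
from typing import List, Tuple


def plan_chunks(total_len: int, chunk_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open [start, end) sample windows covering total_len samples."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")

    step = chunk_len - overlap
    windows: List[Tuple[int, int]] = []
    start = 0
    while start < total_len:
        # Clip the last window to the end of the file.
        end = min(start + chunk_len, total_len)
        windows.append((start, end))
        if end == total_len:
            break
        start += step
    return windows


# 25 samples, windows of 10 with an overlap of 2 (step = 8).
print(plan_chunks(25, 10, 2))  # [(0, 10), (8, 18), (16, 25)]
```

A file shorter than chunk_len yields a single clipped window, and an empty file yields no windows.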

Reconstruction: overlap-add (linear crossfade)

Outputs are reconstructed with a simple linear crossfade in the overlap region:

  • for the non-overlap part: copy samples directly
  • for overlap: blend previous and current chunk with a linear ramp

Intuition:

  • overlap is where “seams” happen
  • a crossfade reduces clicks/level jumps at boundaries
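
A minimal sketch of this reconstruction in plain Python, for illustration only. The function name overlap_add, the exact ramp shape (weights strictly between 0 and 1), and the use of lists instead of arrays are assumptions; the real implementation is in patchbay_backend/chunking.py. The sketch also assumes the windows are in order and start at sample 0.

```python
from typing import List, Sequence, Tuple


def overlap_add(
    chunks: Sequence[Sequence[float]],
    windows: Sequence[Tuple[int, int]],
    total_len: int,
) -> List[float]:
    """Stitch per-chunk outputs into one signal with a linear crossfade."""
    out = [0.0] * total_len
    prev_end = 0  # end of the region already written
    for chunk, (start, end) in zip(chunks, windows):
        if len(chunk) != end - start:
            raise ValueError("chunk length does not match its window")

        # Samples of this chunk that fall on already-written output.
        n_overlap = max(0, min(prev_end, end) - start)

        # Overlap: fade the previous output out and the current chunk in.
        for i in range(n_overlap):
            w = (i + 1) / (n_overlap + 1)  # rises linearly, stays inside (0, 1)
            out[start + i] = (1.0 - w) * out[start + i] + w * chunk[i]

        # Non-overlap part: copy samples directly.
        for i in range(n_overlap, end - start):
            out[start + i] = chunk[i]

        prev_end = max(prev_end, end)
    return out


# Two chunks with a 2-sample overlap: silence followed by a constant 1.0.
result = overlap_add([[0.0] * 10, [1.0] * 10], [(0, 10), (8, 18)], 18)
```

In this example the two overlapping samples ramp from the first chunk's level to the second's instead of jumping, which is the level step the crossfade is meant to hide.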

Recommendations

Good starting points (typical speech/music):

  • Max length: 20s
  • Overlap: 2s

For difficult material or short, rare events:

  • Reduce max length (e.g. 10–15s)
  • Increase overlap slightly (e.g. 2–4s)

Anchors interaction

Anchor modes can append/prepend example spans to each chunk. If you use these modes, ensure your max length has enough headroom (the prompt is part of the model input).

See Anchor Prompting Internals.

Last updated: 2026-01-24
