fix: add audio padding for Moonshine models to prevent empty transcri…#1292

Open
sudhanshu112233shukla wants to merge 1 commit into EpicenterHQ:main from sudhanshu112233shukla:fix/moonshine-tiny-empty-transcription

@sudhanshu112233shukla

Problem

The Moonshine Tiny model returns empty transcriptions for short audio clips (#1282).

Solution

Added automatic audio padding for clips < 1.5 seconds to meet Moonshine's minimum context requirement.

Changes

  • Modified apps/whispering/src-tauri/src/transcription/mod.rs
  • Added MIN_MOONSHINE_SAMPLES constant (24000 = 1.5s at 16kHz)
  • Pads short audio with trailing silence while preserving original quality
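
The padding step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `pad_for_moonshine` is hypothetical, and it assumes the audio is already mono `f32` samples at 16 kHz. Only the `MIN_MOONSHINE_SAMPLES` constant and its value come from the description.

```rust
/// Minimum sample count handed to Moonshine: 24000 samples = 1.5 s at 16 kHz
/// (name and value taken from the PR description).
const MIN_MOONSHINE_SAMPLES: usize = 24_000;

/// Hypothetical helper (not the PR's actual function): appends trailing
/// silence to clips shorter than the minimum. Assumes mono f32 samples at
/// 16 kHz. Existing samples are never modified, and clips that already meet
/// the minimum are returned unchanged.
fn pad_for_moonshine(mut samples: Vec<f32>) -> Vec<f32> {
    if samples.len() < MIN_MOONSHINE_SAMPLES {
        // `Vec::resize` grows the vector and fills the new tail with 0.0,
        // which is digital silence for float PCM.
        samples.resize(MIN_MOONSHINE_SAMPLES, 0.0);
    }
    samples
}
```

Under these assumptions, a 0.5 s clip (8000 samples) comes back as 24000 samples with the first 8000 untouched, while a 2 s clip (32000 samples) is returned at its original length.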

Testing Needed

macOS/Linux testing required (Moonshine not available on Windows)

  • Short audio (< 1.5s) should now transcribe successfully
  • Long audio (> 1.5s) should remain unchanged

Fixes #1282

@sudhanshu112233shukla
Author

@braden-w please check this solution and tell me if it needs any updates; I will fix it.
Thank you!!

…ptions

Moonshine models (especially tiny) require a minimum audio length (~1.5s) to
generate transcriptions. Short audio clips result in empty output due to
insufficient context for the model.

This fix pads short audio with silence to meet the 1.5s threshold, ensuring
reliable transcription while preserving original audio quality.

- Added MIN_MOONSHINE_SAMPLES constant (24000 samples = 1.5s at 16kHz)
- Implemented padding logic with debug logging
- Preserves original audio, adds only trailing silence
- Applies to both tiny and base model variants

Fixes EpicenterHQ#1282

Development

Successfully merging this pull request may close these issues.

Moonshine Tiny model consistently results in empty transcription.
