
[Feature request]: Parallelize Audio Chunk Transcription for Improved Performance #1842

@nahu02

Current Behavior

The audio transcription feature in Fabric currently processes audio chunks sequentially, which creates a significant performance bottleneck when transcribing large audio files.

When an audio file exceeds the 25MB limit, it is split into multiple chunks using ffmpeg, but these chunks are then processed one at a time in a for loop. This means:

  1. Each chunk must wait for the previous chunk to complete transcription before starting
  2. Network latency and API processing time compound linearly with the number of chunks
  3. For a file split into 30 chunks (roughly 40 minutes of WAV audio), the total time is roughly 30x that of a single chunk

Proposal

Implement parallel processing of audio chunks to dramatically reduce transcription time for large files. Multiple chunks should be transcribed concurrently, with results assembled in their original order.

Implementation Considerations

Concurrency Control

  • Configurable parameter for max concurrent transcriptions (e.g., a --max-concurrent flag; see the sketch below)
  • Default to a reasonable limit (e.g., 3-5 concurrent requests)
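A minimal sketch of how such a flag might be wired up, using the standard library's flag package purely for illustration; Fabric's actual CLI layer may differ, and the flag name and default of 4 are this proposal's suggestions, not existing options:

```go
package main

import "flag"

// The --max-concurrent name and the default of 4 are assumptions
// from this proposal, not existing Fabric flags.
var maxConcurrent = flag.Int("max-concurrent", 4,
	"maximum number of audio chunks transcribed in parallel")

func main() {
	flag.Parse()
	_ = *maxConcurrent // would be threaded into the transcription call
}
```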

API Rate Limits

  • Respect OpenAI API rate limits to avoid 429 errors
  • Consider implementing exponential backoff for retries (see the sketch below)
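One way to handle 429s would be a retry wrapper with exponential backoff and jitter. This is only a sketch under assumed names: errRateLimited stands in for however the OpenAI client actually surfaces a 429, and withBackoff is a hypothetical helper, not existing Fabric code:

```go
package transcribe

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// errRateLimited marks a 429 response; the real code would detect this
// from the OpenAI client's error type rather than a sentinel.
var errRateLimited = errors.New("rate limited (429)")

// withBackoff retries fn on rate-limit errors, doubling the delay on
// each attempt and adding jitter so parallel workers don't retry in
// lockstep.
func withBackoff(ctx context.Context, maxRetries int, fn func() error) error {
	delay := time.Second
	for attempt := 0; ; attempt++ {
		err := fn()
		if err == nil || !errors.Is(err, errRateLimited) {
			return err
		}
		if attempt >= maxRetries {
			return fmt.Errorf("giving up after %d retries: %w", attempt, err)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay + time.Duration(rand.Int63n(int64(delay/2)))):
		}
		delay *= 2
	}
}
```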

Technical Details

I assume the main change would have to be in the TranscribeFile function in internal/plugins/ai/openai/openai_audio.go, where chunks are currently processed sequentially. The loop structure would need to be refactored to run chunks in parallel (using goroutines?); a rough sketch follows.
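A rough sketch of what the refactor could look like, using golang.org/x/sync/errgroup to bound concurrency and a pre-sized slice to keep chunk order. The transcribeChunk and transcribeChunks names and signatures are placeholders for whatever TranscribeFile actually uses, not Fabric's real API:

```go
package openai

import (
	"context"
	"strings"

	"golang.org/x/sync/errgroup"
)

// transcribeChunk stands in for the existing per-chunk API call in
// openai_audio.go; its real signature may differ.
func transcribeChunk(ctx context.Context, chunkPath string) (string, error) {
	// ... call the OpenAI transcription endpoint for one chunk ...
	return "", nil
}

// transcribeChunks transcribes all chunks concurrently, at most
// maxConcurrent at a time, and reassembles the results in order.
func transcribeChunks(ctx context.Context, chunkPaths []string, maxConcurrent int) (string, error) {
	results := make([]string, len(chunkPaths)) // slot i holds chunk i's transcript

	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(maxConcurrent) // bound the number of in-flight requests

	for i, path := range chunkPaths {
		i, path := i, path // capture loop variables (needed before Go 1.22)
		g.Go(func() error {
			text, err := transcribeChunk(ctx, path)
			if err != nil {
				return err // first error cancels the group's context
			}
			results[i] = text // each goroutine writes only its own slot
			return nil
		})
	}

	if err := g.Wait(); err != nil {
		return "", err
	}
	return strings.Join(results, " "), nil
}
```

Writing into a pre-sized slice by index avoids both a mutex and any post-hoc sorting, and g.Wait surfaces the first error while cancelling the remaining requests.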
