Bug: Dubbed video has no audio when using Coqui-XTTS (ffmpeg MP3→AAC mux failure)

## Bug description

When using Coqui-XTTS for dubbing, the final video (`video_dub.mp4`) is generated
with only ~0.02s of audio, making it effectively silent. The video track is correct.

## Root cause

The final ffmpeg mux command re-encodes `audio_mix.mp3` to AAC:

```sh
ffmpeg -i Video.mp4 -i audio_mix.mp3 -c:v copy -c:a aac -map 0:v -map 1:a -shortest video_dub.mp4
```

The `audio_mix.mp3` file itself is valid (ffprobe reports the expected duration, ~198s),
but after the MP3→AAC transcode the audio stream in the output container is only 0.02s
long. This appears to be an ffmpeg compatibility issue with the MP3 header written during
the audio mixing step, although I have not confirmed that.

## Diagnostic evidence

Using ffprobe on the intermediate files:

| File | Codec | Duration | Status |
|------|-------|----------|--------|
| `audio_dub_solo.ogg` | pcm_s32le, 41000 Hz | 198.55s | ✅ OK |
| `audio_mix.mp3` | mp3, 44100 Hz | 198.58s | ✅ OK |
| `audio_Voiceless.wav` | pcm_s16le, 44100 Hz | 190.12s | ✅ OK |
| `video_dub.mp4` (audio stream) | aac, 44100 Hz | **0.02s** | ❌ Bug |

The audio source files are all correct. The problem occurs specifically during the
MP3→AAC transcode in the mux step.
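
For anyone who wants to repeat the measurements in the table, here is a minimal sketch of how the codec and duration can be read with ffprobe from Python. The helper names (`build_ffprobe_cmd`, `parse_audio_info`, `probe_audio`) are hypothetical and not part of SoniTranslate; the sketch only assumes that `ffprobe` is on the `PATH`.

```python
import json
import subprocess
from typing import List, Optional


def build_ffprobe_cmd(path: str) -> List[str]:
    """Build an ffprobe call that reports the first audio stream as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries",
        "stream=codec_name,sample_rate,duration:format=duration",
        "-of", "json",
        path,
    ]


def parse_audio_info(ffprobe_json: str) -> dict:
    """Extract codec, sample rate and duration (seconds) from ffprobe JSON."""
    data = json.loads(ffprobe_json)
    streams = data.get("streams") or [{}]
    stream = streams[0]
    # Prefer the stream duration; fall back to the container duration
    # when the stream does not report one.
    raw = stream.get("duration") or data.get("format", {}).get("duration")
    duration: Optional[float]
    try:
        duration = float(raw)
    except (TypeError, ValueError):
        duration = None
    return {
        "codec": stream.get("codec_name"),
        "sample_rate": stream.get("sample_rate"),
        "duration": duration,
    }


def probe_audio(path: str) -> dict:
    """Run ffprobe on `path` and return the parsed audio info."""
    result = subprocess.run(
        build_ffprobe_cmd(path), capture_output=True, text=True, check=True
    )
    return parse_audio_info(result.stdout)
```

Calling `probe_audio("audio_mix.mp3")` and `probe_audio("video_dub.mp4")` and comparing the two `duration` values should make the truncation visible without opening the files in a player.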

Note: ffprobe reports `audio_dub_solo.ogg` as `pcm_s32le` at 41000 Hz, which is unusual
for an `.ogg` file (these normally carry Vorbis or Opus at 44100 or 48000 Hz). This
format may contribute to the MP3 header issue downstream, but that is unverified.

## Suggested fix

Convert `audio_mix.mp3` to WAV before the final mux. WAV→AAC transcoding works
correctly. The fix is a ~4 line change before the final ffmpeg call:

```python
# Convert MP3 to WAV before muxing (works around the MP3→AAC transcode failure)
import subprocess

subprocess.run(
    [
        "ffmpeg", "-y", "-i", "audio_mix.mp3",
        "-acodec", "pcm_s16le", "-ar", "44100", "-ac", "2",
        "audio_mix.wav",
    ],
    capture_output=True,
    check=True,  # raise CalledProcessError if the conversion fails
)

# Then use audio_mix.wav instead of audio_mix.mp3 in the final mux command
```
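
The final mux could then be pointed at the converted WAV. The sketch below reuses the flags from the failing command quoted under "Root cause" and only swaps the audio input; `build_mux_cmd` and `mux` are hypothetical helper names, the `-y` overwrite flag is my addition, and I have not checked where the corresponding call lives in the SoniTranslate code.

```python
import subprocess
from typing import List


def build_mux_cmd(video: str, audio: str, output: str) -> List[str]:
    """Build the mux command: copy the video stream, encode the audio to AAC."""
    return [
        "ffmpeg", "-y",          # -y (overwrite output) is an addition
        "-i", video,
        "-i", audio,
        "-c:v", "copy",
        "-c:a", "aac",
        "-map", "0:v", "-map", "1:a",
        "-shortest",
        output,
    ]


def mux(video: str = "Video.mp4",
        audio: str = "audio_mix.wav",
        output: str = "video_dub.mp4") -> None:
    """Run the mux and raise CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(
        build_mux_cmd(video, audio, output),
        capture_output=True,
        check=True,
    )
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to log or inspect the exact arguments when the output is still silent.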

## Environment

- Platform: Google Colab (free tier)
- TTS engine: Coqui-XTTS
- SoniTranslate version: latest from Colab notebook
- ffmpeg version: (Colab default)

## Steps to reproduce

1. Open SoniTranslate Colab notebook
2. Select Coqui-XTTS as TTS engine
3. Process any video with dubbing
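4. Play the resulting `video_dub.mp4`: the video track is correct, but the audio stream is only ~0.02s long, so the video is effectively silent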
