feat(train): Mixer-TTS training (Lightning) #146
Add phoonnx_train/mixertts/: Mixer-TTS training ported from nipponjo/mixer-tts-pytorch into phoonnx_train's Lightning framework:

- MixerTTSModel(LightningModule): wraps the model's training forward + _metrics (mel/duration/pitch/energy/CTC/binarization losses + unsupervised aligner) and an optional LSGAN PatchDiscriminator (manual optimization, generator + disc optimizers), matching the upstream GAN loop.
- vendored pure-torch model + losses + aligner + dataset under mixertts/models and mixertts/utils (rewritten to the package namespace).
- train CLI (phoonnx_train.mixertts.train) with quality tiers (1.74M/3.17M/20.6M).
- [train] extra gains einops + numba (aligner).

4 smoke tests (builds, g+d optimizers, no-gan single optimizer, inference path); full suite 215 passed. Convergence needs a preprocessed dataset + GPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
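For readers unfamiliar with the LSGAN objective mentioned above, here is a minimal pure-Python sketch of the two least-squares losses. It is not the code in this PR: the function names are hypothetical, plain floats stand in for patch-discriminator outputs, and the target convention (1 for real, 0 for fake, no extra weighting) is an assumption that may differ from the upstream loop.

```python
# Minimal sketch of least-squares GAN (LSGAN) objectives.
# Assumption: real target = 1.0, fake target = 0.0, unweighted means.
# Function names are illustrative only, not part of phoonnx_train.

def lsgan_d_loss(real_scores, fake_scores):
    """Discriminator loss: pull real scores toward 1 and fake scores toward 0."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def lsgan_g_loss(fake_scores):
    """Generator loss: pull the critic's scores on generated mels toward 1."""
    return sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


# A critic that is fully fooled gives the generator zero loss.
fooled = lsgan_g_loss([1.0, 1.0])
# A critic that separates real from fake perfectly has zero loss itself.
perfect_critic = lsgan_d_loss([1.0, 1.0], [0.0, 0.0])
```

With two optimizers, a manually optimized step would typically alternate between the two losses: one update of the discriminator on real and detached generated mels, then one update of the generator on its reconstruction losses plus the adversarial term.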
Automated check results for this PR:

- Repo Health: Latest Version: ✅
- Coverage: ❌ 39.4% total coverage. Files below 80% coverage: 37 files.
- Lint: ❌ ruff: issues found, see job log.
- Security (pip-audit): ✅ No known vulnerabilities found (61 packages scanned).
- License Check: ❌ License violations detected (43 packages); review required before merging. License distribution: 14× MIT License, 7× Apache Software License, 5× MIT, 3× Apache-2.0, 2× BSD-3-Clause, 2× ISC License (ISCL), 1× 3-Clause BSD License, 1× Apache Software License; BSD License, +8 more. Policy: Apache 2.0 (universal donor). StrongCopyleft / NetworkCopyleft / WeakCopyleft / Other / Error categories fail. MPL allowed.
- Build Tests: ✅ All versions pass.

From the digital workshop of OpenVoiceOS.
Companion tracking issue: #169
Adds Mixer-TTS training to phoonnx_train, ported from nipponjo/mixer-tts-pytorch into phoonnx_train's Lightning framework (complements the inference engine in the merged Mixer-TTS PR).

What's in

- phoonnx_train/mixertts/: vendored pure-torch model + losses + unsupervised aligner + dataset (rewritten to the package namespace).
- MixerTTSModel(LightningModule): wraps the model's training forward + _metrics (mel/duration/pitch/energy/CTC/binarization losses) and an optional LSGAN PatchDiscriminator for mel naturalness. Manual optimization with generator + discriminator optimizers, matching the upstream GAN loop.
- python -m phoonnx_train.mixertts.train --dataset-dir … --quality {x-low,medium,high} (1.74M/3.17M/20.6M params).
- [train] extra gains einops + numba (the aligner).
- docs/mixertts.md.

Verified

4 smoke tests: the Lightning module builds the model + GAN critic, exposes two optimizers (one without GAN), and the inference/export path runs. Full suite 215 passed.
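To make the quality tiers of the train CLI concrete, the sketch below shows one way the two flags named in the description could be parsed with argparse. It is an illustration only, not the parser shipped in this PR: the `build_parser` helper, the `QUALITY_PARAMS` table, making both flags required, and the `data/my_voice` path are all hypothetical, and the tier-to-size mapping assumes the three parameter counts are listed in the same order as the tier names.

```python
import argparse

# Approximate parameter counts per tier, read off the PR description.
# Assumption: counts are listed in the same order as the tier names.
QUALITY_PARAMS = {"x-low": 1.74e6, "medium": 3.17e6, "high": 20.6e6}


def build_parser():
    """Hypothetical parser mirroring the two flags named in the description."""
    parser = argparse.ArgumentParser(prog="phoonnx_train.mixertts.train")
    parser.add_argument("--dataset-dir", required=True,
                        help="directory holding the preprocessed dataset")
    parser.add_argument("--quality", required=True,
                        choices=sorted(QUALITY_PARAMS),
                        help="model size tier")
    return parser


# Example invocation with a made-up dataset path.
args = build_parser().parse_args(
    ["--dataset-dir", "data/my_voice", "--quality", "high"]
)
approx_params = QUALITY_PARAMS[args.quality]
```

Restricting `--quality` with `choices` means an unknown tier is rejected at parse time, before any model is built.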
🤖 Generated with Claude Code