Pre-wire instruct support in generate_voice_clone for upcoming 25Hz model #281

Open
ezzaldeeen wants to merge 1 commit into QwenLM:main from ezzaldeeen:instruct-guided-voice-cloning

@ezzaldeeen ezzaldeeen commented Mar 30, 2026

Summary

  • Added optional instruct parameter to generate_voice_clone, allowing users to pass natural-language style instructions alongside voice cloning.
  • Follows the same instruct_ids pattern already used in generate_voice_design and generate_custom_voice.
  • Fully backward compatible: existing callers need no changes.
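
To illustrate the optional-parameter pattern described above, here is a minimal, self-contained sketch. It is not the code from this PR: `build_clone_inputs` and `toy_tokenize` are hypothetical names, the tokenizer is a toy stand-in, and the only identifiers taken from the summary are `instruct` and `instruct_ids`. The idea is that instruction tokens are attached only when `instruct` is provided, so callers that omit it get the same inputs as before.

```python
from typing import Callable, Dict, List, Optional


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in for a real tokenizer: one code point per character."""
    return [ord(ch) for ch in text]


def build_clone_inputs(
    text: str,
    ref_audio_path: str,
    instruct: Optional[str] = None,
    tokenize: Callable[[str], List[int]] = toy_tokenize,
) -> Dict[str, object]:
    """Assemble inputs for a voice-clone call (illustrative only).

    `instruct` defaults to None, so existing callers are unaffected.
    """
    inputs: Dict[str, object] = {
        "text_ids": tokenize(text),
        "ref_audio": ref_audio_path,
    }
    # Only add instruct_ids when a non-empty instruction is given.
    if instruct:
        inputs["instruct_ids"] = tokenize(instruct)
    return inputs


# Existing call style: no instruction, no instruct_ids key.
plain = build_clone_inputs("Hello there.", "ref.wav")

# New call style: a natural-language style instruction is passed along.
styled = build_clone_inputs("Hello there.", "ref.wav", instruct="Speak softly.")
```

Because the new argument is keyword-optional and the extra key is only present when it is supplied, the default code path is unchanged, which is what makes this kind of change backward compatible.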

Note

⚠️ This feature is experimental and may produce unstable results.
The 12Hz base model was not trained for instruct-guided voice cloning. As noted in the README, instruction control in the base model is currently unstable. Full support for combined cloning + instruction is planned for the upcoming 25Hz voice editing model.
Use at your own discretion and treat outputs as experimental.

It's a very important feature for us. Thanks @wangxiongts and your team for your great work. Feel free to drop any feedback/comments if needed.
