skipped phonemes in generated audio #13

Open
thivux opened this issue Feb 26, 2024 · 0 comments


thivux commented Feb 26, 2024

hi, thank you for sharing your code.

i am trying to do voice conversion from English speech to a Vietnamese speaker. to do that, i followed these steps:

  • extract units for both the English and Vietnamese datasets
  • train kmeans on both types of units & extract discrete labels
  • train soft encoder
  • extract soft units
  • train acoustic model
  • train hifigan on Vietnamese dataset
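For reference, the "extract discrete labels" part of the second step can be sketched as a nearest-centroid lookup. This is only an illustration under assumptions: the centroids are taken to come from a k-means model already fit on the pooled English and Vietnamese units, the toy 2-D vectors stand in for real unit features, and `nearest_centroid_labels` is a hypothetical helper name, not a function from the repository.

```python
import math


def nearest_centroid_labels(frames, centroids):
    """Return, for each frame vector, the index of its closest centroid.

    `frames` and `centroids` are sequences of equal-length float vectors.
    Distance is Euclidean, which is what standard k-means uses.
    """
    if not centroids:
        raise ValueError("at least one centroid is required")
    labels = []
    for frame in frames:
        distances = [math.dist(frame, centroid) for centroid in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Toy data (assumption): two centroids and three 2-D "unit" frames.
toy_centroids = [[0.0, 0.0], [1.0, 1.0]]
toy_frames = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]]
toy_labels = nearest_centroid_labels(toy_frames, toy_centroids)
```

In a real run the frames would be the per-frame unit features extracted in the first step, one label per frame.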

the output for Vietnamese speech (where the input audio is Vietnamese, from a different speaker) is okay, but the output for English is not that good: phonemes are often skipped or mispronounced. do you have any suggestions on how i can improve the results?
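One check that might help narrow this down, offered as a guess rather than a confirmed cause: compare how often each discrete label occurs in the English data versus the Vietnamese data. If some clusters are common in English but almost absent in Vietnamese, the later stages may have seen few examples of them (this assumes they were trained mostly on Vietnamese audio, which the post does not state). The sketch below uses hypothetical helper names and thresholds and toy label lists.

```python
from collections import Counter


def cluster_usage(labels, n_clusters):
    """Fraction of frames assigned to each cluster index 0..n_clusters-1."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


def underrepresented_clusters(source_labels, target_labels, n_clusters,
                              min_source=0.01, max_ratio=0.1):
    """Clusters used in the source language but rare in the target language.

    A cluster is reported when its share of source frames is at least
    `min_source` and its share of target frames is at most `max_ratio`
    times the source share. Both thresholds are arbitrary assumptions.
    """
    src = cluster_usage(source_labels, n_clusters)
    tgt = cluster_usage(target_labels, n_clusters)
    return [k for k in range(n_clusters)
            if src[k] >= min_source and tgt[k] <= max_ratio * src[k]]


# Toy labels (assumption): cluster 2 appears only in the English data.
english_labels = [0, 0, 1, 1, 2, 2, 2, 2]
vietnamese_labels = [0, 0, 0, 0, 1, 1, 1, 1]
rare_in_target = underrepresented_clusters(english_labels, vietnamese_labels, 3)
```

Listening to the English frames that fall into any reported clusters would show whether they line up with the skipped or mispronounced phonemes.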
