Question about rhythm transfer #5

Open
mvoodarla opened this issue Aug 4, 2023 · 0 comments
mvoodarla commented Aug 4, 2023

I have the following use case: I would like to train Urhythmic on a target speaker's voice and do any-to-one voice conversion that preserves the target timbre, but I would like to variably change the rhythm, pitch, speed, intonation, etc. between predictions based on the intonation of a separate source audio clip. Is this possible?

I have successfully trained the vocoder, which sounds really good, and I have tried running inference with varying rhythm-fine models (see the sketch below), but I can't really hear the choice of model affecting the output as much as the rhythm of the source clip does. In your samples, the outputs feel like they have a rhythm truer to the target voices (though it's unclear whether you somehow trained on just that sample, or on the speaker in general).
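For context, my inference loop looks roughly like this. It's paraphrased from how I understand the README, so the `urhythmic_fine` hub entry point and the `encode` helper signature are assumptions on my part, not confirmed API:

```python
import torch
import torchaudio

# Load the pretrained components via torch.hub. The entry-point names and
# return values here are my reading of the docs and may not match exactly.
hubert = torch.hub.load("bshall/hubert:main", "hubert_soft", trust_repo=True)
urhythmic, encode = torch.hub.load(
    "bshall/urhythmic:main", "urhythmic_fine", trust_repo=True
)

# Any source utterance -> the single target speaker the vocoder was trained on.
wav, sr = torchaudio.load("source.wav")
wav = torchaudio.functional.resample(wav, sr, 16000).unsqueeze(0)

with torch.inference_mode():
    units, log_probs = encode(hubert, wav)   # soft speech units + segmentation
    converted = urhythmic(units, log_probs)  # rhythm conversion + vocoding

torchaudio.save("converted.wav", converted.squeeze(0), 16000)
```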

Any tips or insights would be appreciated :)

Currently, one approach I'm taking is to retrain the rhythm model at inference time for both the source and the target speaker, letting it overfit to that single sample.
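Concretely, since (as I understand the paper) the fine-grained rhythm model comes down to per-sound-type duration statistics, overfitting it on one clip amounts to re-estimating those statistics from that clip's segments alone. A toy version of what I mean, where the example durations and the gamma parameterization are my own assumptions about the internals:

```python
import numpy as np
from scipy.stats import gamma

# Hypothetical segment durations (in frames) extracted from ONE utterance,
# grouped by sound type. These numbers are made up purely for illustration.
durations = {
    "sonorant":  np.array([7.0, 9.0, 12.0, 8.0, 10.0]),
    "obstruent": np.array([4.0, 5.0, 3.0, 6.0]),
    "silence":   np.array([15.0, 20.0, 18.0]),
}

# Re-fit a duration distribution per sound type from just this clip, which is
# effectively "overfitting" the rhythm model to the single sample.
rhythm_model = {}
for sound_type, d in durations.items():
    shape, loc, scale = gamma.fit(d, floc=0)  # pin location at 0 for durations
    rhythm_model[sound_type] = (shape, scale)
    print(f"{sound_type}: shape={shape:.2f}, scale={scale:.2f}")
```

Doing this for both the source and the target speaker gives me a per-sample pair of rhythm models to convert between.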

I also want to commend how clean this codebase is. This is how all OSS ML repos should be.
