Replies: 1 comment 1 reply
Thanks for reaching out!
-
I want to start by saying thank you for this incredible project. I have never had such a smooth installation process, especially when it comes to hardware support. The GPU passthrough worked flawlessly with my new NVIDIA RTX 5060 and 5070 cards, and the API is unbelievably fast and stable.
My primary use case is integrating this as the main STT/TTS pipeline for my real-time voice assistant in Home Assistant. Because of the project's high performance, I can now handle the entire audio stream in a single container. To get it fully working with the Home Assistant server, I built a custom integration, which has been a great success.
As I continue to build on this powerful foundation, I have two suggestions that I believe would be fantastic additions:
Enhanced STT with Speaker Diarization: It would be a game-changer to have speaker diarization capabilities. For multi-user environments like a smart home, distinguishing between speakers is essential. An integration of a model like whisperX would be a phenomenal feature.
A Pre-configured "Development Environment Image": This project's ability to correctly configure and leverage new hardware is one of its best features. It would be a massive help to have a preconfigured Docker container, essentially a "Development Environment Image." This would provide a complete, out-of-the-box toolkit for users, especially those with new hardware that lacks established tooling.
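To make the diarization suggestion a bit more concrete, here is a rough sketch of the kind of post-processing I have in mind: each transcribed segment gets the speaker whose diarization turn overlaps it the most. The dictionary shapes, the `SPEAKER_00`-style labels, and the `assign_speakers` name are my own assumptions for illustration, not this project's API or whisperX's actual interface.

```python
from typing import Dict, List, Optional


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: List[Dict], turns: List[Dict]) -> List[Dict]:
    """Label each transcript segment with the speaker of the best-overlapping turn.

    segments: hypothetical STT output, e.g. {"start": 0.0, "end": 2.0, "text": "..."}
    turns:    hypothetical diarization output, e.g. {"start": 0.0, "end": 2.2, "speaker": "SPEAKER_00"}
    """
    labeled = []
    for seg in segments:
        best_speaker: Optional[str] = None
        best_overlap = 0.0
        for turn in turns:
            ov = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if ov > best_overlap:
                best_overlap = ov
                best_speaker = turn["speaker"]
        # The speaker stays None when no diarization turn overlaps the segment.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


if __name__ == "__main__":
    segments = [
        {"start": 0.0, "end": 2.0, "text": "turn on the lights"},
        {"start": 2.5, "end": 4.0, "text": "and play some music"},
    ]
    turns = [
        {"start": 0.0, "end": 2.2, "speaker": "SPEAKER_00"},
        {"start": 2.2, "end": 5.0, "speaker": "SPEAKER_01"},
    ]
    for seg in assign_speakers(segments, turns):
        print(seg["speaker"], seg["text"])
```

With something like this in the STT response, a Home Assistant integration could route a command differently depending on who spoke it.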
Thank you again for all your hard work. This has already become an essential part of my smart home ecosystem.