Running an audio chatbot on a GPU with 2 GB of VRAM
The chatbot is built on a three-step pipeline:
- STT (speech-to-text) using the Whisper-small model
- LLM interaction through the HuggingFace Inference API
- TTS (text-to-speech) using Kokoro
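
The three steps above can be sketched as a chain of plain callables. This is a minimal illustration, not the repository's actual code: `run_turn` and the `Transcribe`, `Generate`, and `Synthesize` type aliases are hypothetical names, and the stub stages stand in for Whisper-small, the HuggingFace Inference API, and Kokoro so that the sketch runs without any model or network access.

```python
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
# In the real project these would wrap Whisper-small, the HuggingFace
# Inference API, and Kokoro respectively.
Transcribe = Callable[[bytes], str]   # audio in   -> user text
Generate = Callable[[str], str]       # user text  -> reply text
Synthesize = Callable[[str], bytes]   # reply text -> audio out


def run_turn(audio_in: bytes, stt: Transcribe, llm: Generate, tts: Synthesize) -> bytes:
    """Run one conversational turn through the STT -> LLM -> TTS chain."""
    user_text = stt(audio_in).strip()
    if not user_text:
        # Nothing was recognised, so skip the LLM and TTS calls entirely.
        return b""
    reply_text = llm(user_text)
    return tts(reply_text)


if __name__ == "__main__":
    # Stub stages so the sketch can be run on its own.
    fake_stt = lambda audio: audio.decode("utf-8")
    fake_llm = lambda text: f"You said: {text}"
    fake_tts = lambda text: text.encode("utf-8")
    print(run_turn(b"hello", fake_stt, fake_llm, fake_tts))
```

Passing the stages in as arguments keeps only one heavy component (the local STT or TTS model) tied to the GPU at a time and makes each step easy to swap or test in isolation.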
The UI is built with Gradio. Voice activity detection (VAD) runs automatically in the frontend using vad-web.
Create a self-signed certificate and private key (cert.pem and key.pem, valid for 365 days) by running the following command from the repository root directory:
$ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 365 -nodes
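
The generated files are presumably used to serve the Gradio app over HTTPS, which browsers require for microphone access on origins other than localhost. The sketch below shows one way the files could be passed to Gradio, assuming they sit in the repository root. `tls_launch_kwargs` is a hypothetical helper, not part of this project; `ssl_certfile`, `ssl_keyfile`, and `ssl_verify` are keyword arguments of Gradio's `launch()`.

```python
from pathlib import Path


def tls_launch_kwargs(repo_root: str = ".") -> dict:
    """Build HTTPS keyword arguments for Gradio's launch() from cert.pem/key.pem."""
    root = Path(repo_root)
    cert, key = root / "cert.pem", root / "key.pem"
    # Fail early with a clear message if the openssl step was skipped.
    missing = [p.name for p in (cert, key) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing TLS file(s): {', '.join(missing)}")
    return {
        "ssl_certfile": str(cert),
        "ssl_keyfile": str(key),
        # The certificate is self-signed, so certificate verification is disabled.
        "ssl_verify": False,
    }


# Example usage (requires Gradio; `demo` is a placeholder for the app's UI object):
#   import gradio as gr
#   demo = gr.Interface(fn=lambda text: text, inputs="text", outputs="text")
#   demo.launch(**tls_launch_kwargs())
```

Because the certificate is self-signed, the browser will show a security warning on the first visit that has to be accepted manually.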