Smart Automated Response Assistant for Windows with voice, GUI, automation, realtime search, and image generation.
SARA is a Windows-first voice assistant that blends real-time speech recognition, text-to-speech, system automation, and AI responses with a polished PyQt5 interface. It supports conversational chat, live web search, content creation, and on-demand image generation.
- Wake word detection with interrupt-aware speech handling
- Live GUI with mic status and assistant feedback
- System automation (apps, volume, screenshots, screen recording)
- Realtime web search and conversational responses
- Content creation and presentation generation
- Image generation via Hugging Face models
- Main.py: main entry point and orchestration
- Backend/: speech, model routing, automation, and search
- Frontend/: PyQt5 GUI and assets
- Data/: runtime data and generated files
- requirements.txt: Python dependencies
- Windows 10/11
- Python 3.10+ (recommended)
- Working microphone and speakers
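The platform and Python version requirements above can be checked before installing anything. The following is a minimal sketch, not part of the project: the function name `check_prerequisites` is hypothetical, and it does not test the microphone or speakers, which need to be verified in Windows sound settings.

```python
import platform
import sys


def check_prerequisites(system: str, version: tuple) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    # The README targets Windows 10/11; only the OS family is checked here.
    if system != "Windows":
        problems.append(f"Expected Windows, found {system or 'unknown'}")
    # Python 3.10+ is the recommended version.
    if tuple(version[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ is recommended, found {version[0]}.{version[1]}"
        )
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(platform.system(), sys.version_info)
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Platform and Python version look fine.")
```

Passing the system name and version as arguments keeps the check easy to try out on any machine.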
- Install VS Code and the Python extension (Microsoft).
- Open this folder in VS Code.
- Create and activate a virtual environment:
    python -m venv .venv
    .\.venv\Scripts\Activate.ps1
- Select the interpreter:
    Ctrl+Shift+P -> Python: Select Interpreter -> .venv
- Install dependencies:
    pip install -r requirements.txt
- Create a .env file in the project root.
Create a .env file next to Main.py.
Example:
CohereAPIKey=Your_Cohere_API_KEY_Here
Username=Your_Name
AssistantName=Sara
GroqAPIKey=Your_Groq_API_KEY_Here
InputLanguage=en
AssistantVoice=en-IN-NeerjaNeural
HUGGINGFACE_API_KEY=Your_Huggingface_API_KEY_Here
Note:
InputLanguage controls the speech recognition language (for example: en-IN, hi-IN).
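A quick way to catch a missing or unfilled setting is to parse the .env file and compare it against the keys from the example. The sketch below uses only the standard library and is not how SARA itself loads its configuration, which this README does not describe. The helper names `parse_env` and `missing_keys` are hypothetical, and treating all seven keys as required is an assumption.

```python
from pathlib import Path

# Keys taken from the example .env; treating every one as required is an assumption.
REQUIRED_KEYS = (
    "CohereAPIKey", "Username", "AssistantName", "GroqAPIKey",
    "InputLanguage", "AssistantVoice", "HUGGINGFACE_API_KEY",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around '=' and optional surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return required keys that are absent, empty, or still an API key placeholder."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].endswith("_KEY_Here")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory.")
    else:
        problems = missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
        print("Missing or placeholder keys:", problems or "none")
```

Run it from the project root, next to Main.py, so that the relative `.env` path resolves.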
python Main.py
- If audio input/output is not working, verify Windows microphone permissions and default devices.
- If PyAudio fails to install, upgrade pip and retry: python -m pip install --upgrade pip.
- If the GUI does not appear, confirm PyQt5 is installed correctly.
- If image generation fails, check your Hugging Face token and internet access.
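Two of the checks above come down to whether a package is importable from the active virtual environment. A minimal sketch of that check follows. The helper name `find_missing_modules` is hypothetical, and the import names `PyQt5` and `pyaudio` are the usual ones for those packages rather than something confirmed from SARA's source.

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the packages mentioned in the troubleshooting notes.
    missing = find_missing_modules(["PyQt5", "pyaudio"])
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("PyQt5 and PyAudio are importable.")
```

If a module is reported missing even after installation, the likely cause is that the script ran outside the .venv interpreter selected during setup.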