GitHub - VickyRohilla862/SARA-Voice-Assistant: SARA (Smart Automated Response Assistant) is a Windows-first Python voice assistant with a PyQt5 GUI. It combines speech recognition, interrupt-aware TTS, realtime web search, AI chat, system automation (apps, volume, screenshots, screen recording), content creation, and image generation in a modular frontend/backend architecture built for productivity.

Smart Automated Response Assistant for Windows with voice, GUI, automation, realtime search, and image generation.

Overview

SARA is a Windows-first voice assistant that blends real-time speech recognition, text-to-speech, system automation, and AI responses with a polished PyQt5 interface. It supports conversational chat, live web search, content creation, and on-demand image generation.

Highlights

Wake word detection with interrupt-aware speech handling
Live GUI with mic status and assistant feedback
System automation (apps, volume, screenshots, screen recording)
Realtime web search and conversational responses
Content creation and presentation generation
Image generation via Hugging Face models

Project Structure

Main.py - main entry point and orchestration
Backend/ - speech, model routing, automation, and search
Frontend/ - PyQt5 GUI and assets
Data/ - runtime data and generated files
requirements.txt - Python dependencies

Prerequisites

Windows 10/11
Python 3.10+ (recommended)
Working microphone and speakers
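
The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` and the messages are hypothetical, and it only checks the Python version and operating system (it cannot detect a microphone or speakers).

```python
import platform
import sys

# Minimum version taken from the Prerequisites list (Python 3.10+).
MIN_PYTHON = (3, 10)


def check_prerequisites(version_info=None, system=None):
    """Return a list of problems found; an empty list means nothing was flagged.

    Both arguments are optional overrides so the check can be exercised
    without changing the running interpreter (hypothetical helper).
    """
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} found; 3.10+ is recommended."
        )
    if system != "Windows":
        problems.append(f"Running on {system}; SARA is Windows-first.")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Python version and operating system look fine.")
```

Saving this as a standalone script and running it with the interpreter you plan to use gives a quick sanity check before creating the virtual environment.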

VS Code Setup

Install VS Code and the Python extension (Microsoft).
Open this folder in VS Code.

Create and activate a virtual environment:

python -m venv .venv
.\.venv\Scripts\Activate.ps1

Select the interpreter:
- Ctrl+Shift+P -> Python: Select Interpreter -> .venv

Install dependencies:
```
pip install -r requirements.txt
```

Create a .env file in the project root, as described under Environment Variables.

Environment Variables

Create a .env file next to Main.py.

Example:

CohereAPIKey=Your_Cohere_API_KEY_Here
Username=Your_Name
AssistantName=Sara
GroqAPIKey=Your_Groq_API_KEY_Here
InputLanguage=en
AssistantVoice=en-IN-NeerjaNeural
HUGGINGFACE_API_KEY=Your_Huggingface_API_KEY_Here

Note:

InputLanguage controls speech recognition (for example: en-IN, hi-IN).
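
A missing or placeholder value in .env is an easy mistake to make, so it can help to validate the file before launching. The sketch below is an illustration only and uses just the standard library. The helper names `parse_env` and `missing_keys` are hypothetical, treating every key from the example as required is an assumption, and the project itself may load the file differently (for example with python-dotenv).

```python
from pathlib import Path

# Keys taken from the example above. Treating all of them as required
# is an assumption made for this sketch.
REQUIRED_KEYS = [
    "CohereAPIKey",
    "Username",
    "AssistantName",
    "GroqAPIKey",
    "InputLanguage",
    "AssistantVoice",
    "HUGGINGFACE_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around "=" and optional surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still a 'Your_...' placeholder."""
    return [
        key
        for key in required
        if not values.get(key) or values[key].startswith("Your_")
    ]


if __name__ == "__main__":
    env_path = Path(".env")  # expected next to Main.py
    if not env_path.exists():
        print("No .env file found in the current directory.")
    else:
        problems = missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
        print("Missing or placeholder keys:", problems or "none")
```

The placeholder check is a heuristic based on the `Your_...` values in the example. Adjust it if your real values could start with that prefix.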

Run

python Main.py

Troubleshooting

If audio input/output is not working, verify Windows microphone permissions and default devices.
If PyAudio fails to install, upgrade pip and retry: python -m pip install --upgrade pip.
If the GUI does not appear, confirm PyQt5 is installed correctly.
If image generation fails, check your Hugging Face token and internet access.
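
Several of the checks above come down to whether a package is importable in the active virtual environment. As a rough diagnostic, the sketch below reports which modules cannot be found. The helper name `find_missing` is hypothetical, and the module list covers only the two packages named in this section (`PyQt5` and `pyaudio`, the import name of PyAudio). It does not cover everything in requirements.txt.

```python
import importlib.util

# Import names for the packages mentioned in Troubleshooting.
MODULES_TO_CHECK = ["PyQt5", "pyaudio"]


def find_missing(modules):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(MODULES_TO_CHECK)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("PyQt5 and PyAudio are importable.")
```

Run it with the .venv interpreter selected. A module reported as missing there usually means the dependencies were installed into a different Python environment.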
