🎙️ Voice-to-Slack AI Pipeline

A pipeline that turns spoken messages into Slack DMs: Deepgram transcribes the audio, Google Gemini cleans up the text, and Zapier MCP delivers it, with an optional LiveKit voice agent.

🚀 Features

🎤 Voice Recording: Capture real-time voice input from your microphone
🔄 Speech-to-Text: High-quality transcription using Deepgram API
🤖 AI Processing: Message cleanup and enhancement using Google Gemini
📤 Slack Integration: Direct message delivery via Zapier MCP
🏠 LiveKit Agent: Full voice agent capabilities (optional)

📋 Pipeline Flow

Voice Input → Deepgram STT → Gemini Processing → Zapier MCP → Slack DM
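
The flow above can be read as a chain of functions, each consuming the previous stage's output. The sketch below is illustrative only: `run_pipeline` and the stage callables are hypothetical names, not the functions in voice_pipeline.py, and the stand-in stages in the example make no API calls.

```python
from typing import Callable, List


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    clean_up: Callable[[str], str],
    send_dm: Callable[[str], bool],
) -> bool:
    """Chain the stages: audio -> transcript -> cleaned message -> Slack DM."""
    transcript = transcribe(audio)      # Deepgram STT stage
    if not transcript.strip():
        return False                    # nothing was said, so skip delivery
    message = clean_up(transcript)      # Gemini processing stage
    return send_dm(message)             # Zapier MCP stage


# Stand-in stages for a dry run (hypothetical, no network access).
sent: List[str] = []


def fake_send(message: str) -> bool:
    sent.append(message)
    return True


ok = run_pipeline(
    b"\x00\x00",
    transcribe=lambda audio: "um hello team the build is green",
    clean_up=lambda text: text.replace("um ", "").capitalize() + ".",
    send_dm=fake_send,
)
print(ok, sent)
```

Passing the stages in as callables keeps each service replaceable, which also makes it possible to test Deepgram, Gemini, and Zapier separately.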

🛠️ Quick Setup

Option 1: Automated Setup (Recommended)

# Run the automated setup script
python setup.py

This will:

✅ Check Python version compatibility
✅ Create virtual environment
✅ Install all dependencies
✅ Create .env file from template
✅ Test microphone access

Option 2: Manual Setup

1. Clone and Install

# Clone the repository (if from git)
git clone <your-repo>
cd livekit-mcp

# Or navigate to your existing project directory
cd d:\AI\agent\livekit-mcp

# Create virtual environment
python -m venv venv

# Activate virtual environment
venv\Scripts\activate  # Windows
# source venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -r requirements.txt

2. Environment Configuration

Copy the example environment file:

cp .env.example .env  # Windows: copy .env.example .env

Fill in your API keys in .env:

Required API Keys:

LiveKit: Get from LiveKit Cloud Dashboard
Google Gemini: Get from Google AI Studio
Deepgram: Get from Deepgram Console
Zapier MCP: Set up Slack integration at Zapier MCP
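
Before running the pipeline, it can help to confirm that every key is actually set. The sketch below assumes the .env values have already been loaded into the process environment (for example with python-dotenv). The `LIVEKIT_*` names appear in the troubleshooting section; the other three variable names and the `missing_vars` helper are assumptions, so match them to your .env.example.

```python
import os
from typing import List, Mapping, Sequence

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",      # assumed name, check .env.example
    "DEEPGRAM_API_KEY",    # assumed name, check .env.example
    "ZAPIER_MCP_URL",      # assumed name, check .env.example
)


def missing_vars(env: Mapping[str, str], required: Sequence[str] = REQUIRED_VARS) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing in environment:", ", ".join(missing))
    else:
        print("All required variables are set.")
```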

3. Zapier Slack Setup

Go to Zapier MCP
Connect your Slack workspace
Set up "Send Direct Message" action
Copy the MCP URL to your .env file

🎯 Usage

Voice-to-Slack Pipeline (Recommended)

python voice_pipeline.py

Speak your message when prompted (recording lasts 5 seconds by default)
Review the processed transcript
Copy the message and tell the AI assistant: "Send this to Kiruthivarma: [your message]"
Message delivered! 🎉

LiveKit Agent (Advanced)

python starter_agent.py dev

Run the full LiveKit agent with voice capabilities and Slack integration.

MCP Server (Testing)

python starter_server.py

Test LiveKit room management via MCP protocol.

📁 Project Structure

livekit-mcp/
├── voice_pipeline.py          # 🎯 Main voice-to-Slack pipeline
├── starter_agent.py           # 🤖 LiveKit agent with Slack functions
├── starter_server.py          # 🏠 LiveKit MCP server
├── test_credentials.py        # 🔧 API credentials testing
├── setup.py                   # 🚀 Automated setup script
├── requirements.txt           # 📦 Python dependencies
├── .env.example              # 🔑 Environment template
├── .env                      # 🔑 Your API keys (keep private!)
└── README.md                 # 📖 This file

🎤 Voice Pipeline Details

Recording

Duration: Configurable (default 5 seconds)
Quality: 16kHz, mono, 16-bit
Format: WAV (temporary file, auto-cleaned)
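
To make the recording format concrete, here is a minimal sketch that writes 16 kHz, mono, 16-bit audio to a temporary WAV file using only the standard library, then deletes it. A generated tone stands in for microphone input, and `write_wav` is a hypothetical helper, not the code in voice_pipeline.py.

```python
import math
import os
import struct
import tempfile
import wave
from typing import Sequence

SAMPLE_RATE = 16000   # 16 kHz
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # 16-bit samples are 2 bytes each


def write_wav(path: str, samples: Sequence[int]) -> None:
    """Write signed 16-bit integer samples to a mono 16 kHz WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


# One second of a 440 Hz tone stands in for microphone input.
tone = [int(12000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)) for n in range(SAMPLE_RATE)]

fd, tmp_path = tempfile.mkstemp(suffix=".wav")
os.close(fd)
try:
    write_wav(tmp_path, tone)
    with wave.open(tmp_path, "rb") as wav:
        info = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth(), wav.getnframes())
    print(info)
finally:
    os.remove(tmp_path)   # mirrors the auto-cleaned temporary file
```

At these settings a 5-second recording holds 5 × 16000 × 2 = 160,000 bytes of audio data plus a small header.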

Speech-to-Text (Deepgram)

Model: Nova-2 (high accuracy)
Language: English
Real-time: Yes

AI Processing (Gemini)

Model: gemini-2.0-flash-exp
Purpose: Clean up transcription, fix grammar
Output: Natural, readable message

Slack Delivery (Zapier MCP)

Method: Direct Message
Target: Configurable user (e.g., Kiruthivarma)
Format: Text with microphone emoji 🎙️
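
As a rough illustration of the message format, the sketch below prefixes the cleaned text with the microphone emoji. The exact layout the pipeline uses is not documented here, so `format_slack_message` and its whitespace handling are assumptions.

```python
MIC_PREFIX = "🎙️ "


def format_slack_message(cleaned_text: str) -> str:
    """Collapse stray whitespace and add the microphone emoji prefix."""
    body = " ".join(cleaned_text.split())
    if not body:
        raise ValueError("refusing to send an empty message")
    return MIC_PREFIX + body


print(format_slack_message("  Standup moved to\n10:30 tomorrow.  "))
```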

🔧 Environment Setup Guide

Windows Setup

# Create and activate virtual environment
python -m venv venv
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Copy environment template
copy .env.example .env

# Edit .env with your API keys
notepad .env

Linux/Mac Setup

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Copy environment template
cp .env.example .env

# Edit .env with your API keys
nano .env

Voice Recording

# In voice_pipeline.py
duration = 5  # Recording duration in seconds
sample_rate = 16000  # Audio quality

Gemini Processing

# Customize the processing prompt
prompt = "Clean up this transcription and format as a clear message..."
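
One way to keep the prompt easy to customize is to store it as a template and insert the transcript at call time. In this sketch the wording after the first sentence is made up for illustration, and `build_prompt` is a hypothetical helper rather than a function from voice_pipeline.py.

```python
PROMPT_TEMPLATE = (
    "Clean up this transcription and format as a clear message. "
    "Fix grammar, remove filler words, and keep the original meaning.\n\n"
    "Transcription:\n{transcript}"
)


def build_prompt(transcript: str, template: str = PROMPT_TEMPLATE) -> str:
    """Insert the raw transcript into the cleanup prompt template."""
    return template.format(transcript=transcript.strip())


print(build_prompt(" so um the deploy is done i think "))
```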

Slack Recipients

# Change target user in the MCP call
channel="username_here"

🚨 Troubleshooting

Voice Recording Issues

No microphone detected: Check sounddevice installation
Permission denied: Grant microphone access to terminal/IDE
Poor quality: Adjust sample_rate or duration

API Issues

401 Unauthorized: Check API keys in .env
Quota exceeded: Verify API usage limits
Network errors: Check internet connection

Slack Integration

Messages not appearing:
- Check if user exists in workspace
- Verify Zapier app permissions
- Look for messages from "Zapier" bot
Wrong recipient: Update username in MCP call

LiveKit Agent Issues

401 on LiveKit: Verify LIVEKIT_API_KEY and LIVEKIT_API_SECRET
Connection refused: Check LIVEKIT_URL format
Module not found: Install missing dependencies
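
A quick format check can rule out the most common `LIVEKIT_URL` mistakes. This sketch assumes the agent expects a WebSocket URL (LiveKit Cloud project URLs typically start with `wss://`); `livekit_url_problem` is a hypothetical helper and the hostnames are placeholders.

```python
from typing import Optional
from urllib.parse import urlparse


def livekit_url_problem(url: str) -> Optional[str]:
    """Return a description of what looks wrong with the URL, or None if it looks fine."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("ws", "wss"):
        return "expected a ws:// or wss:// scheme, got %r" % parsed.scheme
    if not parsed.hostname:
        return "no hostname found"
    return None


for candidate in (
    "wss://example.livekit.cloud",
    "https://example.livekit.cloud",
    "example.livekit.cloud",
):
    print(candidate, "->", livekit_url_problem(candidate))
```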

📊 API Usage & Costs

Service	Usage	Typical Cost
Deepgram STT	~5 sec audio	~$0.0004
Google Gemini	Text processing	~$0.0001
LiveKit	Agent hosting	Variable
Zapier	Message sending	Free tier available
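
The per-message figures above add up as follows. This is plain arithmetic on the table's approximate values, not a quote from any provider's price list; LiveKit and Zapier are left out because their costs are variable or free-tier.

```python
# Approximate per-message costs in USD, copied from the table above.
DEEPGRAM_PER_MESSAGE = 0.0004   # ~5 seconds of audio
GEMINI_PER_MESSAGE = 0.0001     # text cleanup


def estimated_cost(messages: int) -> float:
    """Estimate Deepgram + Gemini spend for a number of ~5 second messages."""
    per_message = DEEPGRAM_PER_MESSAGE + GEMINI_PER_MESSAGE
    return round(messages * per_message, 4)


print(estimated_cost(1000))   # 0.5
```

So 1,000 short voice messages come to roughly $0.50 in transcription and text processing.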

🔐 Security Notes

Never commit .env to version control
Rotate API keys regularly
Use environment-specific configurations
Limit API key permissions where possible

🚀 Advanced Features

Custom Voice Commands

Extend voice_pipeline.py to recognize specific commands:

priority = ""  # no prefix unless a command word is spoken
if "urgent" in transcript.lower():
    priority = "🚨 URGENT: "
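
A plain substring check also fires on words such as "urgently". The sketch below matches whole words instead and keeps the commands in one table. `COMMAND_PREFIXES` and `apply_commands` are hypothetical names, and only the "urgent" command comes from the snippet above.

```python
import re

# Hypothetical command table: spoken keyword -> prefix added to the message.
COMMAND_PREFIXES = {
    "urgent": "🚨 URGENT: ",
}


def apply_commands(transcript: str) -> str:
    """Prefix the transcript for every command keyword spoken as a whole word."""
    prefix = ""
    for keyword, marker in COMMAND_PREFIXES.items():
        if re.search(r"\b%s\b" % re.escape(keyword), transcript, re.IGNORECASE):
            prefix += marker
    return prefix + transcript


print(apply_commands("This is urgent, the server is down"))
print(apply_commands("He replied urgently"))
```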

Multiple Recipients

Add support for different Slack users:

recipients = {
    "kiruthivarma": "Kiruthivarma",
    "team": ["user1", "user2", "user3"]
}
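
Because an entry in the mapping above can be either a single name or a list, the sending code needs to normalize it first. A minimal sketch, with `resolve_recipients` as a hypothetical helper:

```python
from typing import Dict, List, Union

# Same shape as the mapping above: a key maps to one user or a list of users.
RECIPIENTS: Dict[str, Union[str, List[str]]] = {
    "kiruthivarma": "Kiruthivarma",
    "team": ["user1", "user2", "user3"],
}


def resolve_recipients(
    key: str, table: Dict[str, Union[str, List[str]]] = RECIPIENTS
) -> List[str]:
    """Return the Slack users for a key as a list, whatever shape the entry has."""
    entry = table.get(key.strip().lower())
    if entry is None:
        raise KeyError("unknown recipient: %r" % key)
    return [entry] if isinstance(entry, str) else list(entry)


print(resolve_recipients("Team"))
print(resolve_recipients("kiruthivarma"))
```

The caller can then loop over the returned list and send one DM per user.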

Voice Assistant Integration

Integrate with LiveKit agent for two-way conversations:

await context.session.say("Message sent to Kiruthivarma!")

🤝 Contributing

Fork the repository
Create a feature branch
Add your improvements
Test thoroughly
Submit a pull request

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

LiveKit for real-time communication infrastructure
Deepgram for high-quality speech recognition
Google Gemini for AI text processing
Zapier for seamless integrations

🎊 Success Stories

"Just built a voice-to-Slack pipeline that lets me send messages by just speaking! The AI cleans up my speech and delivers it perfectly. FUCKING LEGENDARY! 🚀🔥💪" - Happy User

Ready to revolutionize your communication workflow? Start speaking to Slack today! 🎙️✨

🔧 Final Setup Checklist

Python 3.8+ installed
Virtual environment created and activated
Dependencies installed from requirements.txt
.env file created and configured with all API keys
Microphone permissions granted
Zapier MCP Slack integration configured
Test run completed successfully

🆘 Support

If you encounter issues:

Check the troubleshooting section above
Verify all API keys are correctly set
Ensure microphone permissions are granted
Test individual components (Deepgram, Gemini, Zapier) separately

🎯 Quick Test Commands

# Automated setup (first time only)
python setup.py

# Test API credentials
python test_credentials.py

# Test voice recording only
python -c "import sounddevice as sd; print('Recording...'); audio = sd.rec(int(5 * 16000), samplerate=16000, channels=1); sd.wait(); print('Recording complete!')"

# Run main pipeline
python voice_pipeline.py
