Blink Speech

Turning blinks and gaze into voice – communication without boundaries.

Blink Speech is a revolutionary browser-based assistive communication application that transforms eye blink patterns and gaze gestures into spoken phrases using advanced computer vision and speech synthesis. Built with modern web technologies, it operates entirely client-side to ensure maximum privacy and accessibility.

📊 Project Stats

✨ Key Features

🎯 Real-time Gesture Recognition - Advanced computer vision detects blinks and gaze directions
🗣️ Natural Speech Synthesis - High-quality text-to-speech using Web Speech API
🎨 Fully Customizable - Create your own gesture-to-phrase mappings
🔒 Privacy-First - All processing happens locally, no video data transmitted
⚡ Zero Installation - Runs in any modern web browser with HTTPS
♿ Accessibility-Focused - Designed for users with motor impairments and speech limitations
🌍 Multi-language Support - Works with any language or custom phrases
📱 Cross-Platform - Compatible with desktop, tablet, and mobile devices

🛠️ Technology Stack

🌍 Vision & Impact

Everyone Deserves a Voice

Blink Speech was born from a simple yet powerful belief: communication is a fundamental human right. By transforming natural eye movements into spoken words, we're breaking down barriers that prevent people from expressing their thoughts, needs, and emotions.

Who We Help

🏥 Healthcare Patients

ICU patients who cannot speak due to intubation
Post-surgery recovery when vocal communication is difficult
Individuals with locked-in syndrome or severe paralysis
Emergency communication when traditional methods fail

♿ People with Disabilities

ALS (Lou Gehrig's disease) patients as speech deteriorates
Individuals with muscular dystrophy or cerebral palsy
Stroke survivors during speech therapy recovery
Anyone with motor impairments affecting traditional communication

⏰ Temporary Conditions

Recovery from oral or throat surgery
Severe laryngitis or vocal cord issues
Medication side effects affecting speech
Fatigue-related communication difficulties

🌍 Global Accessibility

Works in any language or cultural context
No specialized hardware or expensive equipment required
Runs on existing devices (computers, tablets, phones)
Free and open-source for maximum accessibility

🚀 Quick Start

For Users

📱 Try the Demo Video: YouTube Demo
📖 Read the User Guide: Complete usage instructions
🎯 Complete Calibration: Follow the 5-point setup for optimal accuracy
🗣️ Start Communicating: Use blinks and gaze to speak!

For Developers

📥 Clone the Repository

git clone https://github.com/akshad-exe/Blink-Speech.git
cd Blink-Speech

⚙️ Install Dependencies

# Frontend
cd frontend && npm install

# Backend
cd ../backend && npm install

🔧 Configure Environment
- Set up Supabase project
- Copy .env.example files and configure variables
- See Installation Guide for details

🎬 Run Development Servers

# Terminal 1 - Frontend (https://localhost:5173)
cd frontend && npm run dev

# Terminal 2 - Backend (http://localhost:3001)
cd backend && npm run dev

📚 Complete Documentation

📖 User Documentation

📋 User Guide - How to use Blink Speech effectively
🔧 Troubleshooting - Solve common issues
⚙️ Configuration - Customize settings and preferences

🏗️ Technical Documentation

🛠️ Installation Guide - Development and production setup
🏛️ Architecture Overview - System design and data flow
💻 Development Guide - Contributing and best practices
🧩 Frontend Components - React components and hooks
🔗 API Documentation - Backend endpoints and database

🔬 Core Technologies

👁️ Gesture Detection - Computer vision implementation
🌐 Frontend Architecture - React + Vite structure
🚀 Deployment Guide - Production deployment

📋 Complete Documentation Hub - Start here for all documentation

👥 Team

Role	Name	GitHub
🧠 Project Lead	Md Athar Jamal Makki	@atharhive
🎨 Frontend Lead	Akshad Jogi	@akshad-exe
🛠️ Backend Lead	Ayush Sarkar	@dev-Ninjaa

🎯 How It Works

1. 👁️ Gesture Recognition

Advanced computer vision powered by MediaPipe and WebGazer.js detects:

Blink Patterns: Single, double, triple, and long blinks
Gaze Directions: Left, right, up, down, and center positioning
Combined Gestures: Blinks + gaze for complex communication (20+ combinations)

2. 🎯 Real-time Processing

<150ms Detection Latency: Near-instantaneous gesture recognition
Eye Aspect Ratio (EAR): Scientific method for accurate blink detection
Adaptive Thresholds: Automatic calibration for optimal performance
15-30 FPS Processing: Smooth real-time operation

3. 🗣️ Speech Synthesis

Web Speech API: High-quality, natural-sounding voices
Multi-language Support: Works with any language
Customizable Voice: Adjust rate, pitch, and volume
<1s Speech Latency: From gesture to spoken word

4. 🔒 Privacy & Security

100% Local Processing: No video data ever leaves your device
HTTPS Encryption: Secure communication protocols
Anonymous Usage: No personal information required
Local Storage: Settings saved securely on your device

🏥 Medical & Healthcare Applications

Critical Care Benefits

🚨 Emergency Communication: Instant access to critical phrases ("Help", "Pain", "Emergency")
📊 Patient Monitoring: Non-verbal feedback for medical assessment
🔄 Telemedicine Integration: Remote patient communication capabilities
⚡ Rapid Response: Immediate notification systems for urgent needs

Rehabilitation Support

🧠 Stroke Recovery: Bridge communication during speech therapy
💪 Motor Skill Development: Eye-tracking exercises aid neurological recovery
📈 Progress Tracking: Monitor improvement in motor control and communication
🎯 Adaptive Learning: System learns and adapts to individual capabilities

Long-term Care

🏠 Home Healthcare: Enables independent communication with caregivers
📱 Family Connection: Stay connected with loved ones remotely
🔔 Alert Systems: Customizable emergency and routine notifications
📝 Care Documentation: Optional logging for healthcare providers

📊 Performance & Compatibility

System Specifications

Detection Accuracy: >95% in optimal conditions
Latency: <150ms gesture recognition, <1s speech output
Frame Rate: Adaptive 15-30 FPS based on device capabilities
Memory Usage: <100MB typical operation
Storage: ~50MB for complete application cache

Browser Support

Browser	Version	MediaPipe	WebGazer	Speech API	Status
Chrome	80+	✅	✅	✅	✅ Optimal
Firefox	75+	✅	✅	✅	✅ Excellent
Safari	13+	✅	⚠️	✅	✅ Good
Edge	80+	✅	✅	✅	✅ Excellent

Device Compatibility

🖥️ Desktop: Windows, macOS, Linux - Full feature support
📱 Tablet: iPad, Android tablets - Optimized touch interface
📲 Mobile: Smartphone support with adaptive UI
🎥 Cameras: Built-in webcams, USB cameras, HD recommended

🚀 Roadmap & Future Features

🔮 Version 2.0 (In Development)

🧠 AI-Powered Phrase Prediction: Context-aware phrase suggestions
🌍 Enhanced Multi-language: 50+ languages with native voices
📊 Analytics Dashboard: Usage patterns and communication insights
🔗 Healthcare Integrations: Direct API connections to medical systems

🌟 Future Innovations

👓 AR/VR Integration: Wearable device support (AR glasses, smart contact lenses)
🤖 Machine Learning: Personalized gesture recognition improvement
🏥 Medical Partnerships: Integration with hospital communication systems
🌐 Offline PWA: Complete offline functionality as Progressive Web App
🎮 Gamification: Interactive learning and practice modes

🤝 Community Features

👥 Gesture Sharing: Community-driven phrase mappings
📚 Learning Resources: Tutorials and best practices
🔧 Plugin System: Extensible architecture for custom integrations
📱 Mobile Apps: Native iOS/Android applications

🌟 Key Features

🎯 Core Capabilities

✅ Zero Installation - Works instantly in any modern browser
✅ Complete Privacy - 100% client-side processing, no data transmission
✅ Real-time Recognition - <150ms gesture detection latency
✅ Custom Mappings - Create your own gesture-to-phrase combinations
✅ Multi-language - Support for any language or custom phrases
✅ Offline Ready - Core features work without internet connection

♿ Accessibility Features

✅ High Contrast Mode - Enhanced visibility for users with visual impairments
✅ Large Text Options - Scalable interface for better readability
✅ Screen Reader Support - Full compatibility with assistive technologies
✅ Keyboard Navigation - Complete keyboard accessibility
✅ Voice Customization - Adjustable speech rate, pitch, and volume
✅ Emergency Mode - Quick access to critical communication phrases

🔧 Advanced Features

✅ Adaptive Performance - Automatic optimization based on device capabilities
✅ Calibration System - Personalized setup for optimal accuracy
✅ Data Export/Import - Share settings between devices and users
✅ Cloud Sync - Optional backup and synchronization (Supabase)
✅ SMS Integration - Send messages via Twilio API
✅ Real-time Logging - Optional activity tracking for healthcare providers

🤝 Contributing

We welcome contributions from developers, researchers, and accessibility advocates! Here's how you can help:

🛠️ Development

🐛 Report Bugs: Create an issue with detailed reproduction steps
💡 Suggest Features: Share ideas for improving accessibility and usability
🔧 Submit Code: Fork, develop, and create pull requests
📝 Documentation: Help improve guides, tutorials, and API docs

🧪 Testing & Feedback

🏥 Healthcare Professionals: Provide clinical insights and use case feedback
♿ Accessibility Users: Share experiences and improvement suggestions
🌍 Localization: Help translate and adapt for different languages/cultures
📊 Research: Academic collaboration on computer vision and accessibility

📋 Contribution Guidelines

Read our Development Guide
Follow our Code of Conduct
Check existing issues and discussions before creating new ones
Write clear commit messages and documentation
Test thoroughly and include relevant test cases

📄 License

Blink Speech is open-source software licensed under the MIT License. This means you can:

✅ Use - For personal, commercial, or research purposes
✅ Modify - Adapt the code to your specific needs
✅ Distribute - Share with others or deploy your own version
✅ Contribute - Help improve the project for everyone

🆘 Support & Community

📞 Get Help

📖 Documentation: Complete guides and tutorials
🔧 Troubleshooting: Common issues and solutions
💬 Discussions: GitHub Discussions for questions and ideas
🐛 Bug Reports: Issue Tracker for technical problems

🌐 Connect

🐙 GitHub: @akshad-exe/Blink-Speech
📧 Contact: For accessibility partnerships and healthcare integrations
🤝 Collaborate: Open to academic research partnerships

🚨 Emergency Support

For urgent accessibility needs or critical bugs affecting communication:

Create a high-priority GitHub issue
Include detailed system information and reproduction steps
Tag the issue with "urgent" or "accessibility-critical"

🙏 Acknowledgments

Research & Inspiration:

MediaPipe team at Google for facial landmark detection
WebGazer.js contributors for browser-based eye tracking
Accessibility research community for guidance and feedback
Healthcare professionals providing real-world insights

Open Source Technologies:

React and Vite communities for modern web development tools
TensorFlow.js for browser-based machine learning
Supabase for backend infrastructure
Tailwind CSS and Radix UI for accessible design systems

Special Thanks:

Beta testers who provided crucial feedback
Accessibility advocates who guided our design decisions
Healthcare institutions that shared use case requirements
Open source contributors who helped improve the codebase

🌟 If Blink Speech has helped you or someone you know, please consider starring the repository to help others discover this tool! 🌟

⭐ Star on GitHub ⭐

"Communication is a human right. Technology should make it accessible to everyone."

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
backend		backend
docs		docs
frontend		frontend
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
docs.md		docs.md
tech-spech.md		tech-spech.md
test-camera.html		test-camera.html

Uh oh!

License

Uh oh!

Recursion-Labs/Blink-Speech

Folders and files

Latest commit

History

Repository files navigation

Blink Speech

📊 Project Stats

✨ Key Features

🛠️ Technology Stack

🌍 Vision & Impact

Everyone Deserves a Voice

Who We Help

🚀 Quick Start

For Users

For Developers

📚 Complete Documentation

📖 User Documentation

🏗️ Technical Documentation

🔬 Core Technologies

👥 Team

🎯 How It Works

1. 👁️ Gesture Recognition

2. 🎯 Real-time Processing

3. 🗣️ Speech Synthesis

4. 🔒 Privacy & Security

🏥 Medical & Healthcare Applications

Critical Care Benefits

Rehabilitation Support

Long-term Care

📊 Performance & Compatibility

System Specifications

Browser Support

Device Compatibility

🚀 Roadmap & Future Features

🔮 Version 2.0 (In Development)

🌟 Future Innovations

🤝 Community Features

🌟 Key Features

🎯 Core Capabilities

♿ Accessibility Features

🔧 Advanced Features

🤝 Contributing

🛠️ Development

🧪 Testing & Feedback

📋 Contribution Guidelines

📄 License

🆘 Support & Community

📞 Get Help

🌐 Connect

🚨 Emergency Support

🙏 Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages