Turning blinks and gaze into voice โ communication without boundaries.
Blink Speech is a revolutionary browser-based assistive communication application that transforms eye blink patterns and gaze gestures into spoken phrases using advanced computer vision and speech synthesis. Built with modern web technologies, it operates entirely client-side to ensure maximum privacy and accessibility.
๐ฏ Real-time Gesture Recognition - Advanced computer vision detects blinks and gaze directions
๐ฃ๏ธ Natural Speech Synthesis - High-quality text-to-speech using Web Speech API
๐จ Fully Customizable - Create your own gesture-to-phrase mappings
๐ Privacy-First - All processing happens locally, no video data transmitted
โก Zero Installation - Runs in any modern web browser with HTTPS
โฟ Accessibility-Focused - Designed for users with motor impairments and speech limitations
๐ Multi-language Support - Works with any language or custom phrases
๐ฑ Cross-Platform - Compatible with desktop, tablet, and mobile devices
Blink Speech was born from a simple yet powerful belief: communication is a fundamental human right. By transforming natural eye movements into spoken words, we're breaking down barriers that prevent people from expressing their thoughts, needs, and emotions.
๐ฅ Healthcare Patients
- ICU patients who cannot speak due to intubation
- Post-surgery recovery when vocal communication is difficult
- Individuals with locked-in syndrome or severe paralysis
- Emergency communication when traditional methods fail
โฟ People with Disabilities
- ALS (Lou Gehrig's disease) patients as speech deteriorates
- Individuals with muscular dystrophy or cerebral palsy
- Stroke survivors during speech therapy recovery
- Anyone with motor impairments affecting traditional communication
โฐ Temporary Conditions
- Recovery from oral or throat surgery
- Severe laryngitis or vocal cord issues
- Medication side effects affecting speech
- Fatigue-related communication difficulties
๐ Global Accessibility
- Works in any language or cultural context
- No specialized hardware or expensive equipment required
- Runs on existing devices (computers, tablets, phones)
- Free and open-source for maximum accessibility
- ๐ฑ Try the Demo Video: YouTube Demo
- ๐ Read the User Guide: Complete usage instructions
- ๐ฏ Complete Calibration: Follow the 5-point setup for optimal accuracy
- ๐ฃ๏ธ Start Communicating: Use blinks and gaze to speak!
-
๐ฅ Clone the Repository
git clone https://github.com/akshad-exe/Blink-Speech.git cd Blink-Speech -
โ๏ธ Install Dependencies
# Frontend cd frontend && npm install # Backend cd ../backend && npm install
-
๐ง Configure Environment
- Set up Supabase project
- Copy
.env.examplefiles and configure variables - See Installation Guide for details
-
๐ฌ Run Development Servers
# Terminal 1 - Frontend (https://localhost:5173) cd frontend && npm run dev # Terminal 2 - Backend (http://localhost:3001) cd backend && npm run dev
- ๐ User Guide - How to use Blink Speech effectively
- ๐ง Troubleshooting - Solve common issues
- โ๏ธ Configuration - Customize settings and preferences
- ๐ ๏ธ Installation Guide - Development and production setup
- ๐๏ธ Architecture Overview - System design and data flow
- ๐ป Development Guide - Contributing and best practices
- ๐งฉ Frontend Components - React components and hooks
- ๐ API Documentation - Backend endpoints and database
- ๐๏ธ Gesture Detection - Computer vision implementation
- ๐ Frontend Architecture - React + Vite structure
- ๐ Deployment Guide - Production deployment
๐ Complete Documentation Hub - Start here for all documentation
| Role | Name | GitHub |
|---|---|---|
| ๐ง Project Lead | Md Athar Jamal Makki | @atharhive |
| ๐จ Frontend Lead | Akshad Jogi | @akshad-exe |
| ๐ ๏ธ Backend Lead | Ayush Sarkar | @dev-Ninjaa |
Advanced computer vision powered by MediaPipe and WebGazer.js detects:
- Blink Patterns: Single, double, triple, and long blinks
- Gaze Directions: Left, right, up, down, and center positioning
- Combined Gestures: Blinks + gaze for complex communication (20+ combinations)
- <150ms Detection Latency: Near-instantaneous gesture recognition
- Eye Aspect Ratio (EAR): Scientific method for accurate blink detection
- Adaptive Thresholds: Automatic calibration for optimal performance
- 15-30 FPS Processing: Smooth real-time operation
- Web Speech API: High-quality, natural-sounding voices
- Multi-language Support: Works with any language
- Customizable Voice: Adjust rate, pitch, and volume
- <1s Speech Latency: From gesture to spoken word
- 100% Local Processing: No video data ever leaves your device
- HTTPS Encryption: Secure communication protocols
- Anonymous Usage: No personal information required
- Local Storage: Settings saved securely on your device
๐จ Emergency Communication: Instant access to critical phrases ("Help", "Pain", "Emergency")
๐ Patient Monitoring: Non-verbal feedback for medical assessment
๐ Telemedicine Integration: Remote patient communication capabilities
โก Rapid Response: Immediate notification systems for urgent needs
๐ง Stroke Recovery: Bridge communication during speech therapy
๐ช Motor Skill Development: Eye-tracking exercises aid neurological recovery
๐ Progress Tracking: Monitor improvement in motor control and communication
๐ฏ Adaptive Learning: System learns and adapts to individual capabilities
๐ Home Healthcare: Enables independent communication with caregivers
๐ฑ Family Connection: Stay connected with loved ones remotely
๐ Alert Systems: Customizable emergency and routine notifications
๐ Care Documentation: Optional logging for healthcare providers
- Detection Accuracy: >95% in optimal conditions
- Latency: <150ms gesture recognition, <1s speech output
- Frame Rate: Adaptive 15-30 FPS based on device capabilities
- Memory Usage: <100MB typical operation
- Storage: ~50MB for complete application cache
| Browser | Version | MediaPipe | WebGazer | Speech API | Status |
|---|---|---|---|---|---|
| Chrome | 80+ | โ | โ | โ | โ Optimal |
| Firefox | 75+ | โ | โ | โ | โ Excellent |
| Safari | 13+ | โ | โ | โ Good | |
| Edge | 80+ | โ | โ | โ | โ Excellent |
๐ฅ๏ธ Desktop: Windows, macOS, Linux - Full feature support
๐ฑ Tablet: iPad, Android tablets - Optimized touch interface
๐ฒ Mobile: Smartphone support with adaptive UI
๐ฅ Cameras: Built-in webcams, USB cameras, HD recommended
- ๐ง AI-Powered Phrase Prediction: Context-aware phrase suggestions
- ๐ Enhanced Multi-language: 50+ languages with native voices
- ๐ Analytics Dashboard: Usage patterns and communication insights
- ๐ Healthcare Integrations: Direct API connections to medical systems
- ๐ AR/VR Integration: Wearable device support (AR glasses, smart contact lenses)
- ๐ค Machine Learning: Personalized gesture recognition improvement
- ๐ฅ Medical Partnerships: Integration with hospital communication systems
- ๐ Offline PWA: Complete offline functionality as Progressive Web App
- ๐ฎ Gamification: Interactive learning and practice modes
- ๐ฅ Gesture Sharing: Community-driven phrase mappings
- ๐ Learning Resources: Tutorials and best practices
- ๐ง Plugin System: Extensible architecture for custom integrations
- ๐ฑ Mobile Apps: Native iOS/Android applications
โ
Zero Installation - Works instantly in any modern browser
โ
Complete Privacy - 100% client-side processing, no data transmission
โ
Real-time Recognition - <150ms gesture detection latency
โ
Custom Mappings - Create your own gesture-to-phrase combinations
โ
Multi-language - Support for any language or custom phrases
โ
Offline Ready - Core features work without internet connection
โ
High Contrast Mode - Enhanced visibility for users with visual impairments
โ
Large Text Options - Scalable interface for better readability
โ
Screen Reader Support - Full compatibility with assistive technologies
โ
Keyboard Navigation - Complete keyboard accessibility
โ
Voice Customization - Adjustable speech rate, pitch, and volume
โ
Emergency Mode - Quick access to critical communication phrases
โ
Adaptive Performance - Automatic optimization based on device capabilities
โ
Calibration System - Personalized setup for optimal accuracy
โ
Data Export/Import - Share settings between devices and users
โ
Cloud Sync - Optional backup and synchronization (Supabase)
โ
SMS Integration - Send messages via Twilio API
โ
Real-time Logging - Optional activity tracking for healthcare providers
We welcome contributions from developers, researchers, and accessibility advocates! Here's how you can help:
- ๐ Report Bugs: Create an issue with detailed reproduction steps
- ๐ก Suggest Features: Share ideas for improving accessibility and usability
- ๐ง Submit Code: Fork, develop, and create pull requests
- ๐ Documentation: Help improve guides, tutorials, and API docs
- ๐ฅ Healthcare Professionals: Provide clinical insights and use case feedback
- โฟ Accessibility Users: Share experiences and improvement suggestions
- ๐ Localization: Help translate and adapt for different languages/cultures
- ๐ Research: Academic collaboration on computer vision and accessibility
- Read our Development Guide
- Follow our Code of Conduct
- Check existing issues and discussions before creating new ones
- Write clear commit messages and documentation
- Test thoroughly and include relevant test cases
Blink Speech is open-source software licensed under the MIT License. This means you can:
โ
Use - For personal, commercial, or research purposes
โ
Modify - Adapt the code to your specific needs
โ
Distribute - Share with others or deploy your own version
โ
Contribute - Help improve the project for everyone
- ๐ Documentation: Complete guides and tutorials
- ๐ง Troubleshooting: Common issues and solutions
- ๐ฌ Discussions: GitHub Discussions for questions and ideas
- ๐ Bug Reports: Issue Tracker for technical problems
- ๐ GitHub: @akshad-exe/Blink-Speech
- ๐ง Contact: For accessibility partnerships and healthcare integrations
- ๐ค Collaborate: Open to academic research partnerships
For urgent accessibility needs or critical bugs affecting communication:
- Create a high-priority GitHub issue
- Include detailed system information and reproduction steps
- Tag the issue with "urgent" or "accessibility-critical"
Research & Inspiration:
- MediaPipe team at Google for facial landmark detection
- WebGazer.js contributors for browser-based eye tracking
- Accessibility research community for guidance and feedback
- Healthcare professionals providing real-world insights
Open Source Technologies:
- React and Vite communities for modern web development tools
- TensorFlow.js for browser-based machine learning
- Supabase for backend infrastructure
- Tailwind CSS and Radix UI for accessible design systems
Special Thanks:
- Beta testers who provided crucial feedback
- Accessibility advocates who guided our design decisions
- Healthcare institutions that shared use case requirements
- Open source contributors who helped improve the codebase
๐ If Blink Speech has helped you or someone you know, please consider starring the repository to help others discover this tool! ๐
"Communication is a human right. Technology should make it accessible to everyone."