A Telegram bot that transcribes, translates, and summarizes voice messages using the Groq Whisper API.
Note on Serverless History: The last commit of the bot being serverless is 1ccb016ccabeb320b0c7637d3c15fc9bdedb2a48.
- Send a voice, audio, or video note to the bot
- The bot transcribes, translates, or summarizes it using the Groq Whisper API
- Results are sent back instantly and cached for future use
The bot can be added to groups to automatically transcribe voice messages. You can manually transcribe videos and audio files by replying to them with the bot commands. In DMs, the bot will automatically transcribe most media sent to it.
- /start: Initializes the bot and provides a welcome message
- /help: Provides information on how to use the bot and its features
- /transcribe: Transcribes the voice, audio, or video note in the reply message
- /translate (aliases: english, en): Translates (into English) the voice, audio, or video note in the reply message
- /summarize: Summarizes the voice, audio, or video note in the reply message
- /caveman: Summarizes the voice, audio, or video note in a "caveman" style
- /privacy: Shows the privacy policy
- /limits (aliases: /ratelimit, /ratelimits): Shows current rate limit information (5 messages per minute, 30 messages per hour)
- /donate: Shows cryptocurrency donation addresses to support the project
- Language: Rust with async/await
- Telegram API: Built using the teloxide crate
- Transcription API: Uses reqwest with rustls to communicate with GroqCloud's Whisper API
- Database: SQLite with sqlx for fast, type-safe queries
- TLS: Uses rustls everywhere (no OpenSSL dependencies)
- Logging: Dual output to stdout and a bot.log file using fern
- Model: whisper-large-v3-turbo for transcription, whisper-large-v3 for translation, and moonshotai/kimi-k2-instruct-0905 for summarization
- Caching:
  - Transcriptions and translations cached for 7 days
  - Summaries (default & caveman) cached for 1 day
  - File-based cache uses SQLite with automatic expiration cleanup
- Rate Limiting:
  - Per-user tracking: 5 messages per minute, 30 messages per hour
  - Reacts with 🙊 emoji when per-user limit is exceeded
  - Reacts with 😴 emoji when GroqCloud rate limits are reached
  - Applies to all audio operations (transcribe, translate, summarize, caveman)
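The per-user limits above (5 messages per minute, 30 messages per hour) could be enforced with a sliding-window counter. The following is only a minimal sketch under assumptions: the RateLimiter type and its try_acquire method are hypothetical names, not taken from the bot's source, and the sketch keeps state in memory, whereas the bot may track usage differently.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

// Limits as documented above.
const PER_MINUTE: usize = 5;
const PER_HOUR: usize = 30;
const MINUTE: Duration = Duration::from_secs(60);
const HOUR: Duration = Duration::from_secs(3600);

/// Hypothetical per-user sliding-window limiter (illustrative only).
#[derive(Default)]
pub struct RateLimiter {
    // Timestamps of accepted requests within the last hour, oldest first.
    history: HashMap<u64, VecDeque<Instant>>,
}

impl RateLimiter {
    /// Records the request and returns true if `user_id` is under both
    /// limits at `now`; returns false (recording nothing) otherwise.
    pub fn try_acquire(&mut self, user_id: u64, now: Instant) -> bool {
        let stamps = self.history.entry(user_id).or_default();

        // Forget requests that are at least one hour old.
        while let Some(&oldest) = stamps.front() {
            if now.duration_since(oldest) >= HOUR {
                stamps.pop_front();
            } else {
                break;
            }
        }

        // Count accepted requests in the last minute.
        let last_minute = stamps
            .iter()
            .filter(|&&t| now.duration_since(t) < MINUTE)
            .count();

        if stamps.len() >= PER_HOUR || last_minute >= PER_MINUTE {
            return false; // a caller could react with 🙊 here
        }
        stamps.push_back(now);
        true
    }
}
```

Passing the current time in as a parameter, rather than calling Instant::now() inside the method, keeps the limiter easy to exercise with synthetic timestamps.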
This bot uses GroqCloud with Global ZDR (Zero Data Retention) enabled. No data is stored on GroqCloud servers. Audio files are processed instantly and discarded immediately; nothing is retained on their infrastructure.
- TELEGRAM_BOT_TOKEN: The token for your Telegram bot
- GROQ_API_KEY: Your GroqCloud API key(s). Supports multiple keys separated by commas for automatic failover (e.g., key1,key2,key3)
- DATABASE_URL: SQLite database path (default: sqlite:duck_transcriber.db)
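As a rough illustration of the comma-separated key format, the sketch below splits a GROQ_API_KEY value and tries each key in order until one succeeds. The names parse_api_keys and with_failover are hypothetical and not taken from the bot's source. Trimming whitespace around keys is an assumption, and the code is synchronous for brevity, whereas the bot itself is async.

```rust
/// Splits a comma-separated key list into individual keys.
/// Assumption: surrounding whitespace is trimmed and empty entries are skipped.
pub fn parse_api_keys(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|key| !key.is_empty())
        .map(String::from)
        .collect()
}

/// Calls `call` with each key in order and returns the first success.
/// If every key fails, the last error is returned.
/// `None` means no keys were provided at all.
pub fn with_failover<T, E>(
    keys: &[String],
    mut call: impl FnMut(&str) -> Result<T, E>,
) -> Option<Result<T, E>> {
    let mut last = None;
    for key in keys {
        match call(key) {
            Ok(value) => return Some(Ok(value)),
            // Remember the failure and fall through to the next key.
            Err(err) => last = Some(Err(err)),
        }
    }
    last
}
```

In a real program the raw value would typically come from std::env::var("GROQ_API_KEY"), and the closure would perform the actual API request with the given key.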
- Install Rust: https://rustup.rs/
- Clone the repository:
    git clone https://github.com/DuckyBlender/duck_transcriber.git
- Create a .env file:
    TELEGRAM_BOT_TOKEN=your_bot_token
    GROQ_API_KEY=your_groq_key
    DATABASE_URL=sqlite:duck_transcriber.db
- Run the bot:
    cargo run --release
Build and run the bot in Docker with an optimized multi-stage build:
    docker compose up -d
The Dockerfile uses cargo-chef for efficient dependency caching, resulting in faster rebuilds.
- Robust Error Handling: All errors are properly handled and logged
- Rate Limit Fallback: Uses 🙊 for per-user limits and 😴 for GroqCloud rate limits instead of failing
- Type-Safe Errors: Uses a custom TranscriptionError enum for clean error categorization
- Automatic Retry: Configurable API key rotation for automatic failover (if multiple keys provided)
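To make the error categorization above more concrete, here is one way an error enum could map rate-limit failures to the reactions mentioned in this README. This is a sketch under assumptions: the variants, from_status, and reaction_for are hypothetical, and the real TranscriptionError enum in the repository may look quite different. The only external fact relied on is that HTTP status 429 means "Too Many Requests".

```rust
use std::fmt;

/// Hypothetical variants; the bot's actual enum may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TranscriptionError {
    /// The sender exceeded the per-user limit.
    UserRateLimited,
    /// The upstream API rejected the request because of its own rate limits.
    ProviderRateLimited,
    /// Any other failure, with a human-readable description.
    Other(String),
}

impl TranscriptionError {
    /// Classifies an unsuccessful HTTP response by its status code.
    pub fn from_status(status: u16, body: &str) -> Self {
        if status == 429 {
            Self::ProviderRateLimited
        } else {
            Self::Other(format!("HTTP {status}: {body}"))
        }
    }
}

impl fmt::Display for TranscriptionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UserRateLimited => write!(f, "per-user rate limit exceeded"),
            Self::ProviderRateLimited => write!(f, "GroqCloud rate limit reached"),
            Self::Other(msg) => write!(f, "request failed: {msg}"),
        }
    }
}

impl std::error::Error for TranscriptionError {}

/// Returns the emoji to react with for rate-limit errors,
/// or `None` when the error should be handled some other way.
pub fn reaction_for(err: &TranscriptionError) -> Option<&'static str> {
    match err {
        TranscriptionError::UserRateLimited => Some("🙊"),
        TranscriptionError::ProviderRateLimited => Some("😴"),
        TranscriptionError::Other(_) => None,
    }
}
```

Keeping the mapping in a single match means the compiler flags any newly added variant that has no reaction decision yet.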
If you find this bot useful and would like to help cover API costs, donations are greatly appreciated! You can donate using various cryptocurrencies:
- Bitcoin: bc1q3dqnaygpaqkwm20hjq73g3kcc534cnt47wjlmu
- Bitcoin Lightning: duckyblender@strike.me
- Ethereum (or any ERC20 token): 0x87d03a9DADd7927c1f058725307a1645BC406195
- Nano: nano_3ociqkh6taqqu7q7h99oiyuasnkugm7bss87r1r4eph7dym3tmp3cebtosc5
- Monero: 84SdAF7JmMfQS3P1sSKasJHo8sQPjR3Xp58Vp1QWG4vMYdW26iZw6XuCMqL5FbtSQnUSKsGu6WtvXNMDEkwBtrE2VgKtNSK
You can also use the /donate command in the bot to view these addresses directly.
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
Contributions are welcome! If you'd like to help improve this bot, please open a pull request with your changes.