TTSFM


⚠️ Disclaimer
This project is for learning & testing purposes only. For production use, please use the official OpenAI TTS service.

🚨 IMPORTANT DEVELOPMENT NOTICE 🚨
⚠️ The v2 branch is currently under active development and is not recommended for production use. 📚 For stable documentation and usage, please refer to the v1 documentation.

English | 中文

🌟 Project Overview

TTSFM is an API server that is fully compatible with OpenAI's Text-to-Speech (TTS) API format.

🎮 Try it now: Official Demo

🏗️ Project Structure

ttsfm/
├── app.py                # Main Flask application
├── celery_worker.py      # Celery configuration and tasks
├── requirements.txt      # Python dependencies
├── static/               # Frontend resources
│   ├── index.html        # English interface
│   ├── index_zh.html     # Chinese interface
│   ├── script.js         # Frontend JavaScript
│   └── styles.css        # Frontend styles
├── voices/               # Voice samples
├── Dockerfile            # Docker configuration
├── docker-entrypoint.sh  # Docker startup script
├── .env.example          # Environment variables template
├── .env                  # Environment variables
├── .gitignore            # Git ignore rules
├── LICENSE               # MIT License
├── README.md             # English documentation
├── README_CN.md          # Chinese documentation
├── test_api.py           # API test suite
├── test_queue.py         # Queue test suite
└── .github/              # GitHub workflows

🚀 Quick Start

System Requirements

  • Python 3.13 or higher
  • Redis server
  • Docker (optional)

Using Docker (Recommended)

# Pull the latest image
docker pull dbcccc/ttsfm:latest

# Run the container
docker run -d \
  --name ttsfm \
  -p 7000:7000 \
  -p 6379:6379 \
  -v "$(pwd)/voices:/app/voices" \
  dbcccc/ttsfm:latest

Manual Installation

  1. Clone the repository:
git clone https://github.com/dbccccccc/ttsfm.git
cd ttsfm
  2. Install dependencies:
pip install -r requirements.txt
  3. Start Redis server:
# On Windows or macOS (runs in the foreground)
redis-server

# On Linux (Debian/Ubuntu service name)
sudo service redis-server start
  4. Start Celery worker:
celery -A celery_worker.celery worker --pool=solo -l info
  5. Start the server:
# Development (not recommended for production)
python app.py

# Production (recommended)
waitress-serve --host=0.0.0.0 --port=7000 app:app

Environment Variables

Copy .env.example to .env and modify as needed:

cp .env.example .env

🔧 Configuration

Server Configuration

  • HOST: Server host (default: 0.0.0.0)
  • PORT: Server port (default: 7000)
  • VERIFY_SSL: SSL verification (default: true)
  • MAX_QUEUE_SIZE: Maximum queue size (default: 100)
  • RATE_LIMIT_REQUESTS: Maximum number of requests allowed per rate-limit window (default: 30)
  • RATE_LIMIT_WINDOW: Rate limit window in seconds (default: 60)
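
RATE_LIMIT_REQUESTS and RATE_LIMIT_WINDOW together describe a cap on requests per time window. This README does not say which algorithm TTSFM uses, so the following is only a sketch of one common approach (a sliding window over request timestamps). The class name SlidingWindowLimiter, the client_id key, and the injectable clock are illustrative assumptions, not part of the project.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client.

    Hypothetical helper for illustration; defaults mirror the documented
    RATE_LIMIT_REQUESTS=30 and RATE_LIMIT_WINDOW=60.
    """

    def __init__(self, max_requests=30, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = {}      # client id -> deque of request timestamps

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

With the defaults, the 31st call to allow() for the same client within 60 seconds returns False, and calls succeed again once older timestamps leave the window.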

Celery Configuration

  • CELERY_BROKER_URL: Redis broker URL (default: redis://localhost:6379/0)
  • CELERY_RESULT_BACKEND: Redis result backend URL (default: redis://localhost:6379/0)
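
To show how the variables and defaults listed above fit together, here is a minimal sketch that reads them from the environment. The Settings dataclass, load_settings, and the truthy spellings accepted for VERIFY_SSL are assumptions made for illustration; the actual parsing in app.py may differ.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Treat common truthy spellings as True; anything else is False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    verify_ssl: bool
    max_queue_size: int
    rate_limit_requests: int
    rate_limit_window: int
    celery_broker_url: str
    celery_result_backend: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default) using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7000")),
        verify_ssl=_as_bool(env.get("VERIFY_SSL", "true")),
        max_queue_size=int(env.get("MAX_QUEUE_SIZE", "100")),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", "30")),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", "60")),
        celery_broker_url=env.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
        celery_result_backend=env.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/0"),
    )
```

Passing a plain dict instead of os.environ makes it easy to check overrides, for example load_settings({"PORT": "8080"}).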

📚 API Documentation

Text-to-Speech

POST /v1/audio/speech

Request body:

{
  "input": "Hello, world!",
  "voice": "alloy",
  "response_format": "mp3",
  "instructions": "Speak in a cheerful tone"
}

Parameters

  • input (required): The text to convert to speech
  • voice (required): The voice to use. Supported voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse
  • response_format (optional): The format of the audio output. Default: mp3. Supported formats: mp3, opus, aac, flac, wav, pcm
  • instructions (optional): Additional instructions for voice modulation

Response

  • Success: Returns audio data with appropriate content type
  • Error: Returns JSON with error message and status code
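
As a rough illustration of calling this endpoint, the sketch below builds the request with Python's standard library. The helper names build_speech_request and synthesize, and the base URL http://localhost:7000 (taken from the default port), are assumptions; the client-side validation simply mirrors the parameter list above.

```python
import json
import urllib.request

SUPPORTED_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice, response_format="mp3", instructions=None,
                         base_url="http://localhost:7000"):
    """Return a urllib Request for POST /v1/audio/speech (hypothetical helper)."""
    if not text:
        raise ValueError("input text is required")
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {response_format}")
    payload = {"input": text, "voice": voice, "response_format": response_format}
    if instructions:
        payload["instructions"] = instructions
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, voice, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

For example, synthesize("Hello, world!", "alloy", "hello.mp3") would save the audio when a server is running locally. On a non-2xx response, urlopen raises urllib.error.HTTPError, whose body should contain the JSON error described above.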

Queue Status

GET /api/queue-size

Response:

{
  "queue_size": 5,
  "max_queue_size": 100
}
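
A client might use this response to decide whether to submit more work. The sketch below parses the documented JSON shape; the function name parse_queue_status and the derived free_slots and is_full fields are illustrative additions, not fields the server returns.

```python
import json


def parse_queue_status(body: str) -> dict:
    """Parse a /api/queue-size response and derive remaining capacity (hypothetical helper)."""
    data = json.loads(body)
    size = int(data["queue_size"])
    capacity = int(data["max_queue_size"])
    return {
        "queue_size": size,
        "max_queue_size": capacity,
        "free_slots": max(capacity - size, 0),  # derived client-side
        "is_full": size >= capacity,            # derived client-side
    }
```

A caller could back off and retry later whenever is_full is true.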

Voice Samples

GET /api/voice-sample/{voice}

Parameters

  • voice (required): The voice to get a sample for. Must be one of: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse

Response

  • Success: Returns MP3 audio sample
  • Error: Returns JSON with error message and status code
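
The following sketch shows one way to build the sample URL and download it. The helper names voice_sample_url and download_voice_sample, and the base URL http://localhost:7000, are assumptions for illustration.

```python
import urllib.parse
import urllib.request

VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}


def voice_sample_url(voice: str, base_url: str = "http://localhost:7000") -> str:
    """Return the GET /api/voice-sample/{voice} URL after checking the voice name."""
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    return f"{base_url.rstrip('/')}/api/voice-sample/{urllib.parse.quote(voice, safe='')}"


def download_voice_sample(voice: str, out_path: str, **kwargs) -> None:
    """Fetch the MP3 sample and write it to out_path (requires a running server)."""
    with urllib.request.urlopen(voice_sample_url(voice, **kwargs), timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
```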

Version

GET /api/version

Response:

{
  "version": "v2.0.0-alpha1"
}
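
Since the v2 branch is still in development, a client may want to detect pre-release builds from this response. The sketch below assumes the version string always follows the vMAJOR.MINOR.PATCH[-tag] shape seen in the example; parse_version and is_prerelease are hypothetical helpers.

```python
import re

# Assumed shape: optional "v", three numeric parts, optional "-tag" suffix.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.]+))?$")


def parse_version(version: str):
    """Split a version string into (major, minor, patch, prerelease_tag_or_None)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    return major, minor, patch, match.group(4)


def is_prerelease(version: str) -> bool:
    """True when the version carries a suffix such as -alpha1."""
    return parse_version(version)[3] is not None
```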

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • OpenAI for the TTS API format
  • Flask for the web framework
  • Celery for task queue management
  • Waitress for the production WSGI server