Speech API

A Speech-to-Text (STT) and Text-to-Speech (TTS) API designed for CPU-only mini PCs, built on faster-whisper and neutts-air.

Features

STT: Powered by faster-whisper.
TTS: Powered by neutts-air.
Optimized: Uses PyTorch CPU builds (torchao, intmm) for efficient inference on CPU-only devices.
Framework: Built with FastAPI.

Requirements

Python >= 3.12
uv (for dependency management)

Installation

1. Clone the repository.
2. Install dependencies using uv:

uv sync

This project targets CPU-only inference and relies on the CPU builds of PyTorch, so no GPU is required.

Running the API

Start the server using uv run:

uv run uvicorn src.main:app --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000. Because the server binds to 0.0.0.0, it is also reachable from other machines on the same network.

API Usage

Health Check

Endpoint: GET /health
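
The health endpoint is handy for waiting until the server has finished starting before sending requests. The following is a minimal sketch that polls it using only the Python standard library. The helper name wait_until_healthy and its parameters are hypothetical, and since this README does not describe the response body, the sketch only assumes that a ready server answers with HTTP 200.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url="http://localhost:8000", timeout_s=30.0, interval_s=0.5):
    """Poll GET /health until it answers with HTTP 200 or the timeout expires.

    Assumption: a ready server replies with status 200; the body is ignored.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet, or it returned an error status
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (needs a running server):
#   wait_until_healthy("http://localhost:8000", timeout_s=60)
```

The function returns True as soon as the endpoint responds and False if the timeout passes first, so a startup script can decide whether to continue or abort.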

Speech to Text (STT)

Transcribe an audio file.

Endpoint: POST /v1/speech/stt

Curl Example:

curl -X POST "http://localhost:8000/v1/speech/stt" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/your/audio.wav"

Response:

{
  "text": "Transcribed text...",
  "language": "en",
  "probability": 0.99
}
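
The same upload can be done from Python. The sketch below mirrors the curl example using only the standard library: it encodes the file as multipart/form-data under the form field "file" and decodes the JSON reply. The helper names encode_multipart_file and transcribe are hypothetical, not part of this project, and the response fields are assumed to match the sample above.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart_file(field, filename, payload, content_type):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path, base_url="http://localhost:8000"):
    """POST an audio file to /v1/speech/stt and return the decoded JSON reply."""
    path = Path(audio_path)
    # Fall back to a generic type when the extension is not recognised.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body, header = encode_multipart_file("file", path.name, path.read_bytes(), content_type)
    request = urllib.request.Request(
        f"{base_url}/v1/speech/stt",
        data=body,
        headers={"Content-Type": header, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Example call (needs a running server and a real audio file):
#   result = transcribe("/path/to/your/audio.wav")
#   print(result["text"], result["language"], result["probability"])
```

Building the multipart body by hand keeps the example free of third-party dependencies; a client library with multipart support would shorten it considerably.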

Text to Speech (TTS)

Convert text to audio.

Endpoint: POST /v1/speech/tts

Curl Example:

curl -X POST "http://localhost:8000/v1/speech/tts" \
  -H "accept: audio/wav" \
  -H "Content-Type: application/json" \
  -d '{ "text": "Hello world" }' \
  --output output.wav

Response:

Returns an audio/wav file.
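
For use from Python, the request can be sketched with the standard library as follows. The helper names build_tts_request and synthesize are hypothetical. The "RIFF" check is an added sanity test based on the usual WAV file header, not something this project documents, so drop it if the server returns a different container.

```python
import json
import urllib.request


def build_tts_request(text, base_url="http://localhost:8000"):
    """Build the POST request for /v1/speech/tts with a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/speech/tts",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "audio/wav"},
        method="POST",
    )


def synthesize(text, out_path="output.wav", base_url="http://localhost:8000"):
    """Send text to the TTS endpoint and write the returned audio to out_path."""
    with urllib.request.urlopen(build_tts_request(text, base_url)) as resp:
        audio = resp.read()
    # Assumption: the reply is a standard WAV file, which begins with "RIFF".
    if audio[:4] != b"RIFF":
        raise ValueError("response does not look like a WAV file")
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path


# Example call (needs a running server):
#   synthesize("Hello world", "output.wav")
```

Separating request construction from the network call makes it easy to inspect or log the exact payload before it is sent.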
