speaches by speaches-ai

OpenAI API-compatible server for transcription, translation, and speech generation

created 1 year ago
2,151 stars

Top 21.4% on sourcepulse

Project Summary

Speaches is an OpenAI API-compatible server for automatic speech recognition (ASR), translation, and text-to-speech (TTS), aimed at developers and researchers who want to add speech capabilities to their applications. It exposes a single interface over several speech models and supports real-time, streaming interactions.

How It Works

Speaches uses faster-whisper for speech-to-text and translation, and piper or kokoro for text-to-speech. Its API mirrors the OpenAI API, so existing OpenAI-compatible tools and SDKs can talk to it with little or no change. The server loads and offloads models dynamically based on request activity, which helps it use GPU and CPU resources efficiently.
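
Because the server mirrors the OpenAI API, a client should mostly need a different base URL. The sketch below builds a text-to-speech request against the OpenAI-style `/v1/audio/speech` path using only the Python standard library. The base URL `http://localhost:8000` and the helper names `build_speech_request` and `synthesize` are assumptions made for illustration; model and voice identifiers depend on what your server has installed, so consult the Speaches documentation for real values.

```python
import json
import urllib.request

# Assumption: a Speaches server is reachable here; adjust to your deployment.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, model, voice, out_path, base_url=BASE_URL):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, model, voice, base_url)
    with urllib.request.urlopen(request) as response:
        with open(out_path, "wb") as audio_file:
            audio_file.write(response.read())
```

Calling `synthesize("Hello", model_id, voice_id, "hello.audio")` against a running server would write the generated audio to disk. Keeping request construction separate from sending makes the payload easy to inspect without a live server.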

Quick Start & Requirements

  • Install: Docker Compose is the primary deployment method.
  • Prerequisites: GPU support is recommended for optimal performance.
  • Documentation: speaches.ai

Highlighted Details

  • OpenAI API compatibility for broad tool integration.
  • Streaming transcription and speech generation for real-time applications.
  • Dynamic model loading/offloading for efficient resource management.
  • Supports high-quality TTS via kokoro (ranked #1 in TTS Arena) and piper.
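
The dynamic loading and offloading idea can be pictured as a cache that loads a model on first use and drops it after a period of inactivity. The following is a minimal sketch of that general technique, not Speaches' actual implementation; the class `IdleModelCache`, its methods, and the `ttl_seconds` parameter are hypothetical names chosen for this example.

```python
import time


class IdleModelCache:
    """Load models on first use and unload those idle longer than ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader    # callable: model_id -> loaded model object
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._models = {}        # model_id -> (model, last_used_timestamp)

    def get(self, model_id):
        """Return the model, loading it if needed, and refresh its timestamp."""
        entry = self._models.get(model_id)
        model = entry[0] if entry is not None else self._loader(model_id)
        self._models[model_id] = (model, self._clock())
        return model

    def evict_idle(self):
        """Drop every model idle for longer than the TTL; return their IDs."""
        now = self._clock()
        stale = [
            model_id
            for model_id, (_, last_used) in self._models.items()
            if now - last_used > self._ttl
        ]
        for model_id in stale:
            del self._models[model_id]
        return stale

    def loaded(self):
        """Return the IDs of currently loaded models, sorted."""
        return sorted(self._models)
```

A server built this way would call `get` on each request and run `evict_idle` periodically. Dropping the reference only lets Python reclaim the object; releasing GPU memory usually needs an extra framework-specific step, which this sketch leaves out.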

Maintenance & Community

The project is actively maintained, and the maintainers invite issues and feature suggestions. The README does not link to community channels or a roadmap.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

The README marks speech generation demos as "TODO", which suggests this feature may still be under development or refinement. The absence of a stated license is a significant caveat for adoption.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 25
  • Issues (30d): 66
  • Star history: 412 stars in the last 90 days
