speaches-ai
OpenAI API-compatible server for transcription, translation, and speech generation
Top 18.3% on SourcePulse
Speaches provides an OpenAI API-compatible server for ASR, translation, and TTS, targeting developers and researchers who want to integrate speech capabilities into their applications. It offers a unified interface for various speech models, simplifying complex workflows and enabling real-time, streaming interactions.
How It Works
Speaches uses faster-whisper for speech-to-text and translation, and piper or kokoro for text-to-speech. Its interface mimics the OpenAI API, so tools and SDKs built for that API can be pointed at a Speaches server. The server loads and offloads models dynamically based on request activity, which helps it use GPU/CPU resources efficiently.
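Because the server follows the OpenAI API, a speech generation request can be sketched with nothing but the Python standard library. The example below is a minimal sketch, not taken from the README: the base URL `http://localhost:8000` is an assumed local deployment, the helper names `build_speech_request` and `synthesize` are hypothetical, and the model and voice identifiers are placeholders to be replaced with whatever the running server actually exposes. The request body follows the shape of OpenAI's `/v1/audio/speech` endpoint.

```python
import json
import urllib.request

# Assumption: a Speaches server is listening here; adjust to your deployment.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build a POST request shaped like OpenAI's /v1/audio/speech call."""
    payload = {
        "model": model,            # placeholder: a TTS model ID known to the server
        "input": text,             # the text to synthesize
        "voice": voice,            # placeholder: a voice offered by that model
        "response_format": "mp3",  # audio container requested from the server
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, model, voice, out_path="speech.mp3"):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, model, voice)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

Calling `synthesize("Hello there", "<model-id>", "<voice-id>")` against a running server would write the audio to `speech.mp3`. Keeping request construction separate from sending makes the payload easy to inspect without a server, and the same pattern should carry over to the transcription and translation endpoints, which take multipart file uploads instead of JSON.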
Quick Start & Requirements
Highlighted Details
Text-to-speech via kokoro (ranked #1 in TTS Arena) and piper.
Maintenance & Community
The project is actively maintained, with a call for issues and feature suggestions. Links to community channels or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial use or closed-source linking is therefore undetermined.
Limitations & Caveats
The README marks speech generation demos as "TODO", which suggests this feature is still being developed or refined. The absence of a specified license is a significant caveat for adoption.