Deep learning toolkit for speech-to-text model training and deployment
Top 19.2% on sourcepulse
Coqui STT (🐸STT) is a deep learning toolkit for training and deploying speech-to-text (STT) models, designed for both production and research use. It offers high-quality pre-trained models, an efficient multi-GPU training pipeline, streaming inference, and real-time capabilities with small acoustic model footprints.
How It Works
🐸STT builds its STT models on deep neural networks. The design prioritizes inference speed and a small acoustic-model footprint, which makes it suitable for resource-constrained environments. Multi-GPU training shortens model development, and streaming inference supports real-time applications.
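To make the streaming idea concrete, here is a minimal sketch of feeding audio to a model in small chunks and reading partial transcripts along the way. It assumes the `stt` Python bindings expose `createStream`, `feedAudioContent`, `intermediateDecode`, and `finishStream` on the model and stream objects; verify these names against the official documentation. The 20 ms chunk size (320 samples at 16 kHz) and the helper names `iter_chunks` and `stream_transcribe` are illustrative choices, not part of the toolkit.

```python
def iter_chunks(pcm, chunk_samples=320):
    """Split raw 16-bit PCM bytes into chunks of `chunk_samples` samples.

    Each sample is 2 bytes, so `pcm` is expected to have an even length.
    The last chunk may be shorter than the others.
    """
    step = chunk_samples * 2  # bytes per chunk
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]


def stream_transcribe(model, pcm, chunk_samples=320, on_partial=None):
    """Feed `pcm` to an already loaded model chunk by chunk.

    `model` is assumed to be an `stt.Model` instance (hypothetical usage).
    `on_partial`, if given, is called with each intermediate transcript.
    Returns the final transcript.
    """
    # Imported lazily so iter_chunks stays usable without numpy installed.
    import numpy as np

    stream = model.createStream()
    for chunk in iter_chunks(pcm, chunk_samples):
        stream.feedAudioContent(np.frombuffer(chunk, dtype=np.int16))
        if on_partial is not None:
            on_partial(stream.intermediateDecode())
    return stream.finishStream()
```

In a real-time setting the chunks would come from a microphone or network source rather than a buffer held in memory; the loop body stays the same.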
Quick Start & Requirements
Installation and usage details are available in the official documentation: stt.readthedocs.io.
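As a rough orientation before reading the documentation, the sketch below shows what transcribing a single WAV file might look like. It assumes the Python package has been installed (published on PyPI as `stt`), that a model file and an optional scorer file are already available locally, and that the bindings provide `Model`, `enableExternalScorer`, `sampleRate`, and `stt`; check all of these against the official documentation. The helper names `read_pcm16_mono` and `transcribe` and every file path are placeholders.

```python
import wave


def read_pcm16_mono(path):
    """Return (sample_rate, raw_bytes) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, wav_path, scorer_path=None):
    """Transcribe one WAV file; all paths are supplied by the caller."""
    # Imported lazily so read_pcm16_mono works without these packages.
    import numpy as np
    from stt import Model

    model = Model(model_path)
    if scorer_path is not None:
        # The external scorer (language model) is optional.
        model.enableExternalScorer(scorer_path)

    rate, pcm = read_pcm16_mono(wav_path)
    if rate != model.sampleRate():
        raise ValueError(
            f"audio is {rate} Hz but the model expects {model.sampleRate()} Hz"
        )
    # The bindings are assumed to take a 1-D array of int16 samples.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

A call would look like `transcribe("model.tflite", "audio.wav")`, where both file names are hypothetical. Audio at a different sample rate needs resampling first, which this sketch leaves out.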
Highlighted Details
Maintenance & Community
This project is no longer actively maintained, and the online Model Zoo has been discontinued. Attention has shifted to newer STT models such as Whisper, and the Coqui team has moved its own focus to Coqui TTS and Coqui Studio. Models remain available in the coqui-ai/STT-models repository. Community support is available via GitHub Discussions and a Gitter Room.
Licensing & Compatibility
The README excerpt does not state the license, although the project is open source. Before commercial use or closed-source linking, check the license terms of both the codebase and the models.
Limitations & Caveats
The project is no longer actively maintained, and the online Model Zoo has been shut down. Expect no further development, bug fixes, or new features, which is a risk for long-term adoption.