ASR/TTS toolkit for multilingual speech processing
Parrots is an open-source toolkit that combines Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) in one package, designed for ease of use with a focus on Chinese and English. It targets developers and researchers who need a quick way to implement speech processing tasks, and it ships pre-trained models for multilingual, multi-character voice synthesis.
How It Works
The ASR component uses Distil-Whisper for speech-to-text conversion across multiple languages. The TTS engine is built on GPT-SoVITS, which provides high-quality voice synthesis with support for several languages and distinct speaker identities, including singing voices. Shipping both components in one package gives a unified, "out-of-the-box" solution for common speech AI applications.
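The two-component design described above can be pictured as a thin facade over an ASR backend and a TTS backend. The sketch below is illustrative only and is not the Parrots API: every name in it (Recognizer, Synthesizer, StubRecognizer, StubSynthesizer, SpeechToolkit, revoice) is hypothetical, and the stub backends stand in for Distil-Whisper and GPT-SoVITS so that the example runs without downloading any model.

```python
from dataclasses import dataclass
from typing import Protocol


class Recognizer(Protocol):
    """Hypothetical interface for an ASR backend."""

    def transcribe(self, audio: bytes) -> str: ...


class Synthesizer(Protocol):
    """Hypothetical interface for a TTS backend."""

    def synthesize(self, text: str, speaker: str) -> bytes: ...


class StubRecognizer:
    """Stand-in for a real ASR model: treats the 'audio' bytes as UTF-8 text."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class StubSynthesizer:
    """Stand-in for a real TTS model: tags the text with the speaker name."""

    def synthesize(self, text: str, speaker: str) -> bytes:
        return f"[{speaker}] {text}".encode("utf-8")


@dataclass
class SpeechToolkit:
    """Facade that exposes both directions behind one object (assumed design)."""

    asr: Recognizer
    tts: Synthesizer

    def revoice(self, audio: bytes, speaker: str) -> bytes:
        # Speech -> text with the ASR backend, then text -> speech with TTS.
        text = self.asr.transcribe(audio)
        return self.tts.synthesize(text, speaker)


if __name__ == "__main__":
    toolkit = SpeechToolkit(asr=StubRecognizer(), tts=StubSynthesizer())
    result = toolkit.revoice("你好 hello".encode("utf-8"), speaker="demo")
    print(result.decode("utf-8"))
```

Keeping the backends behind small interfaces is one way a toolkit like this could swap models per language or speaker without changing calling code. For the actual classes and parameters, consult the project's own examples.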
Quick Start & Requirements
Install: pip install torch (or install it with conda), then pip install parrots.
Install from source: python setup.py install.
Examples: see the examples/ directory.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project code is described as "rough," suggesting potential instability or areas needing refinement. While it supports multiple languages, the primary focus and default models appear to be Chinese.