Auralis by astramind-ai

TTS engine for fast voice cloning

Created 1 year ago

612 stars

Top 53.6% on SourcePulse

Project Summary

Auralis is a high-speed text-to-speech (TTS) engine built for practical, high-volume use, with support for voice cloning. It targets developers and researchers who need to convert large volumes of text into natural-sounding speech efficiently; the project reports processing a full novel in approximately 10 minutes.

How It Works

Auralis builds on the XTTSv2 model and optimizes its inference pipeline for speed and a low memory footprint. It uses smart batching and concurrency management to process multiple requests simultaneously on consumer GPUs. The engine supports streaming for long texts and includes built-in audio enhancement features such as noise reduction and volume normalization.
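The concurrency management described above can be illustrated with a small, generic sketch. This is not Auralis's actual scheduler: `fake_synthesize` and `run_with_limit` are hypothetical names, and the model call is replaced by a short sleep. It only shows the general idea of capping how many requests run at once while still returning results in request order.

```python
import asyncio


async def fake_synthesize(text: str) -> str:
    """Hypothetical stand-in for a model call; a real engine would return audio."""
    await asyncio.sleep(0.01)  # simulate inference latency
    return f"audio({text})"


async def run_with_limit(texts: list[str], max_concurrent: int = 2) -> list[str]:
    """Run all requests, allowing at most `max_concurrent` in flight at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(text: str) -> str:
        # Each request waits here until a slot is free.
        async with semaphore:
            return await fake_synthesize(text)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(t) for t in texts))


results = asyncio.run(run_with_limit(["one", "two", "three"]))
print(results)
```

A cap like `max_concurrent` is one common way to keep memory use bounded on a single consumer GPU; the right value depends on the hardware and is an assumption here.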

Quick Start & Requirements

Install via pip: pip install auralis
Requires Python 3.10+.
Example usage and a CLI server are provided.
Official documentation and examples are linked from the README.
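As a rough illustration of what basic usage might look like after installation, here is a minimal sketch. The class names (`TTS`, `TTSRequest`), the model identifiers, and the `generate_speech` and `save` calls are recalled from the project's README and are assumptions here, not taken from this summary; check them against the current official documentation before relying on them. The wrapper function `synthesize_to_file` is hypothetical.

```python
def synthesize_to_file(text: str, speaker_wav: str, out_path: str) -> str:
    """Clone the voice in `speaker_wav` and write `text` as speech to `out_path`.

    Assumption: the API names below follow the Auralis README and may have changed.
    """
    # Imported lazily so this module still loads where Auralis is not installed.
    from auralis import TTS, TTSRequest

    # Assumed model identifiers; loading downloads weights and needs a suitable GPU.
    tts = TTS().from_pretrained(
        "AstraMindAI/xttsv2", gpt_model="AstraMindAI/xtts2-gpt"
    )

    # A short reference recording supplies the voice to clone.
    request = TTSRequest(text=text, speaker_files=[speaker_wav])

    output = tts.generate_speech(request)
    output.save(out_path)
    return out_path
```

A call such as `synthesize_to_file("Hello from Auralis.", "reference.wav", "hello.wav")` would then be expected to write the generated audio to `hello.wav`, assuming `reference.wav` exists and the model loads successfully.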

Highlighted Details

Processes a full novel in approximately 10 minutes (real-time factor, RTF ≈ 0.02x).
Supports voice cloning from short audio samples.
Offers automatic language detection and audio enhancement.
Supports custom (for example, fine-tuned) XTTSv2 models via a provided conversion script.
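To make the throughput figure above concrete, the following sketch works through the arithmetic. It assumes the usual definition of real-time factor, processing time divided by the duration of the audio produced, so lower values mean faster synthesis. The helper names are illustrative only.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    return processing_seconds / audio_seconds


def audio_seconds_for(processing_seconds: float, rtf: float) -> float:
    """Audio duration that can be produced in a given processing time."""
    return processing_seconds / rtf


processing = 10 * 60                         # 10 minutes of compute, in seconds
audio = audio_seconds_for(processing, 0.02)  # about 30,000 seconds of audio
hours = audio / 3600                         # a little over 8 hours

print(f"{hours:.1f} hours of audio in 10 minutes of processing")
```

Under that definition, an RTF of 0.02 means roughly 50 seconds of audio per second of processing, which is consistent with an audiobook-length text finishing in about 10 minutes.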

Maintenance & Community

Community contributions are welcomed, with contribution guidelines provided.
Links to community channels (Discord/Slack) are available.

Licensing & Compatibility

Codebase is released under Apache 2.0.
XTTSv2 model components are under the Coqui AI License.

Limitations & Caveats

The Coqui AI License that covers the XTTSv2 model components may restrict commercial use or redistribution. The README does not elaborate on the terms of this license, so they should be reviewed before deployment.

