Vyvo-Labs
Text-to-Speech training and inference framework powered by Large Language Models
Summary
VyvoTTS is an LLM-based framework for Text-to-Speech (TTS) training and inference, designed for researchers and power users. It offers tools for creating and deploying custom TTS models, covering training LLMs from scratch, voice cloning, and optimized inference.
How It Works
This framework leverages Large Language Models (LLMs) for advanced TTS capabilities. It supports full pre-training of LLMs on custom datasets, fine-tuning for specific TTS tasks, and memory-efficient adaptation using Low-Rank Adaptation (LoRA). Novel neural techniques are employed for voice cloning. A unified tokenizer simplifies dataset preparation for both the Qwen3 and LFM2 model architectures, so the same data pipeline can serve either model.
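To illustrate why LoRA is described as memory-efficient, the sketch below compares the number of trainable parameters for a single weight matrix under full fine-tuning and under a rank-r LoRA adapter. The function names, layer dimensions, and rank are assumptions chosen for illustration; they are not taken from VyvoTTS.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


# Hypothetical layer size and rank, used only to show the scale of the saving.
d_in, d_out, rank = 2048, 2048, 16
full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"fraction trained with LoRA: {lora / full:.4f}")
```

The frozen base weights still occupy memory; the saving comes from the much smaller set of parameters that need gradients and optimizer state.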
Quick Start & Requirements
Installation involves setting up a Python 3.10 virtual environment with uv and installing dependencies via uv pip install -r requirements.txt. For lower-end GPUs (6GB+ VRAM), a Jupyter notebook (notebook/vyvotts-lfm2-train.ipynb) is available, requiring uv pip install jupyter notebook. Fine-tuning requires a minimum of 30GB VRAM.
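The VRAM figures above can be turned into a quick pre-flight check. This is a minimal sketch: the two thresholds come from the text, while the function name and the returned labels are hypothetical and not part of the project.

```python
# Thresholds quoted in the summary above (in GB of GPU memory).
NOTEBOOK_MIN_VRAM_GB = 6
FINE_TUNE_MIN_VRAM_GB = 30


def suggest_workflow(vram_gb: float) -> str:
    """Map available VRAM to the workflow the summary describes (labels are hypothetical)."""
    if vram_gb >= FINE_TUNE_MIN_VRAM_GB:
        return "fine-tuning"
    if vram_gb >= NOTEBOOK_MIN_VRAM_GB:
        return "notebook"
    return "below documented minimum"


for vram in (4, 8, 24, 40):
    print(f"{vram} GB -> {suggest_workflow(vram)}")
```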
Highlighted Details
accelerate library for scalable training.
Maintenance & Community
The provided README does not detail specific contributors, sponsorships, or community channels like Discord or Slack.
Licensing & Compatibility
The project is licensed under the permissive MIT License, which allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained in copies of the software.
Limitations & Caveats
Fine-tuning operations demand substantial GPU resources, with a minimum requirement of 30GB VRAM. While options exist for lower-end GPUs, full-scale training and fine-tuning remain resource-intensive. The roadmap indicates ongoing development, suggesting some features may still be experimental or under active implementation.