chenpipi0807
High-quality text-to-speech in ComfyUI
Top 62.4% on SourcePulse
This repository provides custom ComfyUI nodes for high-quality text-to-speech (TTS) using the IndexTTS model. It targets users of ComfyUI, particularly those interested in voice cloning and generating speech in both Chinese and English, offering a streamlined workflow for creative applications.
How It Works
The nodes integrate the IndexTTS model and clone a voice by analyzing a reference audio sample and replicating its characteristics. They support both Chinese and English text, with controls for speech speed and other synthesis parameters. The project also includes a "Novel Text Structure Node" that parses narrative text into a multi-character dialogue format, which helps when producing audiobooks or multi-voice narration.
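To make the idea behind the Novel Text Structure Node concrete, here is a minimal Python sketch of splitting narrative text into speaker-tagged segments. It is not the repository's algorithm: the regular expression, the function name split_dialogue, the "Narrator" and "Unknown" labels, and the tuple output format are all assumptions chosen for illustration, and the pattern only handles straight double quotes followed by a simple English attribution.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: a double-quoted utterance, optionally followed by
# an attribution such as: "Hello," said Alice.
QUOTE_RE = re.compile(
    r'"([^"]+)"(?:\s*,?\s*(?:said|asked|replied)\s+([A-Z][a-z]+))?'
)


def _clean(fragment: str) -> str:
    """Trim whitespace and punctuation left over from a removed attribution."""
    return fragment.strip().lstrip(".,; ").strip()


def split_dialogue(text: str, narrator: str = "Narrator") -> List[Tuple[str, str]]:
    """Return (speaker, text) pairs; unquoted text is assigned to the narrator."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    for match in QUOTE_RE.finditer(text):
        before = _clean(text[pos:match.start()])
        if before:
            segments.append((narrator, before))
        # Fall back to a placeholder when no attribution follows the quote.
        speaker = match.group(2) or "Unknown"
        segments.append((speaker, match.group(1).strip()))
        pos = match.end()
    tail = _clean(text[pos:])
    if tail:
        segments.append((narrator, tail))
    return segments


if __name__ == "__main__":
    sample = 'The door opened. "Who is there?" asked Alice. Nobody answered.'
    for speaker, line in split_dialogue(sample):
        print(f"{speaker}: {line}")
```

Each resulting pair could then be routed to a different reference voice. A heuristic like this fails on nested quotes, pronoun attributions ("she said"), and unattributed back-and-forth dialogue, which is the kind of case where character misidentification is most likely.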
Quick Start & Requirements
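Install: place the repository in the ComfyUI custom_nodes directory and install dependencies using .\python_embeded\python.exe -m pip install -r requirements.txt
Models: place the IndexTTS or IndexTTS-1.5 model files in ComfyUI/models/Index-TTS or ComfyUI/models/IndexTTS-1.5, respectively.
Highlighted Details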
Maintenance & Community
The project is actively updated, with recent changes focusing on text parsing, model compatibility, and audio processing enhancements. Links to community support or discussion channels are not explicitly provided in the README.
Licensing & Compatibility
The README defers licensing to the original IndexTTS project. Users should check that project's terms before any commercial use.
Limitations & Caveats
The novel text parsing algorithm is imperfect and may misidentify characters in complex narrative structures. The README also notes compatibility issues with PyTorch 2.7 and suggests downgrading the transformers library as a workaround.
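Before loading the nodes, it can help to check whether the environment falls into the PyTorch 2.7 case. The sketch below uses only the standard library to read installed package versions. The helper names are hypothetical, and it deliberately does not name a target transformers version, since the summary does not state one; consult the README for that.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_torch_27(torch_version: Optional[str]) -> bool:
    """True for 2.7.x builds, ignoring local suffixes such as '+cu128'."""
    if not torch_version:
        return False
    parts = torch_version.split("+")[0].split(".")
    return parts[:2] == ["2", "7"]


if __name__ == "__main__":
    torch_v = installed_version("torch")
    transformers_v = installed_version("transformers")
    print(f"torch: {torch_v}, transformers: {transformers_v}")
    if is_torch_27(torch_v):
        print("PyTorch 2.7 detected: see the README for the transformers downgrade.")
```

On the portable Windows build, run the script with .\python_embeded\python.exe so that it inspects the same environment ComfyUI uses.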