ComfyUI-OmniVoice-TTS (Saganaki22)
ComfyUI tool for zero-shot multilingual text-to-speech
OmniVoice TTS provides advanced text-to-speech capabilities within the ComfyUI workflow, targeting users seeking high-fidelity, multilingual voice synthesis. It enables zero-shot voice cloning, custom voice design, and multi-speaker dialogue generation, offering state-of-the-art quality and extensive language support.
How It Works
This project integrates the OmniVoice TTS models into ComfyUI via custom nodes. Synthesis is diffusion-based and supports over 600 languages, with zero-shot voice cloning from short audio samples and voice design from text descriptions. The architecture uses a Qwen3 backbone, with an optional SageAttention backend for GPU-accelerated attention on compatible hardware (CUDA compute capability SM80+). Key features include fast inference (real-time factor, RTF, as low as 0.025, i.e. about 40 times faster than real time), support for non-verbal expression tags, and automatic model downloading.
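The two numeric claims above, the real-time factor and the SM80+ hardware requirement, can be made concrete with a minimal sketch. The helper names `real_time_factor`, `meets_sm80`, and `detect_capability` are hypothetical and are not part of the OmniVoice nodes; the only library call used is PyTorch's `torch.cuda.get_device_capability`, and it is skipped when PyTorch or a CUDA device is unavailable.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def meets_sm80(major, minor=0):
    """True if a CUDA compute capability is SM80 (8.0) or newer."""
    return (major, minor) >= (8, 0)


def detect_capability():
    """Return (major, minor) for CUDA device 0, or None if it cannot be read."""
    try:
        import torch  # optional: only needed for the live hardware query
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    # 10 s of audio generated in 0.25 s corresponds to the quoted RTF of 0.025.
    print("RTF:", real_time_factor(0.25, 10.0))
    capability = detect_capability()
    if capability is None:
        print("No CUDA device detected; SageAttention would not apply.")
    else:
        print("Compute capability:", capability, "SM80+:", meets_sm80(*capability))
```

An RTF below 1.0 means audio is produced faster than it plays back, so lower values are better.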
Quick Start & Requirements
Installation involves placing the node in ComfyUI/custom_nodes and running python install.py.
The omnivoice pip package's strict torch==2.8.* dependency is handled via --no-deps during installation to prevent ComfyUI's GPU acceleration from breaking.
A GPU is highly recommended, especially for SageAttention support (CUDA, SM80+).
Highlighted Details
Multi-speaker dialogue using [Speaker_N]: tags.
Non-verbal expression tags such as [laughter] and [sigh].
Maintenance & Community
The provided documentation does not detail specific community channels (e.g., Discord, Slack), active maintainers beyond the primary author, or a public roadmap. Credits acknowledge the original OmniVoice model authors.
Licensing & Compatibility
The ComfyUI-OmniVoice-TTS custom node is released under the Apache 2.0 License. The underlying OmniVoice model has its own separate license, which users must consult (refer to k2-fsa/OmniVoice). Apache 2.0 is generally permissive for commercial use, but the model's license may impose restrictions.
Limitations & Caveats
Installation requires careful management of Python dependencies, particularly PyTorch versions, to avoid conflicts with ComfyUI's core functionality. The high-performance SageAttention backend is restricted to GPUs with SM80+ compute capability (NVIDIA Ampere architecture or newer). The transformers library version requirement (>=5.3.0) may also conflict with other custom nodes. Dialect and style instructions are limited to a predefined set of supported values.
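As a rough diagnostic for the dependency caveats above, the following sketch reads the installed versions of torch and transformers and compares them with the constraints quoted in this summary (torch==2.8.* from the omnivoice package, transformers>=5.3.0). The function names are hypothetical and the version parsing is deliberately simplified; it is not the project's installer logic. A torch version outside 2.8.* is not necessarily a problem, since the installer is described as bypassing that pin with --no-deps.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '2.8.0+cu128' into (2, 8, 0); local and pre-release suffixes are ignored."""
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_torch_pin(torch_version):
    """True if the version falls inside the torch==2.8.* pin."""
    return parse_version(torch_version)[:2] == (2, 8)


def meets_transformers_minimum(transformers_version):
    """True if the version satisfies transformers>=5.3.0."""
    return parse_version(transformers_version) >= (5, 3, 0)


def report():
    """Print a short summary of the current environment."""
    torch_v = installed_version("torch")
    transformers_v = installed_version("transformers")
    if torch_v is None:
        print("torch: not installed")
    else:
        print(f"torch {torch_v}: inside 2.8.* pin = {matches_torch_pin(torch_v)}")
    if transformers_v is None:
        print("transformers: not installed")
    else:
        ok = meets_transformers_minimum(transformers_v)
        print(f"transformers {transformers_v}: >=5.3.0 = {ok}")


if __name__ == "__main__":
    report()
```

Running this inside the Python environment that ComfyUI uses, before and after installing the node, shows whether another custom node has changed either package.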