ictnlp: LLM-based real-time spoken chatbot
LLaMA-Omni 2 is a series of speech-language models built on Qwen2.5-Instruct and designed for real-time spoken chatbots. The models generate text and speech simultaneously, using a streaming autoregressive speech decoder to deliver high-quality, low-latency interaction. The project targets researchers and developers who want conversational AI with integrated speech capabilities.
How It Works
The system pairs a Qwen2.5 LLM with a streaming autoregressive speech decoder. Because the decoder produces speech incrementally while the LLM is still generating text, both outputs are available at nearly the same time, which keeps latency low. The core innovation is this autoregressive, streaming speech generation, which the authors report improves both quality and responsiveness over previous methods.
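The idea of generating text and speech side by side can be illustrated with a toy sketch. This is not the project's decoder: the function names, the chunk sizes read_size and write_size, and the placeholder speech units are all hypothetical, chosen only to show how speech output can begin before the text response is complete.

```python
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[str, str]  # (stream name, token)


def fake_speech_tokens(text_chunk: List[str], n: int) -> List[str]:
    """Hypothetical stand-in for a speech decoder: returns n placeholder units."""
    label = "+".join(text_chunk)
    return [f"<unit:{label}:{i}>" for i in range(n)]


def interleave_streams(
    text_tokens: Iterable[str], read_size: int = 2, write_size: int = 3
) -> Iterator[Event]:
    """Yield text tokens as they arrive; after every `read_size` text tokens,
    yield `write_size` speech tokens derived from that chunk."""
    chunk: List[str] = []
    for tok in text_tokens:
        yield ("text", tok)
        chunk.append(tok)
        if len(chunk) == read_size:
            for unit in fake_speech_tokens(chunk, write_size):
                yield ("speech", unit)
            chunk = []
    if chunk:  # flush speech for any leftover text at the end
        for unit in fake_speech_tokens(chunk, write_size):
            yield ("speech", unit)


# Speech events start after the second text token, before the text is finished.
events = list(interleave_streams(["Hello", ",", "world"], read_size=2, write_size=2))
for stream, token in events:
    print(f"{stream:6s} {token}")
```

In this sketch a listener would hear the first speech chunk while the remaining text is still being produced, which is the latency benefit the streaming design aims for.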
Quick Start & Requirements
To install, clone the repository, create a Conda environment with Python 3.10, and install the package with pip install -e . Before running the models, download the Whisper-large-v3 model, the CosyVoice 2 flow-matching model and vocoder, and a LLaMA-Omni2 model checkpoint (e.g., ICTNLP/LLaMA-Omni2-7B-Bilingual) from Hugging Face. To launch the Gradio demo, start a controller, a web server, and a model worker using the Python commands provided in the repository. Scripts for local inference are also available.
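As a minimal sketch of the checkpoint download step, the snippet below uses snapshot_download from the huggingface_hub package. The repo ID comes from the text above; the local directory layout (models/...) and the helper names are assumptions for illustration, so check the repository's own instructions for the paths its scripts expect.

```python
from pathlib import Path
from typing import Dict

# Repo ID taken from the text above; the local directory name is hypothetical.
CHECKPOINTS: Dict[str, str] = {
    "ICTNLP/LLaMA-Omni2-7B-Bilingual": "models/LLaMA-Omni2-7B-Bilingual",
}


def local_dir_for(repo_id: str, root: str = ".") -> Path:
    """Return the (assumed) local directory for a known checkpoint."""
    return Path(root) / CHECKPOINTS[repo_id]


def download_checkpoint(repo_id: str, root: str = ".") -> Path:
    """Download a checkpoint from Hugging Face into its local directory."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling download_checkpoint("ICTNLP/LLaMA-Omni2-7B-Bilingual") would fetch the files over the network; the 7B checkpoint is large, so expect the download to take a while.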
Highlighted Details
Maintenance & Community
For questions, submit an issue or contact fangqingkai21b@ict.ac.cn. Commercial use inquiries should be directed to fengyang@ict.ac.cn.
Licensing & Compatibility
The code is released under the Apache-2.0 License. The models, however, are intended strictly for academic research and may not be used commercially. Commercial use requires contacting the authors and obtaining a separate license agreement.
Limitations & Caveats
The primary limitation is the strict non-commercial use restriction for the models, requiring separate licensing for any commercial application.