NVIDIA PersonaPlex: Full-duplex conversational speech model with real-time persona control
Top 12.7% on SourcePulse
Summary
PersonaPlex is a real-time, full-duplex conversational speech model with precise persona and voice control. It targets developers building advanced conversational AI, enabling natural, low-latency interactions with consistent characterization via text and audio conditioning.
How It Works
Built on the Moshi architecture and the Helium LLM, PersonaPlex processes incoming and outgoing speech simultaneously in real time. Text-based role prompts and audio-based voice conditioning keep the persona and voice consistent throughout a conversation. Trained on a diverse mix of synthetic and real dialogues, it achieves naturalistic, low-latency spoken interaction.
Quick Start & Requirements
Install with pip install moshi/. after cloning the repository. You must accept the PersonaPlex model license on Hugging Face and set export HF_TOKEN=<YOUR_HUGGINGFACE_TOKEN>. Launch a server with python -m moshi.server --ssl <SSL_DIR> for web UI access; offline evaluation uses moshi.offline with voice prompts and input WAVs. The commands are collected below.
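A minimal quick-start sketch of those steps, assuming the repository is already cloned and the commands are run from its root. The placeholders (<YOUR_HUGGINGFACE_TOKEN>, <SSL_DIR>) must be filled in, and the exact moshi.offline arguments are not listed in this summary, so consult the project README for them.

```bash
# Install the bundled moshi package from the cloned PersonaPlex repository.
pip install moshi/.

# Authenticate with Hugging Face after accepting the PersonaPlex model license.
export HF_TOKEN=<YOUR_HUGGINGFACE_TOKEN>

# Launch the real-time server; the web UI is served using the SSL material in <SSL_DIR>.
python -m moshi.server --ssl <SSL_DIR>

# Offline evaluation runs through the moshi.offline module with a voice prompt
# and input WAV files; see the repository README for the exact argument names.
```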
Highlighted Details
Maintenance & Community
The README provides no details on community channels, active maintainers, or a public roadmap.
Licensing & Compatibility
The code is MIT licensed. The model weights are released under the NVIDIA Open Model License, which may restrict commercial use; review its terms carefully before deploying.
Limitations & Caveats
Setup requires accepting the Hugging Face model license and configuring authentication via HF_TOKEN. Performance on highly novel or out-of-distribution prompts is left to user experimentation and is not a guaranteed feature.