personaplex by NVIDIA

Full-duplex conversational speech model with real-time persona control

Created 3 weeks ago

3,801 stars

Top 12.7% on SourcePulse

1 Expert Loves This Project

Benjamin Bolte

Cofounder of K-Scale Labs

Project Summary

Summary

PersonaPlex offers real-time, full-duplex conversational speech models with precise persona and voice control. It targets developers building advanced conversational AI, enabling natural, low-latency interactions with consistent characterization via text and audio conditioning.

How It Works

Built on the Moshi architecture and the Helium LLM, PersonaPlex processes speech in both directions in real time, listening and speaking simultaneously. Text-based role prompts and audio-based voice conditioning keep the persona consistent. The model is trained on diverse synthetic and real dialogues and produces naturalistic, low-latency spoken interactions.

Quick Start & Requirements

After cloning the repository, install with pip install moshi/. (the trailing dot is part of the command). Before running, accept the PersonaPlex model license on Hugging Face and set export HF_TOKEN=<YOUR_HUGGINGFACE_TOKEN>. Launch a server with python -m moshi.server --ssl <SSL_DIR> to access the web UI. Offline evaluation uses moshi.offline with voice prompts and input WAVs.
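The launch step above can be wrapped in a small pre-flight check. The following is a minimal sketch, not part of the project: the helper names require_hf_token and build_server_command and the example directory are hypothetical, and the only command-line flag used is the --ssl option quoted above. It only builds and prints the command; it does not start the server.

```python
import os
import shlex
import sys


def require_hf_token(env=None):
    """Return the Hugging Face token from the environment, or raise if it is unset."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; accept the model license and export a token first"
        )
    return token


def build_server_command(ssl_dir, python=sys.executable):
    """Build the argument list for the server launch command quoted above."""
    return [python, "-m", "moshi.server", "--ssl", str(ssl_dir)]


# Hypothetical SSL directory, used only for illustration.
command = build_server_command("/tmp/personaplex-ssl")
print(shlex.join(command))
```

To actually launch, one option is to call require_hf_token() first and then pass the list to subprocess.run(command, check=True), so that a missing token fails early with a readable message rather than partway through the model download.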

Highlighted Details

Voice Variety: Features 18 pre-packaged voice embeddings across Natural (NAT) and Variety (VAR) categories, each with female and male voices (NATF0-3, NATM0-3, VARF0-4, VARM0-4).
Flexible Prompting: Supports detailed role prompts for assistant, customer-service (e.g., CitySan, Jerusalem Shakshuka), and casual-conversation scenarios, drawing on corpora such as Fisher English.
Emergent Generalization: Leverages the Helium LLM for plausible responses to out-of-distribution prompts, encouraging experimental use cases.
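The voice ranges listed above can be expanded into individual identifiers. This is a small sketch that assumes the ranges are inclusive (so "0-3" means indices 0, 1, 2, 3); the names VOICE_RANGES and expand_voice_ids are illustrative and do not come from the project. Read this way, the four ranges yield 18 identifiers.

```python
# Last index of each range as quoted above; treating the ranges as inclusive
# is an assumption about how to read "0-3" and "0-4".
VOICE_RANGES = {
    "NATF": 3,  # Natural, female
    "NATM": 3,  # Natural, male
    "VARF": 4,  # Variety, female
    "VARM": 4,  # Variety, male
}


def expand_voice_ids(ranges=VOICE_RANGES):
    """Expand each prefix into its numbered voice identifiers."""
    return [
        f"{prefix}{index}"
        for prefix, last in ranges.items()
        for index in range(last + 1)
    ]


voice_ids = expand_voice_ids()
print(len(voice_ids), voice_ids)
```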

Maintenance & Community

The README provides no details on community channels, active maintainers, or a public roadmap.

Licensing & Compatibility

Code is MIT licensed. Model weights are released under the NVIDIA Open Model license, which may restrict commercial use; review its terms carefully before deploying.

Limitations & Caveats

Setup requires accepting a Hugging Face model license and configuring token authentication. Performance on highly novel or out-of-distribution prompts is an area for user experimentation, not a guaranteed feature.

Health Check

Last Commit

4 days ago

Responsiveness

Inactive

Star History

3,826 stars in the last 23 days