hertz-dev by Standard-Intelligence

Open-source base model for full-duplex conversational audio

Created 1 year ago

1,779 stars

Top 23.7% on SourcePulse

1 Expert Loves This Project

Luis Capelo

Cofounder of Lightning AI

Project Summary

Hertz-dev is an open-source base model for full-duplex conversational audio, enabling real-time, two-way voice communication. It is aimed at researchers and developers building interactive voice applications.

How It Works

The model handles full-duplex audio, meaning it can speak and listen at the same time. The repository includes scripts for offline inference, a client-server setup for live interaction, and a browser-based client built on Streamlit and WebRTC.

Quick Start & Requirements

Install: first install PyTorch with CUDA 12.1 support (pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121), then run pip install -r requirements.txt. For the WebRTC client, run pip install -r requirements_webrtc.txt.
Prerequisites: Python 3.10, CUDA 12.1 (recommended). Ubuntu users may need libportaudio.
Models: Automatically downloaded to ./ckpt/.
Docs: Blog post: https://si.inc/hertz-dev/
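
Because the tested combination is narrow (Python 3.10 with CUDA 12.1), a quick pre-flight check can catch a mismatched environment before the checkpoints download. The sketch below is not part of hertz-dev; the function names check_environment and detect_cuda_version are hypothetical, and it assumes PyTorch is the only source of CUDA version information.

```python
import sys


def check_environment(python_version, cuda_version):
    """Return warnings for versions outside the tested combination.

    python_version: a tuple such as sys.version_info or (3, 10, 12).
    cuda_version: a string such as "12.1", or None if no CUDA build is found.
    """
    warnings = []
    if tuple(python_version[:2]) != (3, 10):
        found = ".".join(str(part) for part in python_version[:2])
        warnings.append(f"Python {found} found; 3.10 is the tested version.")
    if cuda_version is None:
        warnings.append("No CUDA-enabled PyTorch build detected.")
    elif cuda_version.split(".")[:2] != ["12", "1"]:
        warnings.append(f"CUDA {cuda_version} found; 12.1 is the tested version.")
    return warnings


def detect_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch  # imported lazily so the check runs without PyTorch too
    except ImportError:
        return None
    return torch.version.cuda  # None on CPU-only builds


if __name__ == "__main__":
    for message in check_environment(sys.version_info, detect_cuda_version()):
        print("warning:", message)
```

Run as a script, it prints one warning per mismatch and nothing when the environment matches the tested setup.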

Highlighted Details

Enables full-duplex conversational audio.
Offers offline inference, client-server, and browser-based (Streamlit + WebRTC) interaction modes.
Tested primarily on Ubuntu (server) and macOS (client).

Maintenance & Community

No specific community channels or contributor information is detailed in the README.

Licensing & Compatibility

The license is not specified in the provided README.

Limitations & Caveats

Inference has been confirmed to work only with Python 3.10 and CUDA 12.1; other versions are less tested and may not work reliably. The client-server and WebRTC components are experimental. Hosting the Streamlit client remotely requires HTTPS and may require configuring a STUN server for WebRTC connections.
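
For the STUN point, WebRTC clients generally take an RTCConfiguration-style mapping with an iceServers list. The sketch below builds such a mapping in plain Python. The helper name build_rtc_configuration is hypothetical, the Google STUN address is only a commonly used public example, and it is an assumption that the project's browser client is built on the streamlit-webrtc package, whose webrtc_streamer function accepts an rtc_configuration argument of this shape.

```python
def build_rtc_configuration(stun_urls):
    """Build an RTCConfiguration-style dict listing the given STUN servers.

    stun_urls: iterable of URLs such as "stun:host:port".
    """
    urls = list(stun_urls)
    for url in urls:
        if not url.startswith("stun:"):
            raise ValueError(f"not a STUN URL: {url!r}")
    if not urls:
        # No servers configured; connections on the same network may still work.
        return {"iceServers": []}
    return {"iceServers": [{"urls": urls}]}


# Example value only; substitute a STUN server you control or trust.
EXAMPLE_CONFIG = build_rtc_configuration(["stun:stun.l.google.com:19302"])

# Assumption: with streamlit-webrtc installed, the dict could be passed as
#   webrtc_streamer(key="hertz", rtc_configuration=EXAMPLE_CONFIG)
# Check the project's own client script for how it actually configures ICE.
```

STUN only helps peers discover their public addresses; it does not remove the HTTPS requirement, since browsers restrict microphone access to secure origins.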

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

10 stars in the last 30 days