QuickAgent by gkamradt

Voice bot demo using speech and language models

Created 1 year ago

383 stars

Top 74.7% on SourcePulse

Project Summary

QuickAgent is an alpha-stage Python demo showcasing a voice-controlled chatbot. It integrates Text-to-Speech (TTS), Speech-to-Text (STT), and a Large Language Model (LLM) for conversational interaction, targeting users interested in real-time voice AI applications.

How It Works

The bot leverages streaming audio processing for both STT and TTS to minimize latency. It's configured to use Deepgram for audio services and Groq for its LLM, enabling a fluid, conversational experience. The core logic is contained within the QuickAgent.py script, with reusable components in the building_blocks directory.

Quick Start & Requirements

Primary install / run command: python3 QuickAgent.py
Prerequisites: Deepgram API key, Groq API key.
Setup time: Minimal, assuming API keys are readily available.

Highlighted Details

Utilizes streaming for STT and TTS for reduced latency.
Integrates Deepgram for audio processing and Groq for LLM.
Demonstrates a conversational voice bot architecture.

Maintenance & Community

No specific community channels, contributors, or roadmap details are provided in the README.

Licensing & Compatibility

The license is not specified in the README.

Limitations & Caveats

This is an alpha demo, indicating potential instability and incomplete features. The project relies on specific third-party services (Deepgram, Groq) which may incur costs and require API key management.

QuickAgent by gkamradt

Explore Similar Projects

LLaMA-Omni2 by ictnlp

local_llm_assistant by nickbild

S.A.T.U.R.D.A.Y by GRVYDEV

LLMVoX by mbzuai-oryx

VITA-Audio by VITA-MLLM

dia2 by nari-labs

fast-voice-assistant by dsa

ChatWaifu by cjyaddone

10x by 0xCrunchyy

ichigo by janhq

AI-Waifu-Vtuber by ardha27

mini-omni by gpt-omni