voice-chat-pdf by run-llama

Next.js voice chat app for document interaction

Created 1 year ago

331 stars

Top 82.8% on SourcePulse

Project Summary

This project provides a real-time voice chat interface for interacting with PDF documents using a Retrieval-Augmented Generation (RAG) system. It's built for developers and researchers looking to quickly prototype voice-enabled document Q&A applications, leveraging OpenAI's Realtime API and LlamaIndexTS.

How It Works

The application extends the openai/openai-realtime-console example by integrating LlamaIndexTS for RAG. It processes user-uploaded PDFs, generates embeddings for them, and uses those embeddings to retrieve the passages most relevant to each voice query. Its main advantage is the real-time, conversational interaction model: users speak naturally and receive spoken responses.
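
The retrieval step described above can be illustrated with a toy sketch. This is not the project's code: the actual application delegates embedding and retrieval to LlamaIndexTS, and every name below (`Chunk`, `cosineSimilarity`, `retrieveTopK`) is hypothetical. Hand-written three-dimensional vectors stand in for real embeddings, on the assumption that relevance is ranked by cosine similarity.

```typescript
// Toy retrieval step of a RAG pipeline. All names are hypothetical;
// the real project relies on LlamaIndexTS for embedding and retrieval.

interface Chunk {
  text: string;
  embedding: number[]; // in practice produced by an embedding model
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are closest to the query embedding.
function retrieveTopK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Hand-made vectors stand in for real document embeddings.
const chunks: Chunk[] = [
  { text: "Invoices are due within 30 days.", embedding: [1, 0, 0] },
  { text: "The warranty lasts two years.", embedding: [0, 1, 0] },
  { text: "Late payments incur a 2% fee.", embedding: [0.9, 0.1, 0] },
];

// A query vector pointing in the "payments" direction.
const topChunks = retrieveTopK([1, 0, 0], chunks, 2);
console.log(topChunks.map((c) => c.text));
```

In a full RAG loop, the retrieved chunk texts would be passed to the model as context for its spoken answer.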

Quick Start & Requirements

Install dependencies: npm install
Generate embeddings: npm run generate
Run development server: npm run dev
Prerequisites: OpenAI API key with Realtime API access (set via .env or OPENAI_API_KEY environment variable).
Local app URL (once the development server is running): http://localhost:3000
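
Since the app depends on OPENAI_API_KEY being set, a fail-fast check at startup can make a missing key obvious. The sketch below is an assumption about how such a check might look, not code from the repository; `requireOpenAIKey` is a hypothetical helper.

```typescript
// Hypothetical startup check; not taken from the project.
// Accepts any env-like object so it can be exercised without Node's process.env.
function requireOpenAIKey(env: Record<string, string | undefined>): string {
  const key = env["OPENAI_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "OPENAI_API_KEY is not set. Add it to .env or export it before starting the app."
    );
  }
  return key;
}

// In a Node.js entry point this would typically be called as:
//   const apiKey = requireOpenAIKey(process.env);
const exampleKey = requireOpenAIKey({ OPENAI_API_KEY: " sk-example " });
console.log(exampleKey.length > 0 ? "key found" : "key missing");
```

Passing the environment in as a parameter keeps the check easy to test in isolation.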

Highlighted Details

Real-time voice chat with PDF documents.
Integrates OpenAI's Realtime API and LlamaIndexTS.
Supports both Push-to-talk and Voice Activity Detection (VAD) modes.
Allows interruption of the model during conversation.
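
As a rough illustration of how the two input modes and interruption might map onto Realtime API client events, the sketch below builds the JSON payloads a client could send. It assumes the beta-era event shape, in which `session.update` carries a `turn_detection` field (`server_vad` for VAD, `null` for manual push-to-talk) and `response.cancel` stops an in-progress reply; field placement may differ in other API versions, so check the current OpenAI documentation. The helper names are hypothetical and are not taken from the project.

```typescript
// Sketch of client events for switching input modes (assumed beta-era shapes).
type ConversationMode = "push-to-talk" | "vad";

interface SessionUpdateEvent {
  type: "session.update";
  session: { turn_detection: { type: "server_vad" } | null };
}

// VAD lets the server detect when speech starts and stops;
// null turns detection off so the client decides when a turn ends.
function buildSessionUpdate(mode: ConversationMode): SessionUpdateEvent {
  return {
    type: "session.update",
    session: {
      turn_detection: mode === "vad" ? { type: "server_vad" } : null,
    },
  };
}

// Interrupting the model: ask the server to cancel the response in progress.
function buildInterruptEvent(): { type: "response.cancel" } {
  return { type: "response.cancel" };
}

console.log(JSON.stringify(buildSessionUpdate("vad")));
console.log(JSON.stringify(buildSessionUpdate("push-to-talk")));
console.log(JSON.stringify(buildInterruptEvent()));
```

In push-to-talk mode the client would also need to signal the end of each utterance itself, since the server no longer detects it.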

Maintenance & Community

The repository is published under the run-llama (LlamaIndex) organization, so support and further development most likely come through the LlamaIndex ecosystem. Community engagement details are available via the LlamaIndexTS GitHub repository.

Licensing & Compatibility

The README does not state the project's license. Anyone planning commercial use or closed-source linking should first confirm the repository's license and review the applicable LlamaIndex and OpenAI API terms.

Limitations & Caveats

The README notes that the app prompts for the API key on startup, a behavior it flags as needing a fix. Microphone access is required. The project is presented as an example, so it is unlikely to be production-ready without further development.
