Next.js voice chat app for document interaction
This project provides a real-time voice chat interface for interacting with PDF documents using a Retrieval-Augmented Generation (RAG) system. It's built for developers and researchers looking to quickly prototype voice-enabled document Q&A applications, leveraging OpenAI's Realtime API and LlamaIndexTS.
How It Works
The application extends the openai/openai-realtime-console example by integrating LlamaIndexTS for RAG. It processes user-uploaded PDFs, generates embeddings for them, and uses these embeddings to retrieve relevant information in response to voice queries. The core advantage lies in its real-time, conversational interaction model, allowing users to speak naturally and receive spoken responses.
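The embed-then-retrieve flow described above can be sketched in a few lines. This is a minimal illustration, not the project's code and not the LlamaIndexTS API: the names `embed`, `buildIndex`, and `retrieve` are hypothetical, and a word-count vector stands in for the learned embeddings a real system would obtain from an embedding model.

```typescript
// Hypothetical types for illustration only; not taken from the project.
type Chunk = { id: string; text: string };
type IndexedChunk = Chunk & { vector: Map<string, number> };

// Toy stand-in for an embedding model: a sparse word-count vector.
function embed(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Euclidean length of a sparse vector.
function norm(v: Map<string, number>): number {
  let sum = 0;
  v.forEach((count) => {
    sum += count * count;
  });
  return Math.sqrt(sum);
}

// Cosine similarity between two sparse vectors; 0 if either is empty.
function cosineSimilarity(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  a.forEach((count, word) => {
    dot += count * (b.get(word) ?? 0);
  });
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Indexing step: embed every document chunk once, up front.
function buildIndex(chunks: Chunk[]): IndexedChunk[] {
  return chunks.map((chunk) => ({ ...chunk, vector: embed(chunk.text) }));
}

// Query step: embed the (transcribed) question and return the topK closest chunks.
function retrieve(query: string, index: IndexedChunk[], topK: number): IndexedChunk[] {
  const queryVector = embed(query);
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.chunk);
}

const index = buildIndex([
  { id: "shipping", text: "Shipping takes five business days." },
  { id: "warranty", text: "The warranty period is two years." },
]);
const context = retrieve("What is the warranty period?", index, 1);
console.log(context.map((chunk) => chunk.text));
```

In a pipeline like the one described, the retrieved chunk text would be handed to the language model as context for the spoken answer. The word-count vectors here only match on shared words, which is the main thing learned embeddings improve on.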
Quick Start & Requirements
Run the following commands in order:
npm install
npm run generate
npm run dev
An OpenAI API key is required (set in a .env file or via the OPENAI_API_KEY environment variable).
Highlighted Details
Maintenance & Community
This is a LlamaIndex project, indicating potential community support and development through the LlamaIndex ecosystem. Further community engagement details are available via the LlamaIndexTS GitHub repository.
Licensing & Compatibility
The project's specific license is not detailed in the README. Compatibility for commercial use or closed-source linking would require clarification of the underlying LlamaIndex and OpenAI API terms.
Limitations & Caveats
The README notes that the prompt for the API key on startup still needs fixing. Microphone access is required. The project is presented as an example, so it may not be production-ready without further development.