Voice assistant for web interface
JARVIS is a personal voice assistant that converts spoken language to text, processes it with a Large Language Model (LLM), and then converts the response back to speech, all presented through a web interface. It is designed for users who want a customizable, AI-powered voice assistant integrated into a web application.
How It Works
The system runs as a pipeline. Speech captured from the microphone is transcribed to text with Deepgram. The text is sent to OpenAI's GPT-3 API, which generates a response. ElevenLabs converts that response to speech, and Pygame plays it back. The entire conversation is displayed in a web interface built with Taipy.
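The flow described above can be sketched as a chain of four stages. This is a minimal illustration, not the project's actual code: the class name VoicePipeline, the method handle_turn, and the four stage callables are all hypothetical, and the stubs stand in for the real Deepgram, OpenAI, ElevenLabs, and Pygame calls so the example runs without API keys.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VoicePipeline:
    """Hypothetical wrapper: each stage is injected as a plain callable."""

    transcribe: Callable[[bytes], str]   # speech-to-text stage (Deepgram in the project)
    generate: Callable[[str], str]       # LLM stage (OpenAI in the project)
    synthesize: Callable[[str], bytes]   # text-to-speech stage (ElevenLabs in the project)
    play: Callable[[bytes], None]        # audio playback stage (Pygame in the project)
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_turn(self, audio: bytes) -> str:
        """Run one conversation turn through all four stages."""
        user_text = self.transcribe(audio)
        reply_text = self.generate(user_text)
        self.play(self.synthesize(reply_text))
        # The history is what a web front end could render as the conversation.
        self.history.append(("user", user_text))
        self.history.append(("assistant", reply_text))
        return reply_text


# Stub stages (assumptions for illustration only; no network or audio involved).
played: List[bytes] = []
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
    play=played.append,
)

reply = pipeline.handle_turn(b"hello")
print(reply)
print(pipeline.history)
```

Passing the stages in as callables keeps each provider swappable and lets the flow be exercised with stubs before any real service is wired in.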
Quick Start & Requirements
Install: pip install -r requirements.txt
Run: python display.py (web interface), python main.py (voice assistant)
Prerequisites: .env file with API keys.
Highlighted Details
Maintenance & Community
No specific information on contributors, sponsorships, or community channels is provided in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial use or closed-source linking is not mentioned.
Limitations & Caveats
The project relies heavily on third-party API keys (Deepgram, OpenAI, ElevenLabs), which may incur costs. The README does not detail error handling or fallback mechanisms for API failures.
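Since the README does not describe how API failures are handled, one common way to guard a third-party call is to retry with exponential backoff and return a fallback value if every attempt fails. The sketch below is an assumption about how this could be done, not something the project implements: call_with_retry, its parameters, and the flaky_call stub are hypothetical names.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    fallback: T,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to `attempts` times; return `fallback` if all attempts fail."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Back off before the next try: base_delay, then 2x, then 4x, ...
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback


# Stub standing in for a remote API call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated API failure")
    return "ok"


delays: List[float] = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_call, fallback="unavailable", sleep=delays.append)


def always_fails() -> str:
    raise TimeoutError("simulated outage")


degraded = call_with_retry(always_fails, fallback="unavailable", sleep=lambda s: None)
print(result, delays, degraded)
```

In a voice assistant, the fallback could be a short canned reply so the user still hears something when a provider is unreachable.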