JARVIS-ChatGPT by gia-guar

Voice-based assistant with synthetic voices, including J.A.R.V.I.S

Created 2 years ago

445 stars

Top 67.4% on SourcePulse

Project Summary

This project provides a voice-activated conversational assistant, JARVIS-ChatGPT, designed for users seeking real-time tips, research assistance, and interactive capabilities. It leverages OpenAI's ChatGPT and Whisper, along with IBM Watson for synthetic voices, offering a sophisticated AI companion for productivity and information gathering.

How It Works

The assistant processes voice input using OpenAI Whisper, then sends transcribed text to ChatGPT for response generation. LangChain agents are employed for tasks requiring external data retrieval or actions. Voice output is handled by IBM Watson Text-to-Speech, with fallback to pyttsx3, and includes options for custom voices like J.A.R.V.I.S. via PicoVoice or ElevenLabs. A "Research Mode" integrates with Semantic Scholar and other tools to manage and query research papers.

Quick Start & Requirements

Installation: Run setup.bat (Windows/Linux) or follow manual steps.
Prerequisites: Python >= 3.9 and < 3.10, OpenAI API key, PicoVoice/ElevenLabs API key (optional), ffmpeg, CUDA >= 11.2, PyTorch compatible with CUDA, mic, speaker.
Setup: Manual installation involves environment setup, API key configuration, and potentially manual PyTorch installation based on CUDA version.
Running: Execute openai_api_chatbot.py.
Docs: Vicuna Installation Guide

Highlighted Details

Research Mode: Automates paper identification via Semantic Scholar, suggests related work, and allows querying research databases.
Custom Voices: Supports J.A.R.V.I.S. voice and other expressive voices via PicoVoice and ElevenLabs.
LangChain Integration: Enables internet searching, data retrieval, and other actions through agent toolkits.
Offline Mode: Option to use Vicuna (LLaMa-based) for reduced API costs and offline functionality.

Maintenance & Community

The project's primary developer, Gianmarco Guarnier, indicated a hiatus for thesis work until 2024 but remains available via GitHub Issues. The project history shows consistent development activity prior to this announcement.

Licensing & Compatibility

The README does not explicitly state a license. Usage of OpenAI and IBM Watson APIs is subject to their respective terms and pricing. Commercial use may be restricted by API provider terms.

Limitations & Caveats

The project is described as a "huge beta" with ongoing development. A known limitation is the ChatGPT-3.5-Turbo token limit, restricting conversation length. The "Research Mode" is noted as not fully stable. The developer is taking a break until 2024, potentially impacting immediate updates.

JARVIS-ChatGPT by gia-guar

Explore Similar Projects

ChatGPT-OpenAI-Smart-Speaker by Olney1

gpt-voice-conversation-chatbot by Adri6336

desktop-waifu by AlizerUncaged

xtts2-ui by BoltzmannEntropy

voice-assistant-whisper-chatgpt by bhattbhavesh91

AIUI by lspahija

JARVIS by AlexandreSajus

react-voice-agent by langchain-ai

Bing-GPT-Voice-Assistant by Ai-Austin

voice-assistant by linyiLYi

bolna by bolna-ai

speech_recognition by Uberi