JARVIS-ChatGPT  by gia-guar

Voice-based assistant with synthetic voices, including J.A.R.V.I.S

created 2 years ago
418 stars

Top 71.2% on sourcepulse

GitHubView on GitHub
Project Summary

This project provides a voice-activated conversational assistant, JARVIS-ChatGPT, designed for users seeking real-time tips, research assistance, and interactive capabilities. It leverages OpenAI's ChatGPT and Whisper, along with IBM Watson for synthetic voices, offering a sophisticated AI companion for productivity and information gathering.

How It Works

The assistant processes voice input using OpenAI Whisper, then sends transcribed text to ChatGPT for response generation. LangChain agents are employed for tasks requiring external data retrieval or actions. Voice output is handled by IBM Watson Text-to-Speech, with fallback to pyttsx3, and includes options for custom voices like J.A.R.V.I.S. via PicoVoice or ElevenLabs. A "Research Mode" integrates with Semantic Scholar and other tools to manage and query research papers.

Quick Start & Requirements

  • Installation: Run setup.bat (Windows/Linux) or follow manual steps.
  • Prerequisites: Python >= 3.9 and < 3.10, OpenAI API key, PicoVoice/ElevenLabs API key (optional), ffmpeg, CUDA >= 11.2, PyTorch compatible with CUDA, mic, speaker.
  • Setup: Manual installation involves environment setup, API key configuration, and potentially manual PyTorch installation based on CUDA version.
  • Running: Execute openai_api_chatbot.py.
  • Docs: Vicuna Installation Guide

Highlighted Details

  • Research Mode: Automates paper identification via Semantic Scholar, suggests related work, and allows querying research databases.
  • Custom Voices: Supports J.A.R.V.I.S. voice and other expressive voices via PicoVoice and ElevenLabs.
  • LangChain Integration: Enables internet searching, data retrieval, and other actions through agent toolkits.
  • Offline Mode: Option to use Vicuna (LLaMa-based) for reduced API costs and offline functionality.

Maintenance & Community

The project's primary developer, Gianmarco Guarnier, indicated a hiatus for thesis work until 2024 but remains available via GitHub Issues. The project history shows consistent development activity prior to this announcement.

Licensing & Compatibility

The README does not explicitly state a license. Usage of OpenAI and IBM Watson APIs is subject to their respective terms and pricing. Commercial use may be restricted by API provider terms.

Limitations & Caveats

The project is described as a "huge beta" with ongoing development. A known limitation is the ChatGPT-3.5-Turbo token limit, restricting conversation length. The "Research Mode" is noted as not fully stable. The developer is taking a break until 2024, potentially impacting immediate updates.

Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
1
Star History
11 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.