insight by advaitpaliwal

AI assistant using Gemini 1.5 Pro for multimodal memory

created 1 year ago
320 stars

Top 86.0% on sourcepulse

Project Summary

Insight is a personal AI assistant that leverages Gemini 1.5 Pro to answer questions based on visual and auditory input, with memory capabilities. It is designed for users seeking an AI companion that can process real-time sensory data and recall past interactions.

How It Works

The system integrates Gemini 1.5 Pro for advanced reasoning and question answering. It processes input from a webcam and microphone, enabling it to understand and respond to queries related to the user's environment and conversations. Memory is managed to retain context across interactions.
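The context-retention idea above can be sketched as a small rolling buffer that prepends recent turns to each new question. This is only an illustration under stated assumptions: the class and method names (ConversationMemory, remember, build_prompt) are hypothetical, the buffer lives in memory only, and the summary does not say how the project itself stores or formats its memory. The resulting prompt string is what would be passed, together with any captured image or audio, to the model.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Turn:
    """One remembered exchange between the user and the assistant."""
    question: str
    answer: str


class ConversationMemory:
    """Hypothetical rolling memory: keeps only the most recent turns."""

    def __init__(self, max_turns: int = 5) -> None:
        # A deque with maxlen silently discards the oldest turn when full.
        self._turns = deque(maxlen=max_turns)

    def remember(self, question: str, answer: str) -> None:
        """Store one completed exchange."""
        self._turns.append(Turn(question, answer))

    def build_prompt(self, new_question: str) -> str:
        """Prefix the new question with the remembered turns as plain text."""
        lines = []
        for turn in self._turns:
            lines.append(f"User: {turn.question}")
            lines.append(f"Assistant: {turn.answer}")
        lines.append(f"User: {new_question}")
        return "\n".join(lines)
```

A bounded buffer keeps the prompt size predictable; a persistent store would be needed to recall interactions across restarts.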

Quick Start & Requirements

  • Install: Clone the repository and run pip install -r requirements.txt.
  • Prerequisites: Raspberry Pi 4 or newer, Logitech webcam, USB microphone, Sony headphones with jack, monitor, keyboard, mouse.
  • Software: pvporcupine, google-generativeai, SpeechRecognition, firebase-admin, google-cloud-texttospeech, picamera2.
  • Configuration: Requires API keys in config.py (based on config.example.py).
  • Usage: Run python main.py.
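Because the assistant depends on API keys in config.py, a quick check before running main.py can catch blank or missing entries early. The sketch below is an assumption-laden example, not part of the project: the key names GEMINI_API_KEY and PORCUPINE_ACCESS_KEY are hypothetical placeholders, and the real names should be taken from config.example.py.

```python
# Hypothetical key names; replace with the ones listed in config.example.py.
REQUIRED_KEYS = ("GEMINI_API_KEY", "PORCUPINE_ACCESS_KEY")


def load_settings(module) -> dict:
    """Collect the UPPER_CASE attributes of a config module into a dict."""
    return {
        name: getattr(module, name)
        for name in dir(module)
        if name.isupper()
    }


def missing_keys(settings: dict, required=REQUIRED_KEYS) -> list:
    """Return the required setting names that are absent or blank."""
    return [
        name for name in required
        if not str(settings.get(name) or "").strip()
    ]
```

With a filled-in config.py in place, calling missing_keys(load_settings(config)) after import config should return an empty list; any names it returns still need values.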

Highlighted Details

  • Utilizes Gemini 1.5 Pro for AI capabilities.
  • Designed for real-time audio-visual processing.
  • Includes memory functionality for context retention.
  • Requires specific hardware, notably a Raspberry Pi.

Maintenance & Community

The project is maintained by @advaitpaliwal. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The license is given only as the placeholder "[License Name]", so the actual license type and its implications for commercial use or closed-source linking are unspecified.

Limitations & Caveats

The project has significant hardware dependencies, requiring a Raspberry Pi and specific peripherals. The licensing is not clearly specified, which may impact commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
