insight by advaitpaliwal

AI assistant using Gemini 1.5 Pro for multimodal memory

Created 1 year ago
320 stars

Top 84.6% on SourcePulse

Project Summary

Insight is a personal AI assistant that leverages Gemini 1.5 Pro to answer questions based on visual and auditory input, with memory capabilities. It is designed for users seeking an AI companion that can process real-time sensory data and recall past interactions.

How It Works

The system integrates Gemini 1.5 Pro for advanced reasoning and question answering. It processes input from a webcam and microphone, enabling it to understand and respond to queries related to the user's environment and conversations. Memory is managed to retain context across interactions.
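The memory step can be pictured with a minimal sketch. It is not taken from the repository: the names `ConversationMemory`, `answer_with_memory`, and `ask_model` are hypothetical, the camera input is stood in for by a plain text description, and the model call is passed in as a function so that no particular Gemini API is assumed.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Turn:
    """One past exchange between the user and the assistant."""
    question: str
    answer: str


class ConversationMemory:
    """Keeps only the most recent turns (hypothetical helper, not from the repo)."""

    def __init__(self, max_turns: int = 5) -> None:
        # A bounded deque silently drops the oldest turn once it is full.
        self._turns: Deque[Turn] = deque(maxlen=max_turns)

    def remember(self, question: str, answer: str) -> None:
        self._turns.append(Turn(question, answer))

    def as_context(self) -> str:
        lines: List[str] = []
        for turn in self._turns:
            lines.append(f"User: {turn.question}")
            lines.append(f"Assistant: {turn.answer}")
        return "\n".join(lines)


def answer_with_memory(
    question: str,
    scene_description: str,
    memory: ConversationMemory,
    ask_model: Callable[[str], str],
) -> str:
    """Build a prompt from past turns plus the current input, then store the reply.

    `scene_description` is an assumed text stand-in for the webcam frame, and
    `ask_model` is an assumed stand-in for whatever function calls the model.
    """
    parts: List[str] = []
    context = memory.as_context()
    if context:
        parts.append("Earlier conversation:\n" + context)
    parts.append("Current scene: " + scene_description)
    parts.append("User: " + question)
    answer = ask_model("\n\n".join(parts))
    memory.remember(question, answer)
    return answer
```

Bounding the buffer keeps the prompt size predictable; a real assistant might instead persist turns to external storage, which this sketch does not attempt.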

Quick Start & Requirements

  • Install: Clone the repository and run pip install -r requirements.txt.
  • Prerequisites: Raspberry Pi 4 or newer, Logitech webcam, USB microphone, Sony headphones with an audio jack, monitor, keyboard, mouse.
  • Software: pvporcupine, google-generativeai, SpeechRecognition, firebase-admin, google-cloud-texttospeech, picamera2.
  • Configuration: Requires API keys in config.py (based on config.example.py).
  • Usage: Run python main.py.
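Because config.py is written by hand from config.example.py, a small check for forgotten settings can help before the first run. The sketch below is an illustration, not part of the project: it assumes both files define their settings as top-level assignments, and the helper names `assigned_names` and `missing_settings` are hypothetical.

```python
import ast
from typing import List


def assigned_names(source: str) -> List[str]:
    """Return the variable names assigned at the top level of a Python source string."""
    names: List[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign):
            targets = [node.target]
        else:
            continue  # imports, functions, and so on are not settings
        for target in targets:
            if isinstance(target, ast.Name):
                names.append(target.id)
    return names


def missing_settings(example_source: str, config_source: str) -> List[str]:
    """List names defined in the example file but absent from the real config.

    Assumption: every setting is a plain top-level assignment in both files.
    """
    defined = set(assigned_names(config_source))
    return [name for name in assigned_names(example_source) if name not in defined]
```

Reading both files with `pathlib.Path(...).read_text()` and passing the contents to `missing_settings` returns an empty list when every name from the example is present. The check only compares names; it does not validate that the keys themselves work.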

Highlighted Details

  • Utilizes Gemini 1.5 Pro for AI capabilities.
  • Designed for real-time audio-visual processing.
  • Includes memory functionality for context retention.
  • Requires specific hardware, notably a Raspberry Pi.

Maintenance & Community

The project is maintained by @advaitpaliwal. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The README names the license only with the placeholder "[License Name]", so the actual license and its implications for commercial use or closed-source linking are unspecified.

Limitations & Caveats

The project has significant hardware dependencies, requiring a Raspberry Pi and specific peripherals. The licensing is not clearly specified, which may impact commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
