insight by advaitpaliwal

AI assistant using Gemini 1.5 Pro for multimodal memory

Created 1 year ago

320 stars

Top 84.9% on SourcePulse

Project Summary

Insight is a personal AI assistant that leverages Gemini 1.5 Pro to answer questions based on visual and auditory input, with memory capabilities. It is designed for users seeking an AI companion that can process real-time sensory data and recall past interactions.

How It Works

The system integrates Gemini 1.5 Pro for advanced reasoning and question answering. It processes input from a webcam and microphone, enabling it to understand and respond to queries related to the user's environment and conversations. Memory is managed to retain context across interactions.
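The memory behaviour described above can be sketched as a rolling store of past turns that is prepended to each new question. This is a minimal illustration, not the project's actual code: ConversationMemory, build_prompt, ask, and echo_model are hypothetical names, the camera frame is treated as opaque bytes, and the model is a plain callable standing in for a Gemini 1.5 Pro request.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Tuple


@dataclass
class ConversationMemory:
    """Hypothetical rolling store of past (question, answer) turns."""
    max_turns: int = 5
    turns: Deque[Tuple[str, str]] = field(default_factory=deque)

    def remember(self, question: str, answer: str) -> None:
        # Append the newest turn and drop the oldest ones beyond the limit.
        self.turns.append((question, answer))
        while len(self.turns) > self.max_turns:
            self.turns.popleft()

    def as_context(self) -> str:
        # Render stored turns as plain text for inclusion in a prompt.
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)


def build_prompt(question: str, memory: ConversationMemory) -> str:
    """Combine remembered turns with the current question."""
    context = memory.as_context()
    if context:
        return f"Earlier conversation:\n{context}\n\nCurrent question: {question}"
    return f"Current question: {question}"


def ask(question: str, frame: bytes, memory: ConversationMemory,
        model: Callable[[str, bytes], str]) -> str:
    """Send the prompt and the current camera frame to the model, then store the turn."""
    answer = model(build_prompt(question, memory), frame)
    memory.remember(question, answer)
    return answer


def echo_model(prompt: str, frame: bytes) -> str:
    """Stand-in for a multimodal model call (assumption: the real model is remote)."""
    return f"[{len(frame)} image bytes] {prompt.splitlines()[-1]}"
```

Passing the model in as a callable keeps the sketch runnable without API keys; a real deployment would replace echo_model with a function that forwards the prompt and frame to the hosted model.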

Quick Start & Requirements

Install: Clone the repository and run pip install -r requirements.txt.
Prerequisites: Raspberry Pi 4 or newer, Logitech webcam, USB microphone, Sony headphones with jack, monitor, keyboard, mouse.
Software: pvporcupine, google-generativeai, SpeechRecognition, firebase-admin, google-cloud-texttospeech, picamera2.
Configuration: Requires API keys in config.py (based on config.example.py).
Usage: Run python main.py.
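Before running main.py, it can help to confirm that the listed packages are importable and that config.py has been created. The following is a small preflight sketch, not part of the repository: the import names are the ones these PyPI packages are commonly imported under (an assumption worth verifying against requirements.txt), and missing_modules and config_present are hypothetical helpers.

```python
import importlib.util
from pathlib import Path
from typing import Iterable, List

# Assumed import names for the packages listed under "Software".
REQUIRED_MODULES = (
    "pvporcupine",                # pvporcupine
    "google.generativeai",        # google-generativeai
    "speech_recognition",         # SpeechRecognition
    "firebase_admin",             # firebase-admin
    "google.cloud.texttospeech",  # google-cloud-texttospeech
    "picamera2",                  # picamera2
)


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            # For dotted names, find_spec imports the parent package first
            # and raises ModuleNotFoundError if that parent is absent.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


def config_present(repo_dir: str = ".") -> bool:
    """Check that config.py exists in the given directory."""
    return (Path(repo_dir) / "config.py").is_file()
```

Calling missing_modules() with no arguments checks the full list above; an empty result together with config_present() returning True suggests the environment is ready for python main.py.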

Highlighted Details

Utilizes Gemini 1.5 Pro for AI capabilities.
Designed for real-time audio-visual processing.
Includes memory functionality for context retention.
Requires specific hardware, notably a Raspberry Pi.

Maintenance & Community

The project is maintained by @advaitpaliwal. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The README's license statement still contains the template placeholder "[License Name]", so no actual license is named. The terms for commercial use or closed-source linking are therefore unspecified.

Limitations & Caveats

The project has significant hardware dependencies, requiring a Raspberry Pi and specific peripherals. The licensing is not clearly specified, which may impact commercial adoption.

Explore Similar Projects

gemini-cursor by 13point5

alibabacloud-bailian-speech-demo by aliyun

ltu by YuanGongND

vui by fluxions-ai

dia2 by nari-labs

ai_virtual_mate_web by swordswind

SALMONN by bytedance

Qwen-Audio by QwenLM

mini-omni2 by gpt-omni

AI0x0.com by mushan0x0

Linly-Talker by Kedreamix

mi-gpt by idootop