sseanliu / VisionClaw: Real-time AI assistant for smart glasses
Top 30.1% on SourcePulse
VisionClaw offers a real-time AI assistant for Meta Ray-Ban smart glasses, integrating voice, vision, and agentic actions via Gemini Live and optional OpenClaw. It targets users seeking hands-free, context-aware assistance, enabling actions through connected apps.
How It Works
An iOS app bridges Meta glasses (or the iPhone camera) with the Gemini Live API. Video (~1 fps JPEG) and audio (16 kHz PCM) stream to Gemini, which processes the input for real-time visual description and voice understanding. Gemini responds with audio or tool calls. The optional OpenClaw gateway translates these calls into actions across 56+ skills (messaging, web search, smart home), enabling agentic capabilities. This architecture prioritizes native audio handling and direct tool execution.
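The streaming side of this pipeline can be sketched roughly as follows. This is a minimal illustration, not code from the repo: the JSON field names (realtimeInput, mediaChunks, mimeType) and the PCM MIME type follow the Gemini Live bidirectional streaming format as commonly documented, but treat them as assumptions here.

```swift
import Foundation

// Sketch: wrap one JPEG frame and one audio chunk into a single
// client message for the Gemini Live WebSocket session.
// Field names are assumptions based on the Live API's streaming format.
struct MediaChunk: Codable {
    let mimeType: String
    let data: String  // base64-encoded bytes
}

struct RealtimeInput: Codable {
    let mediaChunks: [MediaChunk]
}

struct ClientMessage: Codable {
    let realtimeInput: RealtimeInput
}

/// Builds the JSON payload for one ~1 fps frame plus a 16 kHz PCM chunk.
func makeMessage(jpegFrame: Data, pcmAudio: Data) -> Data? {
    let message = ClientMessage(realtimeInput: RealtimeInput(mediaChunks: [
        MediaChunk(mimeType: "image/jpeg",
                   data: jpegFrame.base64EncodedString()),
        // 16 kHz mono PCM, matching the stream format described above.
        MediaChunk(mimeType: "audio/pcm;rate=16000",
                   data: pcmAudio.base64EncodedString()),
    ]))
    return try? JSONEncoder().encode(message)
}
```

Each encoded message would then be sent over the open WebSocket; Gemini's audio replies and tool calls arrive asynchronously on the same connection.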
Quick Start & Requirements
Clone the repo and open CameraAccess.xcodeproj in Xcode. Configure your Gemini API key (free from Google AI Studio) in GeminiConfig.swift. Requires iOS 17.0+ and Xcode 15.0+. Test without glasses using iPhone camera mode. For agentic actions, set up the OpenClaw gateway on a local machine, ensuring network access and enabling chatCompletions.
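The API-key step might look like the snippet below. Only the file name (GeminiConfig.swift) and the need for a Google AI Studio key come from the summary above; the type and property names are illustrative assumptions.

```swift
// GeminiConfig.swift (shape is hypothetical; only the file name
// and the API-key requirement come from the project docs)
enum GeminiConfig {
    // Paste the key generated in Google AI Studio here.
    static let apiKey = "YOUR_GEMINI_API_KEY"
}
```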
Maintenance & Community
The README does not name maintainers, community channels, or a roadmap.
Licensing & Compatibility
Licensed under terms in the root LICENSE file. Specifics on commercial use or closed-source compatibility are not detailed in the README.
Limitations & Caveats
Agentic actions require the optional OpenClaw setup. The primary use case requires Meta Ray-Ban glasses; iPhone camera mode is only a testing alternative. The project is iOS-specific and depends on an external API (Gemini Live).