VisionClaw by sseanliu

Real-time AI assistant for smart glasses

Created 2 weeks ago


1,310 stars

Top 30.1% on SourcePulse

Project Summary

VisionClaw offers a real-time AI assistant for Meta Ray-Ban smart glasses, integrating voice, vision, and agentic actions via Gemini Live and optional OpenClaw. It targets users seeking hands-free, context-aware assistance, enabling actions through connected apps.

How It Works

An iOS app bridges Meta glasses (or iPhone camera) with the Gemini Live API. Video (~1fps JPEG) and audio (16kHz PCM) stream to Gemini, which processes input for real-time visual description and voice understanding. Gemini responds with audio or tool calls. The optional OpenClaw gateway translates these calls into actions across 56+ skills (messaging, web search, smart home), enabling agentic capabilities. This architecture prioritizes native audio handling and direct tool execution.
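The ~1fps video rate implies some throttling between the camera feed and the Gemini Live session. Here is a minimal sketch of such a throttle; the type and method names are hypothetical, as the repo's actual implementation is not shown in the README:

```swift
import Foundation

// Sketch of a ~1 fps throttle an app could apply before sending JPEG
// frames to Gemini Live. All names here are illustrative, not from the repo.
final class FrameThrottle {
    private let minInterval: TimeInterval
    private var lastSent: TimeInterval = -.infinity

    init(framesPerSecond: Double = 1.0) {
        self.minInterval = 1.0 / framesPerSecond
    }

    /// Returns true if a frame arriving at `now` should be forwarded.
    func shouldSend(at now: TimeInterval) -> Bool {
        guard now - lastSent >= minInterval else { return false }
        lastSent = now
        return true
    }
}
```

In practice the camera delivers ~30 fps, so a gate like this drops roughly 29 of every 30 frames before JPEG encoding, keeping bandwidth to the API low.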

Quick Start & Requirements

Clone the repo and open CameraAccess.xcodeproj in Xcode. Configure your Gemini API key (free from Google AI Studio) in GeminiConfig.swift. Requires iOS 17.0+ and Xcode 15.0+. You can test without glasses using the iPhone camera mode. For agentic actions, set up the OpenClaw gateway on a local machine, ensure the phone can reach it over the network, and enable its chatCompletions endpoint.
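The README names GeminiConfig.swift as the place for the key but does not show its contents. A hypothetical shape, with all field names being guesses:

```swift
import Foundation

// Hypothetical contents of GeminiConfig.swift -- only the file name and the
// need for a Gemini API key come from the README; the field names are guesses.
enum GeminiConfig {
    /// Paste the key generated at Google AI Studio here before building.
    static let apiKey = "YOUR_GEMINI_API_KEY"
}
```

If you fork the project, keep the real key out of version control (e.g. git-ignore a local config file) rather than committing it.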

Highlighted Details

  • Real-time streaming pipeline: ~1fps video, bidirectional audio between glasses/iPhone, app, and Gemini Live.
  • Gemini Live API: Native audio handling, bypassing separate STT/TTS.
  • OpenClaw extensibility: Access to 56+ skills for diverse task execution.
  • Flexible testing: Supports Meta Ray-Ban glasses and iPhone camera.
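For the OpenClaw hop, the app presumably relays Gemini tool calls to the gateway's chatCompletions endpoint. A hedged sketch of building such a request; the host, port, URL path, and JSON field names are all assumptions, not taken from the repo:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest on Linux
#endif

// Illustrative only: forwards a Gemini tool call to a local OpenClaw gateway.
// The chatCompletions endpoint is mentioned in the README; the URL path,
// default port, and payload shape below are assumptions.
struct ToolCall {
    let name: String       // e.g. "web_search"
    let arguments: String  // JSON-encoded arguments from Gemini
}

func openClawRequest(for call: ToolCall,
                     host: String = "localhost",
                     port: Int = 8080) -> URLRequest? {
    guard let url = URL(string: "http://\(host):\(port)/v1/chat/completions") else {
        return nil
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "messages": [[
            "role": "user",
            "content": "Run skill \(call.name) with arguments \(call.arguments)"
        ]]
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    return request
}
```

The chat-completions shape is convenient here because the gateway can treat each tool call as an ordinary message and route it to whichever of its 56+ skills matches.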

Maintenance & Community

The README provides no specific details on maintainers, contributors, community channels, or a roadmap.

Licensing & Compatibility

Licensed under terms in the root LICENSE file. Specifics on commercial use or closed-source compatibility are not detailed in the README.

Limitations & Caveats

Agentic actions require optional OpenClaw setup. Primary use case needs Meta Ray-Ban glasses; iPhone mode serves as a testing alternative. The project is iOS-specific and depends on external APIs (Gemini Live).

Health Check
Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)
3
Issues (30d)
9
Star History
1,319 stars in the last 19 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jinze Bai (Research Scientist at Alibaba Qwen), and 4 more.

self-operating-computer by OthersideAI

Top 0.1% on SourcePulse
10k stars
Framework for multimodal computer operation
Created 2 years ago
Updated 5 months ago