Desktop AI cursor using Gemini 2.0 Flash
This project provides an AI-powered desktop cursor that interacts with the user's screen and voice, leveraging Google's Gemini 2.0 Flash model. It's designed for users needing assistance with complex visual information, website navigation, or real-time AI tutoring.
How It Works
The application uses Electron with React and TypeScript for its frontend and integrates with Google's Gemini API. It draws on Gemini's multimodal capabilities (vision, speech, and function calling) so the AI cursor can perceive the screen, understand spoken commands, and respond verbally. The architecture prioritizes low latency for a real-time interactive experience.
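To make the vision and function-calling combination concrete, here is a minimal sketch of how a screenshot, a user prompt, and a tool declaration might be packaged into one multimodal request. It is an illustration under assumptions, not the project's actual code: the field names are loosely modeled on the public Gemini `generateContent` request shape, while `buildScreenRequest` and the `move_cursor` tool are hypothetical names invented for this example. Since the project prioritizes low latency, it may well use a streaming connection rather than the one-shot payload shown here.

```typescript
// Assumed request shape, loosely modeled on Gemini's generateContent payload.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface FunctionDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

interface GenerateContentRequest {
  contents: { role: "user"; parts: Part[] }[];
  tools: { functionDeclarations: FunctionDeclaration[] }[];
}

// Hypothetical tool: lets the model ask the app to point the cursor somewhere.
const moveCursorTool: FunctionDeclaration = {
  name: "move_cursor",
  description: "Move the AI cursor to a point on the screen.",
  parameters: {
    type: "object",
    properties: {
      x: { type: "number", description: "Horizontal position in pixels" },
      y: { type: "number", description: "Vertical position in pixels" },
    },
    required: ["x", "y"],
  },
};

// Hypothetical helper: bundles a base64-encoded PNG screenshot and the user's
// (already transcribed) request into a single multimodal message.
function buildScreenRequest(
  screenshotBase64: string,
  userPrompt: string
): GenerateContentRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          { text: userPrompt },
          { inlineData: { mimeType: "image/png", data: screenshotBase64 } },
        ],
      },
    ],
    tools: [{ functionDeclarations: [moveCursorTool] }],
  };
}
```

A caller would pass the result of `buildScreenRequest(pngBase64, "Where is the settings button?")` to whichever Gemini client the application uses. If the model replies with a `move_cursor` call, the application would then move the on-screen cursor to the returned coordinates.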
Quick Start & Requirements
npm install
npm run start
Highlighted Details
Maintenance & Community
The project is maintained by @13point5. Further community or roadmap information is not detailed in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Users should verify compatibility for commercial or closed-source use.
Limitations & Caveats
The project is experimental and relies on the Gemini 2.0 Flash model, which may have limitations or undergo changes. A Gemini API key is required for the application to function.