gemini-cursor  by 13point5

Desktop AI cursor using Gemini 2.0 Flash

created 5 months ago
326 stars

Top 84.8% on sourcepulse

1 Expert Loves This Project
Project Summary

This project provides an AI-powered desktop cursor that interacts with the user's screen and voice, leveraging Google's Gemini 2.0 Flash model. It's designed for users needing assistance with complex visual information, website navigation, or real-time AI tutoring.

How It Works

The application uses Electron with React and TypeScript for its frontend and integrates with Google's Gemini API. It relies on Gemini's multimodal capabilities (vision, speech, and function calling) so that the AI cursor can perceive the screen, understand spoken commands, and respond verbally. The architecture prioritizes low latency for a real-time interactive experience.
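To make the function-calling step more concrete, here is a minimal sketch of how a model-issued tool call could be turned into a cursor position. It is not taken from the repository: the tool name `move_cursor`, the 0-to-1 coordinate convention, and the helper `resolveCursorTarget` are all assumptions made for illustration, and the declaration shape is only loosely modelled on JSON-schema-style tool descriptions.

```typescript
// Hypothetical sketch: none of these names come from the gemini-cursor repository.

// JSON-schema-style description of a tool the model is allowed to call.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Assumed tool: ask the app to move the AI cursor to a point on the screen.
const moveCursorTool: ToolDeclaration = {
  name: "move_cursor",
  description: "Move the AI cursor to a point on the screen.",
  parameters: {
    type: "object",
    properties: {
      x: { type: "number", description: "Horizontal position as a fraction (0 to 1) of screen width" },
      y: { type: "number", description: "Vertical position as a fraction (0 to 1) of screen height" },
    },
    required: ["x", "y"],
  },
};

// Assumed shape of a function call as it might arrive from the model.
interface FunctionCall {
  name: string;
  args: Record<string, unknown>;
}

interface Point {
  x: number;
  y: number;
}

// Validate a function call and convert its fractional coordinates to pixels.
// Returns null for unknown tools or malformed arguments.
function resolveCursorTarget(
  call: FunctionCall,
  screenWidth: number,
  screenHeight: number,
): Point | null {
  if (call.name !== moveCursorTool.name) return null;
  const { x, y } = call.args;
  if (typeof x !== "number" || typeof y !== "number") return null;
  if (!Number.isFinite(x) || !Number.isFinite(y)) return null;
  // Clamp so an out-of-range answer from the model cannot leave the screen.
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  return {
    x: Math.round(clamp(x) * screenWidth),
    y: Math.round(clamp(y) * screenHeight),
  };
}
```

Validating and clamping the arguments before acting on them is a deliberate choice in this sketch, since model output is not guaranteed to match the declared schema.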

Quick Start & Requirements

  • Install dependencies: npm install
  • Run the app: npm run start
  • Prerequisites: Node.js (v16+), npm, Gemini API key.
  • Setup involves cloning the repo, installing dependencies, running the app, and entering the API key.
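The prerequisites above can be expressed as a small preflight check. This is an illustrative sketch only: `checkPrerequisites` and `PreflightResult` are hypothetical names, not part of the project, and the API key is passed in as a plain argument because the source only says the key is entered after the app starts.

```typescript
// Hypothetical helper, not part of gemini-cursor.
interface PreflightResult {
  ok: boolean;
  problems: string[];
}

// Check a Node.js version string (e.g. "v18.17.0") and an API key value
// against the prerequisites listed above: Node.js v16+ and a Gemini API key.
function checkPrerequisites(
  nodeVersion: string,
  apiKey: string | undefined,
): PreflightResult {
  const problems: string[] = [];

  // Extract the major version, tolerating an optional leading "v".
  const match = /^v?(\d+)\./.exec(nodeVersion);
  const major = match ? Number(match[1]) : NaN;
  if (Number.isNaN(major)) {
    problems.push(`Could not parse Node.js version "${nodeVersion}"`);
  } else if (major < 16) {
    problems.push(`Node.js v16+ is required, found ${nodeVersion}`);
  }

  // An empty or whitespace-only key counts as missing.
  if (apiKey === undefined || apiKey.trim() === "") {
    problems.push("A Gemini API key is required");
  }

  return { ok: problems.length === 0, problems };
}
```

In a Node.js context, the first argument would typically be `process.version`.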

Highlighted Details

  • AI cursor can see, hear, and speak.
  • Real-time interaction with low latency.
  • Supports use cases like understanding diagrams, navigating websites, and AI tutoring.

Maintenance & Community

The project is maintained by @13point5. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The project is experimental and depends on the Gemini 2.0 Flash model, which may have limitations or change over time. A Gemini API key is required for the app to function.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days

Explore Similar Projects

Starred by Lysandre Debut (Chief Open-Source Officer at Hugging Face), Mckay Wrigley (Founder of Takeoff AI), and 1 more.

cheating-daddy by sohzm

1.8%
4k
Real-time AI assistance during calls
created 2 months ago
updated 4 days ago