aipointer  by gonemedia

Vision-capable AI overlay for instant desktop answers

Created 1 month ago
251 stars

Top 99.8% on SourcePulse

Project Summary

AIPointer is an open-source desktop overlay application designed to provide AI-powered assistance directly within the user's workflow across macOS, Windows, and Linux. It targets users who need quick, context-aware answers about on-screen content without switching applications. The primary benefit is enhanced productivity through instant, vision-capable AI interactions anchored to the cursor's current position, leveraging user-provided API keys for privacy and cost control.

How It Works

AIPointer runs as a desktop overlay activated by a hotkey. On activation, it captures a screenshot of the region around the cursor, optionally includes clipboard content, and sends both along with the user's text prompt to a configured vision-capable LLM provider (OpenRouter, Anthropic, OpenAI, or Gemini). The approach is described as "local-first" with no telemetry: requests go to the provider the user configures, using the user's own API key. Because interaction is anchored to the cursor, the model can interpret the visual context itself, which replaces manual copy-pasting or app switching.
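
The capture-and-send flow can be illustrated with a minimal TypeScript sketch. It is not taken from the AIPointer source: the function names, the 400-pixel box size, and the model identifier are hypothetical, and the request body follows the OpenAI-style chat completions shape (which OpenRouter also accepts) rather than any confirmed AIPointer format. The sketch only computes the capture region and builds the payload; it does not take a screenshot or make a network call.

```typescript
// Hypothetical types and names for illustration; not from the AIPointer codebase.
interface Rect { x: number; y: number; width: number; height: number; }
interface Point { x: number; y: number; }

// Center a fixed-size box on the cursor, then shift it so it stays on screen.
function regionAroundCursor(cursor: Point, screen: Rect, size: number): Rect {
  const width = Math.min(size, screen.width);
  const height = Math.min(size, screen.height);
  const clamp = (v: number, lo: number, hi: number) => Math.max(lo, Math.min(v, hi));
  const x = clamp(cursor.x - width / 2, screen.x, screen.x + screen.width - width);
  const y = clamp(cursor.y - height / 2, screen.y, screen.y + screen.height - height);
  return { x: Math.round(x), y: Math.round(y), width, height };
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Build an OpenAI-style chat completions body: one user message carrying
// the prompt (plus optional clipboard text) and the screenshot as a data URL.
function buildVisionRequest(
  model: string,
  prompt: string,
  pngBase64: string,
  clipboard?: string,
): ChatRequest {
  const text = clipboard ? `${prompt}\n\nClipboard:\n${clipboard}` : prompt;
  return {
    model,
    messages: [{
      role: "user",
      content: [
        { type: "text", text },
        { type: "image_url", image_url: { url: `data:image/png;base64,${pngBase64}` } },
      ],
    }],
  };
}

// Cursor near the left edge of a 1920x1080 screen: the box is shifted right.
const region = regionAroundCursor(
  { x: 10, y: 500 },
  { x: 0, y: 0, width: 1920, height: 1080 },
  400,
);
// region is { x: 0, y: 300, width: 400, height: 400 }

// "example/vision-model" is a placeholder, not a real model identifier.
const request = buildVisionRequest("example/vision-model", "What does this error mean?", "AAAA");
```

Clamping the box rather than cropping it keeps the screenshot a constant size, so the model always receives the same amount of surrounding context even when the cursor sits at a screen edge.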

Quick Start & Requirements

  • Installation: Download signed builds from aipointer.app or build from source (git clone, npm install, npm run dev).
  • Prerequisites:
    • macOS: Accessibility and Screen Recording permissions are required for global hotkey detection and region screenshots. Microphone access is optional for voice input. Finder & System Events automation is optional for file attachment.
    • Windows/Linux: Microphone access is requested on first use. Linux requires libsecret or KWallet for secure API key storage.
    • API Keys: Required for supported LLM providers (OpenRouter, Anthropic, OpenAI, Gemini).
  • Setup: First launch prompts for necessary permissions and API key configuration via an onboarding wizard.
  • Links: aipointer.app, GitHub Repository

Highlighted Details

  • Cursor-Anchored Vision: Answers questions about the specific content under the cursor, reducing the need for descriptive prompts.
  • Multi-Provider Support: Integrates with OpenRouter (recommended), Anthropic, OpenAI, and Google Gemini, with configurable primary and fallback provider chains.
  • File Attachment (v1.1.5): Supports attaching up to 5 files via manual selection or by selecting files in Finder/Explorer and triggering the hotkey. Supports image vision, text inlining, and document referencing.
  • Voice Interaction (v1.1.1): Includes Text-to-Speech (TTS) and Speech-to-Text (STT) with multiple engine options (System default, optional Local, optional Cloud). Supports voice commands and a hands-free conversation loop.
  • Agentic Tools: Includes built-in tools for fetching/opening URLs, copying text, saving responses, revealing workspace, reading clipboard, and launching desktop apps, all requiring user approval.
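
The primary-and-fallback provider chain mentioned above can be sketched as follows. This is an illustrative TypeScript sketch under stated assumptions, not AIPointer's actual implementation: the Provider interface, askWithFallback, and the stub providers are all hypothetical names, and the stubs stand in for real API clients so the example runs without network access.

```typescript
// Hypothetical provider interface; the real AIPointer configuration may differ.
interface Provider {
  name: string;
  ask(prompt: string): Promise<string>;
}

// Try each provider in order and return the first successful answer.
// If every provider fails, report all collected error messages.
async function askWithFallback(
  chain: Provider[],
  prompt: string,
): Promise<{ provider: string; answer: string }> {
  const errors: string[] = [];
  for (const p of chain) {
    try {
      return { provider: p.name, answer: await p.ask(prompt) };
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Stub providers: the primary always fails, the fallback echoes the prompt.
const primary: Provider = {
  name: "primary",
  ask: async () => { throw new Error("rate limited"); },
};
const fallback: Provider = {
  name: "fallback",
  ask: async (q) => `echo: ${q}`,
};

askWithFallback([primary, fallback], "hello")
  .then((r) => console.log(r.provider, r.answer)); // prints "fallback echo: hello"
```

Collecting the per-provider errors instead of discarding them makes it possible to tell a misconfigured key on the primary apart from a general outage when the whole chain fails.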

Maintenance & Community

Developed by Mario Simic, also the author of the larger AI agent project Skales. The project is actively maintained, with frequent updates noted in the README detailing new features and improvements. Community links are not explicitly provided beyond the GitHub repository.

Licensing & Compatibility

AIPointer is released under the Business Source License 1.1 (BSL-1.1), which allows free use for personal, educational, and internal business purposes. Source code is publicly available on GitHub. Commercial redistribution, SaaS hosting, white-labeling, bundling, or resale requires a written commercial license. The license automatically converts to Apache 2.0 on May 19, 2030.

Limitations & Caveats

AIPointer is designed for single-shot Q&A and bounded tools, not long-running autonomous tasks; users are directed to Skales for more complex automation. It requires an active internet connection as LLMs run server-side. File attachment via selection is not supported on Linux due to the lack of a universal file manager selection mechanism.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 8
  • Star History: 317 stars in the last 30 days
