macOS AI assistant answers questions about any application, in context and in audio
Top 34.2% on sourcepulse
macOSpilot is a voice- and vision-powered AI assistant designed to answer questions about any application running on macOS, directly in the context of the active window. It targets macOS users who want information or assistance quickly without switching applications, offering seamless, in-context, audio-based interaction.
How It Works
The assistant leverages a NodeJS/Electron architecture. Upon activation via a keyboard shortcut, it captures a screenshot of the active window and records user voice input. This data is sent to OpenAI's Whisper API for transcription, then to the GPT Vision API along with the screenshot for analysis. The AI's response is displayed as an overlay on the active window and read aloud using OpenAI's TTS API. This approach allows for application-agnostic context awareness and natural language interaction.
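The capture → transcribe → analyze → speak loop described above can be sketched as a small async composition. Everything below is illustrative, not the project's actual API: the steps are injected as parameters so the OpenAI-backed services (Whisper, GPT Vision, TTS) can be wired in or stubbed out.

```javascript
// Minimal sketch of the assistant's pipeline, assuming injectable service
// clients. In the real project these steps would be bound to Electron window
// capture, microphone recording, and the OpenAI Node SDK; the names here are
// hypothetical.
async function answerAboutScreen({ captureWindow, recordAudio, transcribe, analyze, display, speak }) {
  const screenshot = await captureWindow();            // image of the active window
  const audio = await recordAudio();                   // user's spoken question
  const question = await transcribe(audio);            // Whisper: speech -> text
  const answer = await analyze(question, screenshot);  // GPT Vision: text + image -> answer
  await display(answer);                               // overlay on the active window
  await speak(answer);                                 // TTS: read the answer aloud
  return answer;
}

module.exports = { answerAboutScreen };
```

Structuring the loop this way keeps the Electron glue (shortcuts, overlays) separate from the AI calls, which is one plausible reading of how an application-agnostic assistant stays simple.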
Quick Start & Requirements
Install dependencies with yarn install (or npm install), then launch the assistant with yarn start (or npm start) in the terminal. To package it as a standalone .app, run npm run package-mac. An OpenAI API key is required, since the assistant calls OpenAI's Whisper, GPT Vision, and TTS APIs.
Highlighted Details
Maintenance & Community
The project is maintained by a self-taught developer, @ralfelfving on Twitter/X, who also shares tutorials on YouTube. The project is open-source, with potential for community contributions.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The OpenAI API key is not stored encrypted. Conversation state is not persistent between application sessions. Screenshot and audio data are stored locally and overwritten, not automatically deleted. The developer notes that the code may not be "beautiful nor efficient."