ProctorAI by jam3scampbell

Multimodal AI app to discourage procrastination

Created 1 year ago

416 stars

Top 70.4% on SourcePulse

Project Summary

ProctorAI is a multimodal AI app that discourages procrastination by monitoring the user's screen and intervening when it detects unproductive behavior. It targets people who want a more intelligent and adaptable productivity tool than traditional site blockers, and it tailors its interventions to user-defined work sessions and rules.

How It Works

ProctorAI captures screenshots at user-defined intervals and processes them with multimodal LLMs (e.g., Claude-3.5-Sonnet, GPT-4o, LLaVA). Users specify their work goals and acceptable/unacceptable behaviors for each session, allowing for nuanced rule enforcement. If procrastination is detected, the AI can take control of the screen, issue personalized verbal warnings via text-to-speech, and enforce a cooldown period for the user to cease the distracting activity. A "two-tier" mode is recommended for cost efficiency, using a local model like LLaVA as a router to pre-screen images before sending them to a more powerful, expensive model.
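The "two-tier" routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not ProctorAI's actual code: the `TwoTierRouter` class, the `Classifier` signature, and both stub models are hypothetical, and the sketch assumes the expensive model is consulted only when the local model flags a screenshot.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (screenshot bytes, session rules) -> True if the
# screenshot looks like procrastination.
Classifier = Callable[[bytes, str], bool]


@dataclass
class TwoTierRouter:
    cheap: Classifier        # stand-in for a local model such as LLaVA
    expensive: Classifier    # stand-in for a hosted multimodal model
    expensive_calls: int = 0

    def is_procrastinating(self, screenshot: bytes, rules: str) -> bool:
        # Tier 1: the cheap model pre-screens every screenshot.
        if not self.cheap(screenshot, rules):
            return False
        # Tier 2: only flagged screenshots reach the expensive model.
        self.expensive_calls += 1
        return self.expensive(screenshot, rules)


# Stub classifiers that replace real model calls for this example.
def stub_local(screenshot: bytes, rules: str) -> bool:
    # Deliberately over-eager, as a cheap pre-screen might be.
    return b"video" in screenshot or b"docs" in screenshot


def stub_hosted(screenshot: bytes, rules: str) -> bool:
    return b"video" in screenshot


router = TwoTierRouter(cheap=stub_local, expensive=stub_hosted)
frames = [b"editor", b"docs page", b"video site"]
rules = "Write the report; no video sites."
verdicts = [router.is_procrastinating(frame, rules) for frame in frames]
```

In this sketch the cost saving comes from `expensive_calls` growing only for screenshots that the local model flags, so a session spent mostly on task rarely reaches the paid API.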

Quick Start & Requirements

Install via git clone, create a virtual environment, and run pip install -r requirements.txt.
Execute the GUI with ./run.sh.
Requires macOS (a Windows version is available in the windows branch).
API keys for chosen LLMs (OpenAI, Anthropic, Gemini) and Eleven Labs (for TTS) must be set as environment variables.
For "two-tier" mode, Ollama and the LLaVA model are required.
Official documentation and setup guide: https://github.com/jam3scampbell/ProctorAI
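Because the API keys are read from environment variables, a quick pre-flight check can catch a missing key before a session starts. The sketch below is illustrative only: the variable names in `REQUIRED_KEYS` and the `missing_keys` helper are assumptions, not names confirmed by the project, so check the README for the exact variables it expects.

```python
import os
from typing import Iterable, List, Mapping

# Assumed environment variable names; the project may use different ones.
REQUIRED_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "elevenlabs": "ELEVEN_LABS_API_KEY",
}


def missing_keys(providers: Iterable[str],
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the variable names that are unset or empty for the given providers."""
    return [REQUIRED_KEYS[p] for p in providers if not env.get(REQUIRED_KEYS[p])]
```

For example, calling `missing_keys(["anthropic", "elevenlabs"])` before launching the GUI would list whichever of those two variables still needs to be exported.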

Highlighted Details

Leverages multimodal LLMs for context-aware procrastination detection.
Supports personalized session specifications for flexible rule enforcement.
Aims to feel "alive", giving the user an intuitive sense of being monitored.
Offers optional text-to-speech for verbal interventions.
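One way to picture a personalized session specification is as a small record that is rendered into the instructions sent with each screenshot. The following is a hypothetical sketch: the `SessionSpec` fields, the `build_prompt` helper, and the prompt wording are all assumptions for illustration and do not reflect the project's real prompt format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionSpec:
    goal: str                                            # what the user intends to work on
    allowed: List[str] = field(default_factory=list)     # behaviors that are fine
    disallowed: List[str] = field(default_factory=list)  # behaviors that count as procrastination


def build_prompt(spec: SessionSpec) -> str:
    """Render the session rules as plain-text instructions for a multimodal model."""
    lines = [f"The user is working on: {spec.goal}"]
    if spec.allowed:
        lines.append("Allowed: " + "; ".join(spec.allowed))
    if spec.disallowed:
        lines.append("Not allowed: " + "; ".join(spec.disallowed))
    lines.append("Is the user procrastinating in this screenshot? Answer yes or no.")
    return "\n".join(lines)


spec = SessionSpec(
    goal="writing a literature review",
    allowed=["reading papers", "reference manager"],
    disallowed=["social media", "video sites"],
)
prompt = build_prompt(spec)
```

Keeping the rules as free text per session is what lets the same activity, such as watching a video, be acceptable in one session and flagged in another.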

Maintenance & Community

The project is under active development with a roadmap including finetuning models, enhanced session scheduling, and improved user interaction. Community links are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The main branch is macOS-only, though a Windows branch exists. The project relies on external LLM APIs, which can incur usage costs. Because it is under active development, breaking changes and incomplete features are possible.
