Multimodal AI app to discourage procrastination
ProctorAI is a multimodal AI designed to combat procrastination by monitoring user screen activity and intervening when unproductive behavior is detected. It targets individuals seeking a more intelligent and adaptable productivity tool than traditional site blockers, offering personalized interventions based on user-defined work sessions and rules.
How It Works
ProctorAI captures screenshots at user-defined intervals and processes them with multimodal LLMs (e.g., Claude-3.5-Sonnet, GPT-4o, LLaVA). Users specify their work goals and acceptable/unacceptable behaviors for each session, allowing for nuanced rule enforcement. If procrastination is detected, the AI can take control of the screen, issue personalized verbal warnings via text-to-speech, and enforce a cooldown period for the user to cease the distracting activity. A "two-tier" mode is recommended for cost efficiency, using a local model like LLaVA as a router to pre-screen images before sending them to a more powerful, expensive model.
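The "two-tier" flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not ProctorAI's actual code: the `Session` class, the `Classifier` signature, and `is_procrastinating` are hypothetical names, and the two models are passed in as plain callables so that no particular LLM client API is assumed.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical shape shared by both tiers: given screenshot bytes and a
# prompt, return True when the model judges the user to be procrastinating.
Classifier = Callable[[bytes, str], bool]


@dataclass
class Session:
    """User-defined work session: a goal plus allowed/disallowed behaviors."""
    goal: str
    allowed: list[str]
    disallowed: list[str]

    def prompt(self) -> str:
        # Fold the session rules into one instruction for the model.
        return (
            f"The user is working on: {self.goal}\n"
            f"Allowed: {', '.join(self.allowed)}\n"
            f"Not allowed: {', '.join(self.disallowed)}\n"
            "Is the user procrastinating? Answer yes or no."
        )


def is_procrastinating(
    screenshot: bytes,
    session: Session,
    router: Classifier,  # cheap local model (assumed), e.g. LLaVA
    judge: Classifier,   # stronger, paid model (assumed)
) -> bool:
    """Two-tier check: only escalate to the judge if the router flags the image."""
    prompt = session.prompt()
    if not router(screenshot, prompt):
        # The router saw nothing suspicious, so the expensive call is skipped.
        return False
    # The router flagged the image; the stronger model makes the final call.
    return judge(screenshot, prompt)
```

In this sketch the expensive model is only called for screenshots the router has already flagged, which is where the cost saving comes from. The trade-off is that anything the router misses is never reviewed by the stronger model.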
Quick Start & Requirements
Clone the repository with git clone, create a virtual environment, and run pip install -r requirements.txt.
Start the app with ../run.sh.
Windows support is on the windows branch.
Highlighted Details
Maintenance & Community
The project is under active development with a roadmap including finetuning models, enhanced session scheduling, and improved user interaction. Community links are not explicitly provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project currently targets macOS, though a Windows branch exists. It relies on external LLM APIs, which can incur costs. Its "active development" status means breaking changes or incomplete features are possible.