ProctorAI by jam3scampbell

Multimodal AI app to discourage procrastination

created 1 year ago
402 stars

Top 73.2% on sourcepulse

Project Summary

ProctorAI is a multimodal AI app that combats procrastination by monitoring the user's screen and intervening when it detects unproductive behavior. It targets people who want a smarter, more adaptable productivity tool than traditional site blockers, and it tailors its interventions to user-defined work sessions and rules.

How It Works

ProctorAI captures screenshots at user-defined intervals and processes them with multimodal LLMs (e.g., Claude-3.5-Sonnet, GPT-4o, LLaVA). Users specify their work goals and acceptable/unacceptable behaviors for each session, allowing for nuanced rule enforcement. If procrastination is detected, the AI can take control of the screen, issue personalized verbal warnings via text-to-speech, and enforce a cooldown period for the user to cease the distracting activity. A "two-tier" mode is recommended for cost efficiency, using a local model like LLaVA as a router to pre-screen images before sending them to a more powerful, expensive model.
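The "two-tier" routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Classifier` signature, the `TwoTierDetector` class, and the `judge_calls` counter are all hypothetical names, and the two models are passed in as plain callables so the sketch runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical classifier signature: takes screenshot bytes and the session
# rules, and returns True when the screenshot looks like procrastination.
Classifier = Callable[[bytes, str], bool]


@dataclass
class TwoTierDetector:
    router: Classifier    # cheap local model, e.g. LLaVA
    judge: Classifier     # stronger, more expensive hosted model
    judge_calls: int = 0  # number of screenshots that reached the judge

    def is_procrastinating(self, screenshot: bytes, rules: str) -> bool:
        # Tier 1: the local router pre-screens every screenshot.
        if not self.router(screenshot, rules):
            return False
        # Tier 2: only flagged screenshots go to the expensive model,
        # whose verdict is treated as final.
        self.judge_calls += 1
        return self.judge(screenshot, rules)


if __name__ == "__main__":
    # Stub classifiers standing in for real model calls.
    detector = TwoTierDetector(
        router=lambda shot, rules: shot == b"video-site",
        judge=lambda shot, rules: True,
    )
    for shot in (b"code-editor", b"video-site", b"code-editor"):
        print(shot, detector.is_procrastinating(shot, "work on the report"))
    print("judge calls:", detector.judge_calls)
```

In this arrangement the expensive model is only consulted when the router raises a flag, so a router that is too lenient misses procrastination, while one that is too strict only increases cost.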

Quick Start & Requirements

  • Clone the repository with git clone, create a virtual environment, and run pip install -r requirements.txt.
  • Execute the GUI with ./run.sh.
  • Requires macOS (a Windows version is available in the windows branch).
  • API keys for the chosen LLM providers (OpenAI, Anthropic, Gemini) and for ElevenLabs (for TTS) must be set as environment variables.
  • For "two-tier" mode, Ollama and the LLaVA model are required.
  • Official documentation and setup guide: https://github.com/jam3scampbell/ProctorAI
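Because the API keys are read from environment variables, a quick pre-flight check can catch a missing key before launching the GUI. The helper below is a hypothetical sketch, not part of ProctorAI. The two names in the usage example are the defaults read by the OpenAI and Anthropic Python SDKs; the exact names ProctorAI expects for every provider are an assumption here and should be taken from the README.

```python
import os
from typing import Iterable, Mapping


def missing_env_vars(
    names: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not environ.get(name, "").strip()]


if __name__ == "__main__":
    # Assumed variable names; adjust to the providers actually in use.
    required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
    missing = missing_env_vars(required)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```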

Highlighted Details

  • Leverages multimodal LLMs for context-aware procrastination detection.
  • Supports personalized session specifications for flexible rule enforcement.
  • Aims for an "alive" design that gives the user an intuitive sense of being monitored.
  • Offers optional text-to-speech for verbal interventions.

Maintenance & Community

The project is under active development, with a roadmap that includes fine-tuning models, enhanced session scheduling, and improved user interaction. The README does not provide community links.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is currently macOS-specific, though a Windows branch exists. It relies on external LLM APIs, which can incur usage costs. Its "active development" status means breaking changes and incomplete features are possible.
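To get a feel for the API cost exposure, the number of paid-model calls can be estimated from the screenshot interval. The function below is a back-of-the-envelope sketch: `interval_seconds` and `escalation_rate` are hypothetical parameters, the values in the usage example are illustrative rather than measured, and no provider pricing is assumed.

```python
def hourly_paid_calls(interval_seconds: float, escalation_rate: float = 1.0) -> float:
    """Estimate paid-model calls per hour of monitoring.

    interval_seconds: time between screenshots.
    escalation_rate: fraction of screenshots sent to the paid model
        (1.0 when every screenshot goes to it, lower when a local
        router filters screenshots first).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return 3600.0 / interval_seconds * escalation_rate


if __name__ == "__main__":
    # Illustrative values only: one screenshot every 30 seconds.
    print(hourly_paid_calls(30))        # every screenshot reaches the paid model
    print(hourly_paid_calls(30, 0.25))  # a router forwards one in four
```

Multiplying the result by a provider's per-image price gives a rough hourly cost.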

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 44 stars in the last 90 days
