AI-first process automation tool using multimodal models
OpenAdapt provides an open-source framework for AI-first process automation, enabling Large Multimodal Models (LMMs) to interact with desktop and web GUIs. It targets developers and researchers looking to automate repetitive GUI tasks by leveraging AI, offering an alternative to traditional RPA tools.
How It Works
OpenAdapt records user interactions, including screenshots and input events. It converts this data into a tokenized format, allowing transformer models to generate synthetic inputs for replaying actions. The system is model-agnostic and emphasizes learning from human demonstrations to create auto-generated prompts, grounding agents in existing processes to mitigate hallucinations and improve task completion.
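The record-then-tokenize idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenAdapt's actual data model or API: the ActionEvent class, its fields, and the serialize_event and build_prompt helpers are hypothetical names invented for this example. It shows one way recorded input events might be flattened into a text sequence that a transformer model could take as a prompt.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ActionEvent:
    """One recorded input event (hypothetical structure, not OpenAdapt's schema)."""
    name: str                            # e.g. "click" or "type"
    timestamp: float                     # seconds since the recording started
    x: Optional[int] = None              # pointer position, if the event has one
    y: Optional[int] = None
    text: Optional[str] = None           # typed text, if any
    screenshot_id: Optional[str] = None  # reference to the screenshot taken with the event


def serialize_event(event: ActionEvent) -> str:
    """Flatten one event into a compact text token such as <click x=10 y=20>."""
    parts = [event.name]
    if event.x is not None and event.y is not None:
        parts.append(f"x={event.x}")
        parts.append(f"y={event.y}")
    if event.text is not None:
        parts.append(f"text={event.text!r}")
    return "<" + " ".join(parts) + ">"


def build_prompt(task: str, events: List[ActionEvent]) -> str:
    """Build a prompt from a demonstration, with events in chronological order."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    lines = [f"Task: {task}", "Demonstration:"]
    lines += [serialize_event(e) for e in ordered]
    lines.append("Next action:")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        ActionEvent("type", 1.5, text="hello"),
        ActionEvent("click", 0.2, x=10, y=20),
    ]
    print(build_prompt("fill in the greeting field", demo))
```

Grounding the prompt in the recorded demonstration, rather than asking the model to plan from scratch, is the design choice the paragraph above describes; a real system would also need to encode the screenshots, which this sketch only references by ID.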
Quick Start & Requirements
Run with python -m openadapt.entrypoint. Record actions with python -m openadapt.record "description". Visualize a recording with python -m openadapt.visualize, or run the dashboard with python -m openadapt.app.dashboard.run. Replay actions using one of several strategies, for example python -m openadapt.replay NaiveReplayStrategy.
To record browser events, set RECORD_BROWSER_EVENTS to true.
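The commands above can be chained from a small script. The sketch below is a convenience wrapper, not part of OpenAdapt: the helper names openadapt_command and run_step and the WORKFLOW list are assumptions made for this example, and only the module names and arguments come from the commands listed above. It assumes OpenAdapt is installed in the same environment as the interpreter running the script. By default it only prints each command; pass dry_run=False to execute them.

```python
import subprocess
import sys
from typing import List

# Steps taken from the Quick Start text: (module, extra arguments).
WORKFLOW = [
    ("openadapt.record", ["description"]),
    ("openadapt.visualize", []),
    ("openadapt.replay", ["NaiveReplayStrategy"]),
]


def openadapt_command(module: str, *args: str) -> List[str]:
    """Build the argument list for `python -m <module> <args...>`."""
    return [sys.executable, "-m", module, *args]


def run_step(module: str, *args: str, dry_run: bool = True) -> List[str]:
    """Print the command, or run it and raise if it exits with a non-zero status."""
    cmd = openadapt_command(module, *args)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    for module, args in WORKFLOW:
        run_step(module, *args)  # dry run: prints each command without executing it
```

Passing the command as a list rather than a single shell string avoids quoting problems when the recording description contains spaces.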
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
For now, keep recordings short (under a minute): recording can be memory-intensive, and there is an open issue regarding memory leaks. Touchpad/trackpad gesture support is limited to cursor movement and clicks.