Desktop AgentOS for automating Windows workflows via natural language
UFO² (Desktop AgentOS) aims to automate complex, multi-application workflows on Windows using natural language. It targets power users and developers seeking to streamline repetitive tasks by leveraging AI agents that can interact with both graphical user interfaces and native APIs. The primary benefit is robust, intelligent automation that goes beyond simple UI scripting.
How It Works
UFO² employs a multi-agent architecture, with a HostAgent orchestrating specialized AppAgents. Each AppAgent utilizes a ReAct loop, multimodal perception, and a "Knowledge Substrate" for retrieval-augmented generation (RAG) from diverse sources like documentation, web searches, and execution traces. A key innovation is the "Speculative Executor," which reduces LLM latency by batching and validating predicted actions against live UI states. It also features hybrid control detection, combining UI Automation (UIA) with visual analysis for broader compatibility.
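The speculative-execution idea described above can be illustrated with a minimal sketch. This is not UFO²'s actual code: the names PredictedAction, is_valid, and run_speculative_batch, and the dictionary-based UI snapshot, are hypothetical stand-ins chosen for illustration. The sketch assumes the model returns several predicted actions from a single call, and that each one is checked against a fresh reading of the UI before it runs, so a stale prediction stops the batch instead of acting on the wrong control.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PredictedAction:
    """One step predicted by the model (hypothetical structure)."""
    control_id: str      # identifier of the target control
    expected_name: str   # name the model expected that control to have
    operation: str       # e.g. "click" or "type"


# Assumed snapshot shape: control id -> {"name": str, "enabled": bool}
UISnapshot = Dict[str, Dict[str, object]]


def is_valid(action: PredictedAction, snapshot: UISnapshot) -> bool:
    """Check that the predicted target still exists, matches, and is enabled."""
    control = snapshot.get(action.control_id)
    return (
        control is not None
        and control["name"] == action.expected_name
        and bool(control["enabled"])
    )


def run_speculative_batch(
    actions: List[PredictedAction],
    read_ui: Callable[[], UISnapshot],
    execute: Callable[[PredictedAction], None],
) -> int:
    """Execute predicted actions in order, validating each against live state.

    Returns the number of actions executed. The loop stops at the first
    action that no longer matches the UI, at which point a caller would
    typically ask the model for a new plan.
    """
    executed = 0
    for action in actions:
        snapshot = read_ui()  # re-read the live UI before every step
        if not is_valid(action, snapshot):
            break
        execute(action)
        executed += 1
    return executed
```

In this sketch the latency saving comes from requesting one batch of actions per model call rather than one call per action; the validation step is what keeps that batching safe when the UI changes mid-sequence.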
Quick Start & Requirements
Install dependencies after cloning the repository: pip install -r requirements.txt
Configuration file: ufo/config/config.yaml
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The "Picture-in-Picture Desktop" feature is marked "coming soon" and is not yet available. The project's disclaimer sets out terms and conditions covering functionality and data handling.