AI agent for Windows GUI automation
Top 42.2% on SourcePulse
Windows-Use is an open-source automation agent designed to enable AI models to interact directly with the Windows graphical user interface (GUI). It bridges the gap between Large Language Models (LLMs) and the Windows operating system, allowing AI agents to perform tasks like opening applications, clicking buttons, typing text, and capturing UI state without relying solely on traditional computer vision. This empowers developers to integrate sophisticated automation capabilities into AI-driven applications running on Windows.
How It Works
The agent interacts directly with the Windows GUI layer rather than going through the complex computer vision pipelines typically used for UI automation, and it abstracts the OS interaction layer away from the model. As a result, any LLM can use the agent's capabilities for automation, which keeps the design flexible and reduces dependence on specialized vision models for task execution.
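The idea of separating the model from the OS interaction layer can be illustrated with a small, self-contained sketch. This is not the Windows-Use API: every name here (Element, Action, FakeDesktop, run_agent, make_scripted_model) is hypothetical, the desktop is an in-memory stand-in that makes no real Windows calls, and the "model" simply replays a fixed plan where a real LLM would pick the next action from the text description of the UI.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Element:
    """One interactive control in a hypothetical UI snapshot."""
    id: int
    role: str
    name: str
    text: str = ""


@dataclass
class Action:
    """A structured action the model asks the agent to perform."""
    kind: str          # "click", "type", or "done"
    target: int = -1   # element id; unused for "done"
    text: str = ""     # payload for "type"


class FakeDesktop:
    """In-memory stand-in for the OS interaction layer (assumption, not real Windows)."""

    def __init__(self, elements: List[Element]) -> None:
        self.elements: Dict[int, Element] = {e.id: e for e in elements}
        self.log: List[str] = []

    def describe(self) -> str:
        # Serialise the UI state as plain text so a text-only model can read it.
        return "\n".join(
            f"[{e.id}] {e.role} '{e.name}' text='{e.text}'"
            for e in self.elements.values()
        )

    def apply(self, action: Action) -> None:
        element = self.elements[action.target]
        if action.kind == "click":
            self.log.append(f"click {element.name}")
        elif action.kind == "type":
            element.text += action.text
            self.log.append(f"type {element.name}")
        else:
            raise ValueError(f"unknown action: {action.kind}")


# Any callable with this shape can act as the model: (task, ui_state) -> Action.
Model = Callable[[str, str], Action]


def run_agent(task: str, desktop: FakeDesktop, model: Model, max_steps: int = 10) -> List[str]:
    """Observe the UI as text, ask the model for an action, apply it, repeat."""
    for _ in range(max_steps):
        action = model(task, desktop.describe())
        if action.kind == "done":
            break
        desktop.apply(action)
    return desktop.log


def make_scripted_model(plan: List[Action]) -> Model:
    """Deterministic stand-in for an LLM: replays a fixed plan, then signals done."""
    steps = iter(plan)

    def model(task: str, state: str) -> Action:
        return next(steps, Action("done"))

    return model


desktop = FakeDesktop([Element(1, "Edit", "Search box"), Element(2, "Button", "Search")])
model = make_scripted_model([Action("type", 1, "notepad"), Action("click", 2)])
print(run_agent("open notepad", desktop, model))
# prints ['type Search box', 'click Search']
```

Because the loop only depends on the Model callable, swapping the scripted stand-in for a wrapper around a real LLM would not change run_agent or the desktop layer, which is the kind of flexibility the paragraph above describes.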
Quick Start & Requirements
Install with uv pip install windows-use or pip install windows-use.
The example code uses langchain_google_genai and a compatible LLM (e.g., Gemini 2.0 Flash), along with the Chrome browser.
Highlighted Details
Maintenance & Community
Contribution guidelines are in the CONTRIBUTING file within the repository. No other community channels (e.g., Discord, Slack) or roadmap details are specified in the README.
Licensing & Compatibility
Limitations & Caveats
The use_vision=True parameter in the example code suggests that visual processing capabilities may still be used or configurable.