Windows-MCP.Net by shuyu-labs

.NET server for AI-driven Windows desktop automation

Created 6 months ago

251 stars

Top 99.8% on SourcePulse

Summary

Windows-MCP.Net is a .NET server that implements the Model Context Protocol (MCP) so that AI assistants can interact with the Windows desktop. It is aimed at developers and power users who want to automate multi-step desktop tasks, and it gives AI models a structured way to control Windows applications and system settings.

How It Works

The project runs as an MCP server on .NET 10.0 and exposes a set of tools for Windows automation. Each tool call from an AI client is translated into a direct OS interaction: launching applications, manipulating UI elements, operating on the file system, running OCR, and changing system settings. Because the tools are exposed through a structured, programmatic interface, an AI agent can chain them into context-aware automation rather than relying on fixed scripts.
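To make the request flow above concrete, here is a minimal sketch of the kind of message an MCP client sends when it invokes a tool. MCP is built on JSON-RPC 2.0, and tool invocations use the `tools/call` method with a tool name and an arguments object. The tool name `launch_app` and its `name` argument are hypothetical placeholders, not taken from this project; the real tool names are whatever the server reports in its tool list.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires each request to carry an id.
_ids = count(1)


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and argument, for illustration only.
message = make_tool_call("launch_app", {"name": "notepad"})
print(message)
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows the shape of the payload the server has to interpret.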

Quick Start & Requirements

Prerequisites: Windows operating system, .NET 10.0 Runtime or higher.
Installation: The recommended route is a global tool install with dotnet tool install --global WindowsMCP.Net. For development, the server can instead be run directly from source.
Configuration: The server must be registered in an MCP client's configuration. The README provides JSON snippets for both the global-tool and run-from-source setups.
Links: .NET 10 download page.
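The configuration step can be sketched as follows. This assumes a client that uses the common `mcpServers` layout (a server name mapped to a `command` and `args`); not every MCP client uses this shape, and the exact JSON the project recommends is in its README. The server name `windows-mcp` and both angle-bracket values are placeholders, not real commands or paths.

```python
import json


def build_client_config(server_name: str, command: str, args: list[str]) -> dict:
    """Build an `mcpServers`-style entry (a widespread client convention, assumed here)."""
    return {"mcpServers": {server_name: {"command": command, "args": list(args)}}}


# Global-tool setup: the command name installed by the tool is a placeholder;
# substitute the one given in the project's README.
global_config = build_client_config("windows-mcp", "<installed-tool-command>", [])

# Run-from-source setup: `dotnet run --project` is standard .NET CLI usage;
# the project path is a placeholder.
source_config = build_client_config(
    "windows-mcp", "dotnet", ["run", "--project", "<path-to-project.csproj>"]
)

print(json.dumps(global_config, indent=2))
print(json.dumps(source_config, indent=2))
```

The printed JSON would be merged into the client's existing configuration file rather than replacing it.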

Highlighted Details

The toolset covers application launching, PowerShell integration, desktop state capture, clipboard access, mouse and keyboard operations, window management, web scraping, browser control, screenshots, file system operations, OCR, and system controls (brightness, volume, resolution).
Advanced UI element identification supports finding elements by text, class name, or automation ID, with mechanisms for waiting for elements.
Comprehensive file system management covers reading, writing, copying, moving, deleting, listing, searching files, and creating/deleting directories.
OCR capabilities allow text extraction from screen regions or full screens, text finding, and coordinate retrieval.
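The "waiting for elements" idea mentioned above is usually a poll-until-found loop with a timeout. The sketch below illustrates that general pattern only; it is not the project's implementation, and `wait_for`, `fake_find`, and the element dictionary are hypothetical names invented for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    find: Callable[[], Optional[T]], timeout: float = 5.0, interval: float = 0.1
) -> Optional[T]:
    """Call `find` repeatedly until it returns a non-None value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None  # Timed out without finding the element.
        time.sleep(interval)


# Stand-in for a real lookup (by text, class name, or automation ID):
# the "element" only appears on the third attempt.
attempts = {"n": 0}


def fake_find() -> Optional[dict]:
    attempts["n"] += 1
    if attempts["n"] >= 3:
        return {"text": "OK", "x": 100, "y": 200}
    return None


element = wait_for(fake_find, timeout=2.0, interval=0.01)
print(element)
```

Returning `None` on timeout, rather than raising, leaves it to the caller (here, an AI agent) to decide whether to retry, take a screenshot, or give up.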

Maintenance & Community

The project outlines contribution guidelines and a roadmap with phased development plans. Specific community links (e.g., Discord, Slack) or notable contributors are not detailed in the provided README.

Licensing & Compatibility

License: MIT License.
Compatibility: The MIT license permits commercial use. The server requires Windows and .NET 10.0.

Limitations & Caveats

The project requires .NET 10.0, and its automation functions need appropriate Windows permissions. Enhanced UI recognition, multi-language OCR, and multimedia processing are not yet available; they are planned for later development phases. Users are responsible for complying with applicable laws and software license agreements, and the developers disclaim liability for misuse.
