mcp-server-macos-use  by mediar-ai

AI agent for macOS OS-level control via MCP

Created 1 year ago
270 stars

Top 95.1% on SourcePulse

Summary

This project provides an MCP-compatible server for macOS that lets AI agents control desktop applications through OS-level tools. It targets developers who integrate AI models with desktop environments, giving them a standardized way to automate macOS interactions by bridging models with native application interfaces.

How It Works

The server is built in Swift and acts as an intermediary, translating Model Context Protocol (MCP) commands received over standard input/output (stdio) into actions on macOS. It leverages the MacosUseSDK and macOS's accessibility APIs to simulate user interactions like opening applications, clicking, typing, and pressing keys within target applications. This approach allows AI models to programmatically control the macOS user interface and access application states.
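To make the stdio exchange concrete, here is a minimal client-side sketch of how a single tool call could be framed. It assumes the standard MCP stdio transport (one JSON-RPC 2.0 message per line) and the standard `tools/call` method. The argument name `identifier` is a hypothetical placeholder, not taken from the README, and a real session would first perform the MCP `initialize` handshake, which is omitted here.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def encode_line(message: dict) -> bytes:
    """Serialize one message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_line(line: bytes) -> dict:
    """Parse one newline-terminated message back into a dict."""
    return json.loads(line.decode("utf-8"))


request = make_tool_call(
    "macos-use_open_application_and_traverse",
    {"identifier": "TextEdit"},  # hypothetical argument name
)
wire = encode_line(request)
```

A client would write `wire` to the server process's standard input and read response lines from its standard output with `decode_line`.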

Quick Start & Requirements

  • Installation: Build the server with the Swift toolchain: swift build -c debug (or swift build -c release).
  • Running: Execute the compiled binary: ./.build/debug/mcp-server-macos-use (for a release build, ./.build/release/mcp-server-macos-use).
  • Prerequisites: macOS operating system, Swift development environment. Requires the MacosUseSDK (assumed to be available locally or as an external Swift package).
  • Integration: Configure client applications (e.g., Claude Desktop) by providing the absolute path to the server executable in their MCP server settings.
  • Documentation: Official website: https://macos-use.dev/
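The integration step can be sketched as a small script that produces an MCP server entry for a client configuration. This assumes the `mcpServers` layout used by Claude Desktop's configuration file. The server label `macos-use` and the checkout location under `/Users/you` are hypothetical and should be replaced with your own values.

```python
import json
from pathlib import PurePosixPath


def mcp_server_entry(executable: str) -> dict:
    """Return a config entry for a stdio MCP server, rejecting relative paths."""
    path = PurePosixPath(executable)  # macOS paths follow POSIX rules
    if not path.is_absolute():
        raise ValueError(f"expected an absolute path, got {executable!r}")
    return {"command": str(path), "args": []}


# Hypothetical checkout location; substitute the real path to your build output.
config = {
    "mcpServers": {
        "macos-use": mcp_server_entry(
            "/Users/you/mcp-server-macos-use/.build/debug/mcp-server-macos-use"
        )
    }
}

print(json.dumps(config, indent=2))
```

The absolute-path check mirrors the requirement above: the client launches the binary itself, so a path relative to your shell's working directory would likely not resolve.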

Highlighted Details

  • Tooling: Exposes several MCP CallTool methods: macos-use_open_application_and_traverse, macos-use_click_and_traverse, macos-use_type_and_traverse, macos-use_press_key_and_traverse, and macos-use_refresh_traversal.
  • Granular Control: Tools allow specifying application identifiers (name, bundle ID, path), process IDs (PID), coordinates for clicks, text for typing, and specific key presses with modifier flags.
  • Traversal Options: Supports common optional parameters for actions, including pre/post-action tree traversal, diffing, visibility filtering, and animation feedback.
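As an illustration of how a client might assemble arguments for the five tools listed above, the sketch below checks that a call supplies a set of required fields before it is sent. The tool names come from the list. Every field name (`identifier`, `pid`, `x`, `y`, `text`, `keyName`) is an assumption made for this example and should be checked against the server's actual tool schemas.

```python
# Required fields per tool. Field names are hypothetical, not from the README.
TOOLS = {
    "macos-use_open_application_and_traverse": {"identifier"},
    "macos-use_click_and_traverse": {"pid", "x", "y"},
    "macos-use_type_and_traverse": {"pid", "text"},
    "macos-use_press_key_and_traverse": {"pid", "keyName"},
    "macos-use_refresh_traversal": {"pid"},
}


def build_arguments(tool: str, **fields) -> dict:
    """Validate that the required fields for a tool are present, then return them."""
    if tool not in TOOLS:
        raise KeyError(f"unknown tool: {tool}")
    missing = TOOLS[tool] - set(fields)
    if missing:
        raise ValueError(f"{tool} is missing {sorted(missing)}")
    # Extra fields pass through untouched so optional parameters can be included.
    return dict(fields)


click = build_arguments("macos-use_click_and_traverse", pid=1234, x=100.0, y=200.0)
```

Catching a missing field on the client side gives a clearer error than waiting for the server to reject the call. The authoritative schema is whatever the server reports through MCP's `tools/list` method.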

Maintenance & Community

  • Contact: Reach out to matt@mediar.ai for tailoring or inquiries.
  • Community: Discord: m13v_.
  • Development: Open to issues and custom feature requests.

Licensing & Compatibility

  • License: Not explicitly stated in the provided README.
  • Compatibility: Designed exclusively for macOS.

Limitations & Caveats

The project relies on macOS accessibility APIs, which can sometimes be brittle or change between OS versions. The MacosUseSDK dependency is assumed to be available and correctly configured. No explicit license is provided, which may impact commercial or broader adoption without clarification. The server is macOS-specific and will not function on other operating systems.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 68 stars in the last 30 days
