open-codex-computer-use  by iFurySt

AI-driven computer interaction service for macOS and Windows

Created 1 week ago

581 stars

Top 55.5% on SourcePulse

Project Summary

Summary

This project is an open-source service, exposed over the Model Context Protocol (MCP), that lets AI agents interact with macOS and Windows GUIs. It serves as an alternative to OpenAI's Codex Computer Use and gives agents programmatic control over desktop applications.

How It Works

The macOS implementation uses the native Accessibility APIs; the Windows runtime uses UI Automation with a Win32 message fallback. Both expose the same 9-tool interface over MCP, so AI agents and other MCP clients can integrate directly and control the computer non-intrusively.
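As a rough illustration of what "exposed over MCP" means for a client, the sketch below builds the standard MCP handshake (initialize, notifications/initialized, tools/list) and frames each message as one JSON object per line, which is how the MCP stdio transport works. The helper names, the client name, and the protocol revision string are assumptions made for this example. Nothing here is taken from the project's README, and the tool names returned by the server are not listed in this summary.

```python
import json
import subprocess

# Assumption: the server accepts this MCP protocol revision.
PROTOCOL_VERSION = "2024-11-05"


def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (expects a reply with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method):
    """Build a JSON-RPC 2.0 notification (no id, so no reply is expected)."""
    return {"jsonrpc": "2.0", "method": method}


def frame(msg):
    """Encode one message for the stdio transport: compact JSON plus a newline."""
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


def handshake_messages():
    """Messages a client sends, in order, before it can list tools."""
    init = request(1, "initialize", {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        # Hypothetical client identity, used only for this sketch.
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    })
    return [init, notification("notifications/initialized"), request(2, "tools/list")]


def read_response(stream, msg_id):
    """Read lines until the reply with the given id arrives; skip other messages."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if isinstance(msg, dict) and msg.get("id") == msg_id:
            if "error" in msg:
                raise RuntimeError(msg["error"])
            return msg
    raise RuntimeError("server closed its output before replying")


def list_tool_names(command=("open-computer-use", "mcp")):
    """Start the MCP server as a child process and return its tool names."""
    proc = subprocess.Popen(list(command), stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        init, initialized, list_tools = handshake_messages()
        proc.stdin.write(frame(init))
        proc.stdin.flush()
        read_response(proc.stdout, 1)
        proc.stdin.write(frame(initialized))
        proc.stdin.write(frame(list_tools))
        proc.stdin.flush()
        reply = read_response(proc.stdout, 2)
        return [tool["name"] for tool in reply["result"]["tools"]]
    finally:
        proc.terminate()
        proc.wait()
```

With the server installed and permissions granted, calling list_tool_names() should return the tool names the server advertises. In practice an MCP client library or an agent framework would handle this exchange for you.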

Quick Start & Requirements

  • macOS: Install via npm: npm i -g open-computer-use. Grant Accessibility and Screen Recording permissions to Open Computer Use.app. Run the MCP server: open-computer-use mcp. Client configuration is described in the README.
  • Windows: Build the runtime: ./scripts/build-open-computer-use-windows.sh --arch arm64. Run the .exe directly (e.g., open-computer-use.exe mcp). Note: running as a service or in a detached SSH session may limit which windows UI Automation exposes.
  • Prerequisites: Node.js (macOS npm), Go (Windows build), Swift (CursorMotion).

Highlighted Details

  • Direct integration with MCP clients and AI frameworks (Claude, Codex) via install-claude-mcp, install-codex-mcp.
  • Supports command sequences with optional delays and state reuse.
  • Experimental Windows runtime built with Go.
  • Includes "Cursor Motion," an open-source macOS cursor control system.

Maintenance & Community

No specific details on contributors, sponsorships, or community channels (Discord, Slack) are provided in the README.

Licensing & Compatibility

Licensed under the MIT License. This permissive license allows commercial use and integration into closed-source projects, provided the copyright and license notice is retained.

Limitations & Caveats

The Windows runtime is experimental, and running it as a service or in a detached SSH session may yield unexpected results. On Windows, certain foreground behaviors and input fallbacks require specific environment variables: OPEN_COMPUTER_USE_WINDOWS_ALLOW_APP_LAUNCH, OPEN_COMPUTER_USE_WINDOWS_ALLOW_FOCUS_ACTIONS, and OPEN_COMPUTER_USE_WINDOWS_ALLOW_UIA_TEXT_FALLBACK.
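One way to keep these opt-ins explicit is to build the child process environment in code rather than setting the variables globally. The sketch below is only an illustration: the helper name build_env is hypothetical, and the value "1" is an assumed way of enabling a flag, since this summary does not state what values the runtime expects. Check the README before relying on it.

```python
import os

# Variable names as listed in the caveats above.
WINDOWS_OPT_IN_FLAGS = (
    "OPEN_COMPUTER_USE_WINDOWS_ALLOW_APP_LAUNCH",
    "OPEN_COMPUTER_USE_WINDOWS_ALLOW_FOCUS_ACTIONS",
    "OPEN_COMPUTER_USE_WINDOWS_ALLOW_UIA_TEXT_FALLBACK",
)


def build_env(enabled, base=None, on_value="1"):
    """Return a copy of the environment with the chosen opt-in flags set.

    on_value="1" is an assumption; the accepted values are not documented here.
    """
    unknown = set(enabled) - set(WINDOWS_OPT_IN_FLAGS)
    if unknown:
        raise ValueError(f"unknown flag(s): {sorted(unknown)}")
    # Copy so the caller's environment (or os.environ) is never mutated.
    env = dict(os.environ if base is None else base)
    for name in enabled:
        env[name] = on_value
    return env
```

The result can be passed to subprocess.Popen(["open-computer-use.exe", "mcp"], env=build_env([...])), so only that server process sees the flags you chose to enable.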

Health Check

  • Last commit: 5 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 9
  • Issues (30d): 2

Star History

586 stars in the last 11 days
