python-sdk by askui

Python tool for AI-driven desktop, mobile, and HMI automation

Created 1 year ago

516 stars

Top 60.9% on SourcePulse

3 Experts Love This Project

apsdehal (Amanpreet Singh), Cofounder of Contextual AI
MagMueller, Cofounder of Browser Use
omarsar, Founder of DAIR.AI

Project Summary

This project provides an AI-powered framework for automating desktop and mobile tasks across Windows, macOS, Linux, Android, and iOS. It targets developers and RPA engineers seeking to integrate AI agents for UI automation, offering both step-by-step commands and intent-based instructions for flexible task execution.

How It Works

The framework combines a custom-built "Agent OS" for cross-platform UI interaction (screenshots, mouse control, typing) with various AI models for element recognition and action execution. Users can leverage models like Anthropic's Claude or AskUI's proprietary Prompt-to-Automation (PTA) models, allowing for flexible AI integration and on-premise deployment.

Quick Start & Requirements

Install Agent OS: Download OS-specific installers from provided links (Windows, Linux, macOS). Linux users must use XOrg, not Wayland.
Install Python package: pip install askui (requires Python >= 3.10).
Authentication: Set environment variables for AI model providers (e.g., ANTHROPIC_API_KEY, ASKUI_WORKSPACE_ID, ASKUI_TOKEN).
Demo: Test with Hugging Face models via Spaces API (rate-limited).
Docs: askui.com
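
The requirements above can be verified before a first run. The following is a minimal sketch, not part of the askui package: the `preflight` helper and its parameters are hypothetical, and the Wayland check relies on the `XDG_SESSION_TYPE` environment variable, which is a common heuristic on Linux desktops rather than something the README prescribes. Which credentials are needed depends on the model provider you choose, so the caller passes in the variable names.

```python
import os
import sys
from collections.abc import Mapping, Sequence

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 10)


def preflight(
    required_vars: Sequence[str],
    env: Mapping[str, str] = os.environ,
    python_version: tuple[int, int] = sys.version_info[:2],
    platform: str = sys.platform,
) -> list[str]:
    """Hypothetical helper: return a list of problems; empty means all checks passed."""
    problems: list[str] = []

    # Check the interpreter version against the stated minimum.
    if tuple(python_version) < MIN_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found; "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]} or newer is required"
        )

    # Check that each credential variable is present and non-empty.
    for name in required_vars:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")

    # Heuristic (assumption): XDG_SESSION_TYPE reports "wayland" on Wayland sessions.
    if platform.startswith("linux") and env.get("XDG_SESSION_TYPE", "").lower() == "wayland":
        problems.append("session type is Wayland; Agent OS requires XOrg")

    return problems


if __name__ == "__main__":
    # Example: credentials for AskUI-hosted models; adjust for your provider.
    for issue in preflight(["ASKUI_WORKSPACE_ID", "ASKUI_TOKEN"]):
        print(f"preflight: {issue}")
```

Passing `env`, `python_version`, and `platform` explicitly keeps the check easy to exercise in tests without touching the real environment.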

Highlighted Details

Supports Windows, Linux, macOS, Android, iOS, and Citrix.
Offers in-background automation on Windows.
Allows hot-swapping and retraining of AI models.
Provides direct access to underlying OS and browser tools.
Supports advanced element locating using visual descriptions and AI elements.

Maintenance & Community

The project is under active development, and a Discord community is available via invite link.
Telemetry is enabled by default but can be disabled.

Licensing & Compatibility

The license is not explicitly stated in the README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The experimental chat feature has numerous known issues, including the inability to stop agents, no retry option, and focus problems.
Response schema extraction is limited to the default askui model.
Multi-monitor support requires manual display number selection.

Health Check

Last Commit: 1 day ago
Responsiveness: 1 day
Pull Requests (30d): 19
Issues (30d): 0

Star History

16 stars in the last 30 days

Explore Similar Projects

awesome-gemini-cli by Piebald-AI

AI agent for terminal development and automation

Created 7 months ago

Updated 3 days ago

MaaMCP by MAA-AI

AI agent for cross-platform device automation

Created 3 months ago

Updated 2 months ago

Starred by Rodrigo Nader (Cofounder of Langflow) and Harrison Chase (Founder of LangChain).

Clevrr-Computer by Clevrr-AI

Automation agent for precise system actions

Created 1 year ago

Updated 1 year ago

MCPControl by claude-did-this

Control Windows desktop via AI

Created 1 year ago

Updated 3 months ago

computer-agent by suitedaces

Desktop app for AI computer control via Claude API

Created 1 year ago

Updated 2 months ago

lecca-io by lecca-digital

AI platform for building and deploying LLM-powered agents

Created 1 year ago

Updated 9 months ago

GenericAgent by lsdefine

Autonomous PC agent for desktop automation

Created 1 month ago

Updated 18 hours ago

pywinassistant by a-real-ai

Computer-Using-Agent for Windows GUI automation via natural language

Created 2 years ago

Updated 1 year ago

Starred by Travis Fischer (Founder of Agentic) and Amanpreet Singh (Cofounder of Contextual AI).

Peekaboo by steipete

macOS GUI automation and screenshot analysis tool

Created 9 months ago

Updated 23 hours ago

Starred by Clement Delangue (Cofounder of Hugging Face), Lewis Tunstall (Research Engineer at Hugging Face), and 1 more.

Open-Claude-Cowork by DevAgentForge

Desktop AI collaboration partner for complex tasks

Created 1 month ago

Updated 4 weeks ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Gregor Zunic (Cofounder of Browser Use).

droidrun by droidrun

Framework for controlling Android devices via LLM agents

Created 11 months ago

Updated 1 day ago

Starred by Yaowei Zheng (Author of LLaMA-Factory), Travis Fischer (Founder of Agentic), and 1 more.

UFO by microsoft

Desktop AgentOS for automating Windows workflows via natural language

Created 2 years ago

Updated 1 day ago
