Framework for general computer control via foundation agents
Cradle is a framework designed for General Computer Control (GCC), enabling foundation models to perform complex tasks across various software and games using a human-like interface of screenshots and keyboard/mouse inputs. It targets researchers and developers aiming to build autonomous agents capable of interacting with digital environments.
How It Works
Cradle operates by abstracting computer interactions into a unified environment. It processes screenshots as input, leverages Large Language Models (LLMs) for reasoning and planning, and outputs keyboard and mouse commands. The framework supports a modular design, allowing for the integration of custom skills and environment-specific logic, facilitating adaptation to new applications.
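The loop described above can be sketched roughly as follows. This is a minimal illustration, not Cradle's actual API: `Action`, `SkillRegistry`, `run_agent`, and the stub functions are all hypothetical names introduced here. The planner is a scripted stand-in for an LLM call, and the skills return strings describing the input events instead of sending real keyboard or mouse input.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Action:
    """A skill invocation chosen by the planner (hypothetical type)."""
    skill: str
    args: Tuple = ()


class SkillRegistry:
    """Maps skill names to callables so new skills can be plugged in."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[..., str]] = {}

    def register(self, name: str):
        def decorator(fn: Callable[..., str]) -> Callable[..., str]:
            self._skills[name] = fn
            return fn
        return decorator

    def run(self, action: Action) -> str:
        if action.skill not in self._skills:
            raise KeyError(f"unknown skill: {action.skill}")
        return self._skills[action.skill](*action.args)


def run_agent(
    capture: Callable[[], bytes],
    plan: Callable[[bytes, List[str]], Optional[Action]],
    skills: SkillRegistry,
    max_steps: int = 10,
) -> List[str]:
    """Observe (screenshot), plan (LLM stand-in), act (skill), repeat."""
    history: List[str] = []
    for _ in range(max_steps):
        screenshot = capture()
        action = plan(screenshot, history)
        if action is None:  # the planner signals that the task is done
            break
        history.append(skills.run(action))
    return history


# Assumed, environment-specific pieces: stubs that only describe the events.
skills = SkillRegistry()


@skills.register("press_key")
def press_key(key: str) -> str:
    return f"key:{key}"


@skills.register("click")
def click(x: int, y: int) -> str:
    return f"click:{x},{y}"


def fake_capture() -> bytes:
    # A real agent would grab the screen here.
    return b"<screenshot bytes>"


def scripted_plan(screenshot: bytes, history: List[str]) -> Optional[Action]:
    # Stand-in for an LLM: replays a fixed script, one action per step.
    script = [Action("click", (100, 200)), Action("press_key", ("e",))]
    return script[len(history)] if len(history) < len(script) else None


events = run_agent(fake_capture, scripted_plan, skills)
print(events)  # ['click:100,200', 'key:e']
```

Keeping skills behind a registry means that, in a design like this one, adapting to a new application amounts to registering different skills and supplying a different planner, while the loop itself stays unchanged.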
Quick Start & Requirements
Create a conda environment (conda create --name cradle-dev python=3.10), activate it (conda activate cradle-dev), and install dependencies (pip install -r requirements.txt).
Other requirements include a .env file and spaCy language models (en_core_web_lg).
Highlighted Details
Includes icon_replacer.py for improving icon recognition by LLMs.
Maintenance & Community
The project is actively maintained by BAAI-Agents. Further community engagement details are not explicitly provided in the README.
Licensing & Compatibility
Limitations & Caveats
The framework's effectiveness depends on the capabilities of the underlying LLM and on the quality of the environment-specific configurations and skills. Adapting it to a new game or application requires careful implementation following the provided guidelines.