This repository provides a framework for deploying and evaluating Large Language Model (LLM) and Vision-Language Model (VLM) agents for personal computer gaming. It targets researchers and developers interested in AI-driven game playing and agent performance benchmarking across various game genres.
How It Works
The project uses a modular, worker-based architecture in which different LLM/VLM providers (OpenAI, Anthropic, Gemini, DeepSeek) power the game agents. Agents interact with games through screen capture and input simulation, and specialized workers handle tasks such as memory management, evidence handling, and decision-making. This lets agents be configured, and their policies customized, for each game.
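The capture, decide, act loop described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: `ModelProvider`, `EchoProvider`, `MemoryWorker`, `DecisionWorker`, and `run_agent` are hypothetical names, and screen capture and input simulation are passed in as plain callables so the sketch runs without a game or an API key.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol


class ModelProvider(Protocol):
    """Hypothetical interface: any LLM/VLM backend that maps a prompt to text."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in for a real provider client; always returns a fixed action."""

    action: str = "right"

    def complete(self, prompt: str) -> str:
        return self.action


@dataclass
class MemoryWorker:
    """Keeps the most recent observations so later prompts have context."""

    capacity: int = 3
    history: List[str] = field(default_factory=list)

    def remember(self, observation: str) -> None:
        self.history.append(observation)
        self.history = self.history[-self.capacity:]


@dataclass
class DecisionWorker:
    """Asks the provider for the next action given memory and the current frame."""

    provider: ModelProvider

    def decide(self, memory: List[str], observation: str) -> str:
        prompt = f"History: {memory}\nCurrent: {observation}\nNext action?"
        return self.provider.complete(prompt).strip().lower()


def run_agent(
    capture: Callable[[], str],
    send_input: Callable[[str], None],
    memory: MemoryWorker,
    decider: DecisionWorker,
    steps: int,
) -> List[str]:
    """Run a fixed number of capture -> decide -> act iterations."""
    actions: List[str] = []
    for _ in range(steps):
        observation = capture()          # stands in for a screenshot
        action = decider.decide(memory.history, observation)
        send_input(action)               # stands in for a simulated key press
        memory.remember(observation)
        actions.append(action)
    return actions


if __name__ == "__main__":
    frames = iter(["frame-0", "frame-1", "frame-2"])
    pressed: List[str] = []
    worker = DecisionWorker(EchoProvider("jump"))
    print(run_agent(lambda: next(frames), pressed.append, MemoryWorker(), worker, steps=3))
```

Keeping the provider behind a small interface is one way to let the same workers run against different backends; the real project may structure this differently.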
Quick Start & Requirements
- Installation: Clone the repository, create a conda environment (conda create -n game_cua python==3.10), activate it (conda activate game_cua), and install dependencies (pip install -e .).
- Prerequisites: Python 3.10, Conda, game-specific installations (e.g., SuperMarioBros-C, Sokoban, 2048-Pygame, Python-Tetris-Game-Pygame), and API keys for supported LLM providers.
- Setup: After installation, configure API keys for the providers you plan to use. Each game also needs its own setup, including ROMs where applicable.
- Docs: https://github.com/lmgame-org/GamingAgent
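Because the setup depends on provider API keys, a small pre-flight check can catch a missing key before an agent starts. The sketch below assumes the keys are supplied through environment variables. The variable names are assumptions for illustration, and `missing_api_keys` is a hypothetical helper, so confirm both against the repository before relying on them.

```python
import os
from typing import Dict, Iterable, List, Mapping

# Assumed variable names, not taken from the repository.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def missing_api_keys(
    providers: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the environment variable names that are unset or blank."""
    missing: List[str] = []
    for name in providers:
        var = PROVIDER_ENV_VARS[name.lower()]  # KeyError for unknown providers
        if not env.get(var, "").strip():
            missing.append(var)
    return missing


if __name__ == "__main__":
    absent = missing_api_keys(["openai", "anthropic"])
    if absent:
        print("Set these variables before running an agent:", ", ".join(absent))
    else:
        print("All required API keys are present.")
```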
Highlighted Details
- Supports multiple LLM/VLM providers, including OpenAI (GPT-4o, GPT-4o-mini), Anthropic (Claude 3.5, Claude 3.7), Gemini, and DeepSeek.
- Features agents for classic games like Super Mario Bros., Sokoban, 2048, Tetris, Candy Crush, and Ace Attorney.
- Includes a modular architecture with specialized workers for memory, evidence, and decision-making, particularly for Ace Attorney.
- Offers customizable policies for agent behavior, such as 'long', 'short', 'alternate', or 'mixed' for Super Mario Bros.
Maintenance & Community
- The project is hosted by lmgame-org. Further community or maintenance details are not explicitly provided in the README.
Licensing & Compatibility
- The README does not specify a license. Users should verify licensing for commercial use or closed-source integration.
Limitations & Caveats
- Game integration requires specific game installations and potential modifications to game code (e.g., Tetris game speed, screen capture regions).
- High-concurrency agent deployment with powerful models may incur significant API costs.
- Agent coordination and policy effectiveness vary by game; multi-worker coordination, for example in Tetris, is still being improved.
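The API-cost caveat above can be made concrete with a rough estimate before launching many concurrent agents. The sketch below is simple arithmetic and not part of the project. `estimate_cost` is a hypothetical helper, and the token counts and per-million-token prices in the example are placeholders, so substitute your provider's current pricing.

```python
def estimate_cost(
    agents: int,
    steps_per_agent: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate total API cost, assuming one model call per agent step.

    Prices are in currency units per one million tokens.
    """
    calls = agents * steps_per_agent
    per_call = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    return calls * per_call


if __name__ == "__main__":
    # Placeholder values for illustration only, not real pricing.
    total = estimate_cost(
        agents=8,
        steps_per_agent=1000,
        input_tokens=1500,
        output_tokens=100,
        input_price_per_million=2.0,
        output_price_per_million=8.0,
    )
    print(f"Estimated cost: {total:.2f}")
```

Screenshot inputs to a VLM are typically billed as additional input tokens, so the input count per step is often the dominant term in this estimate.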