Large-Language-Models-play-StarCraftII  by histmeisah

StarCraft II environment for LLM agents, with benchmarks and summarization

created 1 year ago
285 stars

Top 92.8% on sourcepulse

Project Summary

This repository provides TextStarCraft II, a pure-language (text-only) environment in which Large Language Models (LLMs) play StarCraft II. It targets two weaknesses of earlier AI agents, long-term strategic planning and interpretability, by combining LLMs with a Chain of Summarization (CoS) approach. The intended audience is AI researchers and developers who want to evaluate LLM capabilities in complex real-time strategy games.

How It Works

The core innovation is the Chain of Summarization (CoS) method, which condenses raw game observations through single-frame summarization and multi-frame summarization. From these summaries the LLM analyzes the game state, generates command recommendations, and makes strategic decisions. The approach aims for more interpretable and extensible strategies than traditional reinforcement learning (RL) or supervised learning (SL) methods.
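The two summarization stages can be illustrated with a minimal Python sketch. This is not the repository's implementation: the names `summarize_frame`, `ChainOfSummarization`, `echo_llm`, the `window` parameter, and the prompt wording are all hypothetical, and the LLM is replaced by a stub so the example runs offline.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


def summarize_frame(observation: Dict[str, int]) -> str:
    """Single-frame step: render one raw observation as a short text line."""
    return ", ".join(f"{key}={value}" for key, value in sorted(observation.items()))


class ChainOfSummarization:
    """Hypothetical helper: buffers frame summaries and condenses every `window` frames."""

    def __init__(self, llm: Callable[[str], str], window: int = 3) -> None:
        self.llm = llm
        self.window = window
        self.buffer: Deque[str] = deque(maxlen=window)

    def step(self, observation: Dict[str, int]) -> Optional[str]:
        """Add one frame; return the multi-frame summary once the window is full."""
        self.buffer.append(summarize_frame(observation))
        if len(self.buffer) < self.window:
            return None
        # Multi-frame step: hand the buffered frame summaries to the LLM in one prompt.
        prompt = "Summarize the situation and suggest commands:\n" + "\n".join(self.buffer)
        self.buffer.clear()
        return self.llm(prompt)


def echo_llm(prompt: str) -> str:
    """Stub standing in for a real model call; it only counts the frames it received."""
    return f"{len(prompt.splitlines()) - 1} frames summarized"
```

Calling `ChainOfSummarization(echo_llm).step(...)` returns `None` until three frames have been buffered, then returns the stub's reply. Batching frames this way means the model is queried once per window rather than once per frame, which is the general motivation for a multi-frame stage.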

Quick Start & Requirements

  • Installation: Requires a StarCraft II installation (Windows only). Install dependencies with pip install -r requirements.txt. Install ChromaDB before burnysc2.
  • Prerequisites: Windows 11, Python 3.10, CUDA 12.1, PyTorch 2.1.0, OpenAI Python client (0.27.9).
  • Setup: Download StarCraft II ladder maps using the StarCraft II Editor and place them in the StarCraft II\Maps directory.
  • Running: Execute test_the_env.py for single-process testing or multiprocess_test.py for parallel execution. Key parameters include player_race (currently Protoss only), difficulty, LLM_model_name, and API keys.
  • Links: Paper, Website, Demo Video.
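As a rough illustration of the run parameters listed above, the following Python sketch groups them into a validated settings object. How test_the_env.py and multiprocess_test.py actually accept these values is not described here, so `RunConfig`, `config_from_env`, the default values, and the `OPENAI_API_KEY` environment variable are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Only Protoss is listed as supported at the time of writing.
SUPPORTED_RACES = ("Protoss",)


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the parameters named in the Quick Start."""

    player_race: str = "Protoss"
    difficulty: str = "Harder"
    LLM_model_name: str = "gpt-3.5-turbo-16k"
    api_key: str = ""

    def __post_init__(self) -> None:
        # Fail early, before a long-running game is started.
        if self.player_race not in SUPPORTED_RACES:
            raise ValueError(f"unsupported race: {self.player_race!r}")
        if not self.api_key:
            raise ValueError("an API key is required")


def config_from_env(env: Mapping[str, str] = os.environ) -> RunConfig:
    """Build a config, reading the key from an assumed environment variable."""
    return RunConfig(api_key=env.get("OPENAI_API_KEY", ""))
```

Validating up front is a deliberate choice: a misconfigured race or missing key is cheaper to catch before a game starts than partway through one.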

Highlighted Details

  • LLM agents can defeat the built-in AI at Harder (Lv5) difficulty.
  • Aims for more interpretable and extensible strategies than traditional RL or SL agents.
  • Supports various LLMs including GPT-3.5-turbo-16k, GPT4-Turbo, Gemini-Pro, GLM4, Claude2.1, and local models like Qwen.
  • Evaluation metrics include Win Rate, Population Block Ratio (PBR), Resource Utilization Ratio (RUR), Average Population Utilization (APU), and Technology Rate (TR).
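The metric formulas are not given in this summary. As a sketch only, the Python below computes Win Rate and one plausible reading of Population Block Ratio, assuming PBR is the fraction of sampled time steps at which supply used has reached the supply cap. That definition, the function names, and the sample format are assumptions; the paper should be consulted for the exact formulas.

```python
from typing import Sequence, Tuple


def win_rate(results: Sequence[bool]) -> float:
    """Fraction of games won; `results` holds True for each win."""
    if not results:
        raise ValueError("no games recorded")
    return sum(results) / len(results)


def population_block_ratio(samples: Sequence[Tuple[int, int]]) -> float:
    """Assumed definition: share of samples where supply used >= supply cap.

    Each sample is a (supply_used, supply_cap) pair taken at one time step.
    """
    if not samples:
        raise ValueError("no samples recorded")
    blocked = sum(1 for used, cap in samples if used >= cap)
    return blocked / len(samples)
```

For example, `win_rate([True, False, True, True])` gives 0.75, and `population_block_ratio([(10, 15), (15, 15), (23, 23), (20, 31)])` gives 0.5 under the assumed definition.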

Maintenance & Community

No specific community links (Discord/Slack) or notable contributors are mentioned in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The environment is currently Windows-only because Blizzard does not provide Linux support for the latest StarCraft II version. Only the Protoss race is supported; Zerg and Terran are under development. A single LLM-driven game can take approximately 7 hours to run.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 21 stars in the last 90 days
