OpenGenerativeAI
LLM benchmark using Street Fighter III to evaluate real-time decision-making
Top 28.3% on SourcePulse
This project provides a novel benchmark for evaluating Large Language Models (LLMs) by pitting them against each other in real-time gameplay of Street Fighter III. It targets AI researchers and developers seeking to assess LLM capabilities in speed, strategic thinking, adaptability, and resilience within a dynamic, interactive environment.
How It Works
LLMs act as AI players, controlled via API calls. The system sends the LLM either a text description of the game state (TextRobot) or a screenshot (VisionRobot), and the LLM replies with a list of moves. This approach lets LLMs draw on their contextual understanding and decision-making abilities, unlike traditional reinforcement learning (RL) models, which rely solely on reward functions.
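The text-based loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the GameState fields, the move vocabulary, the prompt wording, and the fake_llm stand-in for a real API call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical move vocabulary; the real project's move names may differ.
VALID_MOVES = {"move closer", "move away", "jump", "low punch", "high kick", "fireball"}


@dataclass
class GameState:
    """Assumed, simplified view of one frame of the fight."""
    own_health: int
    opponent_health: int
    distance: int  # horizontal gap between the fighters (assumed unit: pixels)


def describe_state(state: GameState) -> str:
    """Render the game state as a text prompt, as a text-based player might receive it."""
    proximity = "close to" if state.distance < 100 else "far from"
    return (
        f"You are {proximity} your opponent. "
        f"Your health is {state.own_health}; the opponent's is {state.opponent_health}. "
        "Reply with one move per line, each prefixed by '- '. "
        f"Valid moves: {', '.join(sorted(VALID_MOVES))}."
    )


def parse_moves(reply: str) -> List[str]:
    """Keep only the lines of the reply that name a known move, in order."""
    moves = []
    for line in reply.splitlines():
        candidate = line.strip().lstrip("-").strip().lower()
        if candidate in VALID_MOVES:
            moves.append(candidate)
    return moves


def next_moves(state: GameState, llm: Callable[[str], str]) -> List[str]:
    """One decision step: describe the state, query the model, parse its move list."""
    return parse_moves(llm(describe_state(state)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns a canned reply with one invalid move."""
    return "- Move closer\n- Low punch\n- Teleport"


if __name__ == "__main__":
    print(next_moves(GameState(own_health=80, opponent_health=120, distance=50), fake_llm))
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider. Filtering the reply against a fixed vocabulary is one simple way to discard moves the model invents, which matters when the output drives a game in real time.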
Quick Start & Requirements
make install or pip install -r requirements.txt.
~/.diambra/roms.
.env file.
make run or via Docker.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
7 months ago
Inactive