Steve by YuvDwi

AI agents for embodied gameplay and complex task execution

Created 4 months ago

1,208 stars

Top 32.2% on SourcePulse

Project Summary

Steve is a Minecraft mod that adds AI agents, each named "Steve," which play the game and carry out tasks given as natural language commands. It targets Minecraft players and AI researchers interested in embodied agents, and offers autonomous gameplay, collaborative task execution, and a natural language way of directing in-game work.

How It Works

Each Steve agent runs a ReAct-style loop: Reason, Act, and Observe. Natural language instructions are sent to an LLM (served by Groq, OpenAI, or Gemini), which breaks them down into executable code. This code interacts directly with Minecraft's game mechanics. For multi-agent coordination, a server-side manager divides tasks such as building into spatial sections, assigns agents to those sections, prevents conflicts, and rebalances workloads deterministically.
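The Reason, Act, Observe cycle described above can be sketched as follows. This is a minimal illustration in Python, not the mod's actual code (the mod itself ships as a Forge JAR): the names run_react_loop, stub_planner, and make_stub_world are hypothetical, and the planner is a scripted stand-in for the LLM call.

```python
from dataclasses import dataclass


@dataclass
class Step:
    thought: str      # what the planner reasoned
    action: str       # what the agent did
    observation: str  # what the world reported back


def run_react_loop(instruction, plan_next, execute, max_steps=10):
    """Repeat Reason -> Act -> Observe until the planner answers 'done'."""
    history = []
    for _ in range(max_steps):
        thought, action = plan_next(instruction, history)   # Reason
        if action == "done":
            break
        observation = execute(action)                       # Act
        history.append(Step(thought, action, observation))  # Observe
    return history


def make_stub_world():
    """Hypothetical stand-in for the game: mining adds one iron ore."""
    inventory = {"iron_ore": 0}

    def execute(action):
        if action == "mine iron_ore":
            inventory["iron_ore"] += 1
        return f"iron_ore={inventory['iron_ore']}"

    return inventory, execute


def stub_planner(target):
    """Hypothetical stand-in for the LLM: mine until `target` steps are done."""
    def plan_next(instruction, history):
        mined = len(history)
        if mined >= target:
            return ("target reached", "done")
        return (f"need {target - mined} more", "mine iron_ore")

    return plan_next


if __name__ == "__main__":
    inventory, execute = make_stub_world()
    for step in run_react_loop("mine 3 iron ore", stub_planner(3), execute):
        print(step)
```

The max_steps cap is a design choice of this sketch: it keeps a confused planner from looping forever, which matters when each Reason step is a paid LLM call.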

Quick Start & Requirements

Prerequisites: Minecraft 1.20.1 with Forge, Java 17, and an API key for a supported LLM (OpenAI, Groq, or Gemini).
Installation: Download the JAR from the releases page, place it in the Minecraft mods folder, launch the game, and configure your API key in config/steve-common.toml.
Usage: Spawn an agent with the command /steve spawn <name> and press 'K' to open the command panel.
Links: Project repository: https://github.com/YuvDwi/Steve

Highlighted Details

Resource Extraction: Agents autonomously determine optimal mining locations and strategies.
Autonomous Building: Agents plan structures, manage materials, and construct them block by block.
Collaborative Execution: Multiple agents can automatically partition and parallelize complex tasks like building a castle.
Natural Language Interface: Agents interpret and execute commands such as "mine 20 iron ore" or "build a house near me."
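One way to picture the collaborative partitioning mentioned above is the sketch below. The source does not describe the manager's algorithm, so everything here is an assumption made for illustration: the build footprint is cut into column ranges along one axis, sections are handed out round-robin over sorted agent names (so the same input always gives the same plan), and rebalancing simply redistributes unfinished sections. All function names are hypothetical.

```python
def split_into_sections(width, section_width):
    """Cut a footprint of `width` columns into half-open (start, end) ranges."""
    return [(x, min(x + section_width, width))
            for x in range(0, width, section_width)]


def assign_sections(sections, agents):
    """Round-robin over sorted agent names; each section gets one owner."""
    if not agents:
        raise ValueError("at least one agent is required")
    ordered = sorted(agents)  # sorting makes the plan independent of input order
    plan = {name: [] for name in ordered}
    for i, section in enumerate(sections):
        plan[ordered[i % len(ordered)]].append(section)
    return plan


def rebalance(plan, finished, agents):
    """Redistribute every unfinished section across the current agents."""
    remaining = sorted(
        section
        for owned in plan.values()
        for section in owned
        if section not in finished
    )
    return assign_sections(remaining, agents)


if __name__ == "__main__":
    sections = split_into_sections(width=10, section_width=4)
    plan = assign_sections(sections, ["bob", "alice"])
    print(plan)
    # A third agent joins after the first section is finished.
    print(rebalance(plan, {(0, 4)}, ["alice", "bob", "carol"]))
```

Because every section has exactly one owner, two agents never place blocks in the same region under this scheme, which is one simple way to get the conflict prevention the project describes.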

Maintenance & Community

The project is inspired by LangChain and AutoGPT and is built on Minecraft Forge. Planned features include crafting, voice commands, persistent memory via vector databases, and asynchronous action execution. No community channels such as Discord or Slack are listed.

Licensing & Compatibility

The project is licensed under the MIT license, permitting commercial use and closed-source linking. It requires Minecraft 1.20.1 with Forge.

Limitations & Caveats

Agent intelligence depends on the LLM used; GPT-3.5 may make poorer decisions than GPT-4. Agents cannot currently craft tools. Actions are synchronous, so an agent cannot multitask. Agent memory and world context reset when the game restarts, though persistent memory is planned.
