Video Pre-Training (VPT): Minecraft agent learning by watching videos (research paper code release)
This repository provides code and pre-trained models for Video Pre-Training (VPT), a method for training agents to act in environments by watching unlabeled online videos. It's primarily aimed at researchers and developers interested in imitation learning and reinforcement learning in complex environments like Minecraft. The key benefit is enabling agents to learn sophisticated behaviors from diverse video data without explicit task supervision.
How It Works
VPT trains a transformer-based policy that processes Minecraft gameplay frames and predicts the actions a human player took. A small inverse dynamics model (IDM), trained on a modest amount of contractor gameplay with recorded keypresses and mouse movements, is used to pseudo-label a large corpus of unlabeled online Minecraft videos; the resulting (frame, action) pairs then drive large-scale behavioral cloning. This yields a general-purpose "foundation model" that can be fine-tuned on smaller, task-specific datasets, optionally with reinforcement learning to optimize for specific rewards, allowing it to adapt to new tasks efficiently.
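The core fine-tuning objective, behavioral cloning, can be sketched with a toy example: a linear softmax "policy" trained by cross-entropy to imitate (frame, action) pairs. The real VPT model is a large transformer over video frames; everything here (feature sizes, the `bc_step` helper) is illustrative, not the repository's actual API.

```python
import math
import random

random.seed(0)

N_FEATURES, N_ACTIONS = 4, 3  # toy sizes; real frames are images

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def forward(weights, frame):
    # Linear policy: one logit per discrete action.
    logits = [sum(w * f for w, f in zip(row, frame)) for row in weights]
    return softmax(logits)

def bc_step(weights, frame, action, lr=0.5):
    """One behavior-cloning update: cross-entropy between the policy's
    action distribution and the action observed in the video."""
    probs = forward(weights, frame)
    loss = -math.log(probs[action])
    for a in range(N_ACTIONS):
        grad = probs[a] - (1.0 if a == action else 0.0)
        for j in range(N_FEATURES):
            weights[a][j] -= lr * grad * frame[j]
    return loss

# Fake "pseudo-labeled" clips: the dominant feature encodes the action.
dataset = []
for _ in range(200):
    action = random.randrange(N_ACTIONS)
    frame = [random.random() * 0.2 for _ in range(N_FEATURES)]
    frame[action] += 1.0
    dataset.append((frame, action))

weights = [[0.0] * N_FEATURES for _ in range(N_ACTIONS)]
first_loss = bc_step(weights, *dataset[0])
for frame, action in dataset:
    bc_step(weights, frame, action)
last_loss = bc_step(weights, *dataset[0])
print(first_loss > last_loss)  # imitation loss on a seen clip drops
```

The same loss is used in both stages; pretraining differs only in scale and in where the action labels come from (the IDM rather than direct recordings).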
Quick Start & Requirements
```
pip install git+https://github.com/minerllabs/minerl
pip install -r requirements.txt
```

The pinned `torch==1.9.0` in requirements.txt is incompatible with Python 3.10+. On Python 3.10+, install a newer PyTorch (`pip install torch`), but expect potential performance changes.

Run an agent with:

```
python run_agent.py --model [path to .model file] --weights [path to .weight file]
```
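Under the hood, `run_agent.py` rolls the policy out in a MineRL environment through the standard Gym reset/step interface. A minimal sketch of such a rollout loop follows; `DummyEnv` and `choose_action` are hypothetical stand-ins (real code would build the environment via the `minerl` package and run the VPT model as the policy).

```python
class DummyEnv:
    """Stand-in for a minerl/gym environment, for illustration only."""
    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return {"pov": [0] * 3}  # minerl observations are dicts with a "pov" image

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return {"pov": [self.t] * 3}, 0.0, done, {}

def choose_action(obs):
    """Placeholder policy; the real agent runs the VPT model here."""
    return {"forward": 1}  # minerl actions are dicts of named controls

def rollout(env, policy, max_steps=100):
    """Step the environment with the policy until done or a step cap."""
    obs = env.reset()
    steps, done = 0, False
    while not done and steps < max_steps:
        obs, reward, done, info = env.step(policy(obs))
        steps += 1
    return steps

steps = rollout(DummyEnv(), choose_action)
print(steps)  # 5
```

Swapping `DummyEnv` for a real `minerl` environment and `choose_action` for the loaded model is essentially what the script's `--model`/`--weights` arguments configure.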
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats