Benchmark dataset for instruction-following agents in interactive, visually-realistic environments
ALFRED is a benchmark dataset and framework for embodied AI agents that learn to follow natural language instructions for everyday household tasks. It targets researchers and engineers developing agents capable of grounded language understanding and action sequencing in simulated environments, aiming to bridge the gap between academic benchmarks and real-world applications.
How It Works
ALFRED utilizes the AI2-THOR simulator to create realistic household environments. Agents are trained to map egocentric vision and natural language instructions to sequences of actions. The benchmark emphasizes long, compositional rollouts and irreversible state changes, presenting challenges similar to real-world task execution.
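The mapping described above, from an egocentric observation plus an instruction to a discrete action, can be sketched as a simple agent loop. The `ScriptedAgent`, the `Observation` container, and the action names below are hypothetical stand-ins for illustration, not ALFRED's actual API (a real agent would run a learned policy inside the AI2-THOR simulator).

```python
from dataclasses import dataclass
from typing import List

# Hypothetical discrete action space, similar in spirit to AI2-THOR-style actions.
ACTIONS = ["MoveAhead", "RotateLeft", "RotateRight", "Pickup", "Put", "Stop"]

@dataclass
class Observation:
    rgb_frame: List[List[int]]  # egocentric image (placeholder for real pixels)
    instruction: str            # natural language goal, e.g. "put the mug in the sink"

class ScriptedAgent:
    """Toy agent that replays a fixed plan (illustration only, not a learned policy)."""
    def __init__(self, plan):
        self.plan = list(plan)

    def act(self, obs: Observation) -> str:
        # A real ALFRED agent would condition on obs.rgb_frame and obs.instruction.
        return self.plan.pop(0) if self.plan else "Stop"

def rollout(agent, instruction, max_steps=10):
    """Step the agent until it emits Stop; return the resulting action sequence."""
    trajectory = []
    for _ in range(max_steps):
        obs = Observation(rgb_frame=[[0]], instruction=instruction)
        action = agent.act(obs)
        trajectory.append(action)
        if action == "Stop":
            break
    return trajectory

agent = ScriptedAgent(["MoveAhead", "Pickup", "RotateLeft", "Put"])
print(rollout(agent, "put the mug in the sink"))
# → ['MoveAhead', 'Pickup', 'RotateLeft', 'Put', 'Stop']
```

Because state changes in the environment are irreversible, a wrong action cannot simply be undone, which is why long action sequences like this one are hard to get right.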
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt within a Python virtual environment, then download the preprocessed dataset:

sh download_data.sh json_feat
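Once downloaded, each trajectory ships with JSON annotations pairing a high-level goal with step-by-step sub-instructions. The snippet below parses a minimal inline record; the key layout (`turk_annotations` → `anns` → `task_desc` / `high_descs`) is an assumption about the dataset schema made for illustration.

```python
import json

# A minimal trajectory record mimicking the (assumed) ALFRED annotation layout.
raw = json.dumps({
    "task_type": "pick_and_place_simple",
    "turk_annotations": {
        "anns": [
            {
                "task_desc": "Put a mug in the sink.",
                "high_descs": [
                    "Walk to the counter.",
                    "Pick up the mug.",
                    "Place the mug in the sink.",
                ],
            }
        ]
    },
})

traj = json.loads(raw)
ann = traj["turk_annotations"]["anns"][0]
goal = ann["task_desc"]     # high-level goal instruction
steps = ann["high_descs"]   # ordered sub-instructions, one per high-level action
print(goal)
print(len(steps))
```

An agent is trained to ground both levels: the single goal sentence and the ordered sub-instructions that decompose it.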
Highlighted Details
Maintenance & Community
Questions can be directed to the mailing list at askforalfred@googlegroups.com.
Licensing & Compatibility
Limitations & Caveats
The AI2 leaderboard has been deprecated, so evaluation requires manual email submission. Training can be resource-intensive, involving large data downloads and substantial GPU resources.