Holodeck by allenai

3D environment generation research paper (CVPR 2024)

created 1 year ago
453 stars

Top 67.6% on sourcepulse

Project Summary

Holodeck enables the generation of 3D embodied AI environments guided by natural language descriptions. It is designed for researchers and developers in embodied AI, offering a novel approach to creating complex, interactive virtual worlds for training and testing agents. The system leverages large language models to interpret prompts and construct detailed scenes.

How It Works

Holodeck uses a pipeline that translates natural language queries into structured scene descriptions. A language model (GPT-4o) parses the user's request and generates a scene graph or configuration. That configuration is then used to procedurally assemble a 3D environment in AI2-THOR, which provides the underlying simulation engine and rendering. The system aims for realistic, functional layouts and supports more than one layout solver; depth-first search (DFS) is the option suggested for better scene structure.
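
To make the DFS solver idea concrete, here is a minimal sketch of depth-first placement with backtracking on a toy grid. It is an illustration only, not Holodeck's solver: the grid representation, the footprint format, and the names fits, mark, and dfs_place are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Footprint = Tuple[int, int]  # (width, depth) in grid cells; hypothetical format
Placement = Tuple[int, int]  # (x, y) of the footprint's top-left cell
Grid = List[List[bool]]      # grid[y][x] is True when the cell is occupied


def fits(grid: Grid, x: int, y: int, w: int, d: int) -> bool:
    """Return True if a w-by-d footprint at (x, y) lies inside the room and is free."""
    if x + w > len(grid[0]) or y + d > len(grid):
        return False
    return all(not grid[y + dy][x + dx] for dy in range(d) for dx in range(w))


def mark(grid: Grid, x: int, y: int, w: int, d: int, value: bool) -> None:
    """Set every cell covered by the footprint to value (True = occupied)."""
    for dy in range(d):
        for dx in range(w):
            grid[y + dy][x + dx] = value


def dfs_place(
    objects: List[Tuple[str, Footprint]],
    grid: Grid,
    placed: Optional[Dict[str, Placement]] = None,
) -> Optional[Dict[str, Placement]]:
    """Place objects in order; undo a choice when a later object cannot fit.

    Object names are assumed to be unique. Returns None if no layout exists.
    """
    placed = {} if placed is None else placed
    if len(placed) == len(objects):
        return dict(placed)  # every object has a position
    name, (w, d) = objects[len(placed)]
    for y in range(len(grid)):
        for x in range(len(grid[0])):
            if fits(grid, x, y, w, d):
                mark(grid, x, y, w, d, True)
                placed[name] = (x, y)
                result = dfs_place(objects, grid, placed)
                if result is not None:
                    return result
                # Dead end: free the cells and try the next position.
                del placed[name]
                mark(grid, x, y, w, d, False)
    return None


if __name__ == "__main__":
    room = [[False] * 4 for _ in range(3)]  # 4 cells wide, 3 cells deep
    furniture = [("sofa", (3, 1)), ("table", (2, 2)), ("lamp", (1, 1))]
    print(dfs_place(furniture, room))
```

A real solver would also have to honor constraints produced by the language model (for example, which objects face or sit next to each other); this sketch checks only that footprints do not overlap.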

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create --name holodeck python=3.10), activate it (conda activate holodeck), and install dependencies (pip install -r requirements.txt, pip install --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+8524eadda94df0ab2dbb2ef5a577e4d37c712897).
  • Data Download: Run python -m objathor.dataset.download_holodeck_base_data --version 2023_09_23, python -m objathor.dataset.download_assets --version 2023_09_23, python -m objathor.dataset.download_annotations --version 2023_09_23, and python -m objathor.dataset.download_features --version 2023_09_23. The data is expected in ~/.objathor-assets; if you save it elsewhere, set the OBJAVERSE_ASSETS_DIR environment variable to that location.
  • Usage: python holodeck/main.py --query "a living room" --openai_api_key <OPENAI_API_KEY>
  • Prerequisites: Python 3.10, Conda, an OpenAI API key with GPT-4o access, Unity Editor (version 2020.3.25f1), and the pinned AI2-THOR build (commit 8524eadda94df0ab2dbb2ef5a577e4d37c712897).
  • Links: Project Page, Paper
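
The data download and usage steps above can be scripted. The sketch below only assembles the command lines and resolves the asset directory; it runs nothing by itself. The module names, the --version value, the flags, and the OBJAVERSE_ASSETS_DIR variable come from the steps above, while the helper names (download_commands, assets_dir, generation_command) are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

DATA_VERSION = "2023_09_23"

# Download modules listed in the Data Download step, in the same order.
DOWNLOAD_MODULES = [
    "objathor.dataset.download_holodeck_base_data",
    "objathor.dataset.download_assets",
    "objathor.dataset.download_annotations",
    "objathor.dataset.download_features",
]


def download_commands(version: str = DATA_VERSION) -> List[List[str]]:
    """Build one argument list per download step."""
    return [["python", "-m", module, "--version", version] for module in DOWNLOAD_MODULES]


def assets_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the data directory: OBJAVERSE_ASSETS_DIR if set, else ~/.objathor-assets."""
    env = os.environ if env is None else env
    override = env.get("OBJAVERSE_ASSETS_DIR")
    return Path(override) if override else Path.home() / ".objathor-assets"


def generation_command(query: str, api_key: str) -> List[str]:
    """Build the scene-generation command from the Usage step."""
    return ["python", "holodeck/main.py", "--query", query, "--openai_api_key", api_key]


if __name__ == "__main__":
    for cmd in download_commands():
        print(" ".join(cmd))
    print("assets directory:", assets_dir())
    print(" ".join(generation_command("a living room", "<OPENAI_API_KEY>")))
```

Each argument list can be handed to subprocess.run(cmd, check=True) from inside the activated holodeck conda environment. Keeping the commands as lists rather than one shell string means a query containing spaces needs no extra quoting.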

Highlighted Details

  • Language-guided generation of 3D embodied AI environments.
  • Leverages GPT-4o for natural language understanding and scene construction.
  • Built upon the AI2-THOR simulation platform.
  • Supports procedural generation of diverse indoor scenes.

Maintenance & Community

The project comes from the Allen Institute for AI (AI2). The README does not describe community channels or a roadmap.

Licensing & Compatibility

The repository appears to be released under a permissive license, but specific details are not provided in the README. Compatibility for commercial use or closed-source linking would require verification of the exact license terms.

Limitations & Caveats

The system depends on a specific Unity Editor version (2020.3.25f1) and a pinned AI2-THOR commit, so setup is version-sensitive. It cannot run without GPT-4o access through an OpenAI API key. Older checkouts of the repository may need an explicit flag (--use_milp False) for best results.
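
Because the setup is version-sensitive, a small preflight check can catch mismatches before a run. The following is a sketch under assumptions: it supposes the installed ai2thor package reports a version string containing the pinned commit hash from the install command, and the function names are hypothetical.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple

REQUIRED_PYTHON = (3, 10)
# Commit hash taken from the pinned ai2thor install command.
AI2THOR_COMMIT = "8524eadda94df0ab2dbb2ef5a577e4d37c712897"


def installed_ai2thor_version() -> Optional[str]:
    """Return the installed ai2thor version string, or None if it is not installed."""
    try:
        return metadata.version("ai2thor")
    except metadata.PackageNotFoundError:
        return None


def preflight(python_version: Tuple[int, int], ai2thor_version: Optional[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, expected 3.10"
        )
    if ai2thor_version is None:
        problems.append("ai2thor is not installed")
    elif AI2THOR_COMMIT not in ai2thor_version:
        # Assumption: the pinned build embeds the commit hash in its version string.
        problems.append(f"ai2thor {ai2thor_version} does not match the pinned commit")
    return problems


if __name__ == "__main__":
    for problem in preflight(sys.version_info[:2], installed_ai2thor_version()):
        print("warning:", problem)
```

The check does not cover the Unity Editor version or GPT-4o access, which have to be verified separately.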

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 2

Star History

45 stars in the last 90 days
