Holodeck by allenai

3D environment generation research paper (CVPR 2024)

Created 1 year ago
473 stars

Top 64.6% on SourcePulse

Project Summary

Holodeck enables the generation of 3D embodied AI environments guided by natural language descriptions. It is designed for researchers and developers in embodied AI, offering a novel approach to creating complex, interactive virtual worlds for training and testing agents. The system leverages large language models to interpret prompts and construct detailed scenes.

How It Works

Holodeck runs a pipeline that translates a natural language query into a structured scene description. A language model (GPT-4o) parses the user's request and generates a scene graph or configuration. That configuration is then used to procedurally assemble a 3D environment within the AI2-THOR framework, which provides the underlying simulation engine and rendering. The system aims for realistic, functional layouts and offers a choice of layout solvers, such as depth-first search (DFS), for arranging the scene.
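To make the idea of a DFS layout solver concrete, here is a minimal toy sketch of depth-first placement with backtracking. It is not Holodeck's solver: the grid room, the rectangle footprints, and the names `overlaps` and `dfs_place` are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# (x, y, width, depth) on an integer grid; a hypothetical footprint format.
Rect = Tuple[int, int, int, int]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True when two axis-aligned footprints share any area."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    return ax < bx + bw and bx < ax + aw and ay < by + bd and by < ay + ad


def dfs_place(
    objects: List[Tuple[str, Tuple[int, int]]],
    room_w: int,
    room_d: int,
    placed: Optional[Dict[str, Rect]] = None,
) -> Optional[Dict[str, Rect]]:
    """Place uniquely named objects one at a time, backtracking on dead ends."""
    placed = {} if placed is None else placed
    if len(placed) == len(objects):
        return dict(placed)  # every object has a position
    name, (w, d) = objects[len(placed)]
    for x in range(room_w - w + 1):
        for y in range(room_d - d + 1):
            candidate = (x, y, w, d)
            if any(overlaps(candidate, r) for r in placed.values()):
                continue
            placed[name] = candidate
            result = dfs_place(objects, room_w, room_d, placed)
            if result is not None:
                return result
            del placed[name]  # undo and try the next position
    return None  # no valid position for this object


layout = dfs_place([("sofa", (3, 1)), ("table", (2, 2)), ("lamp", (1, 1))], 4, 3)
```

A real solver would also weigh constraints such as wall adjacency or distance between related objects; this sketch only enforces non-overlap inside the room bounds.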

Quick Start & Requirements

  • Installation:
      • Clone the repository.
      • Create a conda environment: conda create --name holodeck python=3.10
      • Activate it: conda activate holodeck
      • Install dependencies: pip install -r requirements.txt
      • Install the pinned AI2-THOR build: pip install --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+8524eadda94df0ab2dbb2ef5a577e4d37c712897
  • Data Download: Run the following commands in order:
      • python -m objathor.dataset.download_holodeck_base_data --version 2023_09_23
      • python -m objathor.dataset.download_assets --version 2023_09_23
      • python -m objathor.dataset.download_annotations --version 2023_09_23
      • python -m objathor.dataset.download_features --version 2023_09_23
    Set the OBJAVERSE_ASSETS_DIR environment variable if the data is not saved to ~/.objathor-assets.
  • Usage: python holodeck/main.py --query "a living room" --openai_api_key <OPENAI_API_KEY>
  • Prerequisites: Python 3.10, Conda, an OpenAI API key with GPT-4o access, Unity Editor (version 2020.3.25f1), and the AI2-THOR build pinned to commit 8524eadda94df0ab2dbb2ef5a577e4d37c712897.
  • Links: Project Page, Paper
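The data download step can be scripted. The sketch below only assembles the four download commands and resolves where the assets are expected to live; the helper names `resolve_assets_dir` and `download_commands` are hypothetical, and the fallback logic is an assumption based on the note about OBJAVERSE_ASSETS_DIR above, not code taken from the repository.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

DATA_VERSION = "2023_09_23"
DOWNLOAD_MODULES = [
    "objathor.dataset.download_holodeck_base_data",
    "objathor.dataset.download_assets",
    "objathor.dataset.download_annotations",
    "objathor.dataset.download_features",
]


def resolve_assets_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Use OBJAVERSE_ASSETS_DIR when set, else the default ~/.objathor-assets."""
    env = os.environ if env is None else env
    custom = env.get("OBJAVERSE_ASSETS_DIR")
    return Path(custom).expanduser() if custom else Path.home() / ".objathor-assets"


def download_commands(version: str = DATA_VERSION) -> List[List[str]]:
    """Build one argv list per download module, in the documented order."""
    return [["python", "-m", module, "--version", version] for module in DOWNLOAD_MODULES]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from inside the activated `holodeck` environment, so a failed download stops the sequence early.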

Highlighted Details

  • Language-guided generation of 3D embodied AI environments.
  • Leverages GPT-4o for natural language understanding and scene construction.
  • Built upon the AI2-THOR simulation platform.
  • Supports procedural generation of diverse indoor scenes.

Maintenance & Community

The project comes from the Allen Institute for AI (AI2). The README does not describe community channels or a roadmap.

Licensing & Compatibility

The repository appears to be released under a permissive license, but the README does not state the specific terms. Verify the exact license before commercial use or closed-source linking.

Limitations & Caveats

Setup is version-sensitive: the system requires a specific Unity Editor version and a particular commit of AI2-THOR. Access to GPT-4o is mandatory. Older versions of the repo might require specific flags (--use_milp False) for optimal performance.
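As a small illustration of the flag mentioned above, the following sketch builds the generation command and appends `--use_milp` only when a value is requested. The function name `holodeck_command` is hypothetical; the script path and flags are the ones quoted on this page, and whether a given checkout needs the flag is something to confirm against that version's README.

```python
import shlex
from typing import List, Optional


def holodeck_command(query: str, api_key: str, use_milp: Optional[bool] = None) -> List[str]:
    """Assemble the argv for holodeck/main.py; use_milp=None omits the flag."""
    cmd = ["python", "holodeck/main.py", "--query", query, "--openai_api_key", api_key]
    if use_milp is not None:
        cmd += ["--use_milp", str(use_milp)]  # rendered as "True" or "False"
    return cmd


# Printable form for copy-pasting into a shell; the key here is a placeholder.
printable = shlex.join(holodeck_command("a living room", "<OPENAI_API_KEY>", use_milp=False))
```

Building the command as a list keeps the multi-word query intact as a single argument when it is handed to `subprocess.run`.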

Health Check

  • Last Commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 14 stars in the last 30 days
