cube by Roblox

3D foundation model and accompanying technical report for Roblox asset generation

created 4 months ago
788 stars

Top 45.4% on sourcepulse

Project Summary

Cube is Roblox's generative AI system for 3D intelligence. It aims to help developers create 3D assets, scenes, and behaviors for virtual experiences. It targets creators, researchers, and businesses, and offers a text-to-shape generation model for 3D asset creation.

How It Works

Cube 3D uses a shape tokenizer and a text-to-shape generation model. The tokenizer encodes 3D shapes into discrete tokens, and a transformer-based model generates those tokens from text prompts. Representing geometry as a token sequence keeps it compact and lets the system create complex assets from natural language descriptions.
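
To make "encoding shapes into discrete tokens" concrete, here is a minimal toy sketch of nearest-neighbor quantization against a fixed codebook. It is not Cube's tokenizer: the codebook, the 2D latent vectors, and the names tokenize and detokenize are all hypothetical, chosen only to show how continuous values can become token ids and back.

```python
# Toy illustration only: NOT Cube's tokenizer. All names and values are hypothetical.
import math

# A tiny made-up codebook: token id (list index) -> 2D latent vector.
CODEBOOK = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0, 1.0),
]


def tokenize(latents):
    """Map each continuous latent vector to the id of its nearest codebook entry."""
    tokens = []
    for vec in latents:
        distances = [math.dist(vec, code) for code in CODEBOOK]
        tokens.append(distances.index(min(distances)))
    return tokens


def detokenize(tokens):
    """Map token ids back to codebook vectors (a lossy reconstruction)."""
    return [CODEBOOK[t] for t in tokens]


latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]
tokens = tokenize(latents)
reconstructed = detokenize(tokens)
print(tokens)         # discrete ids a sequence model could be trained to predict
print(reconstructed)  # approximate latents recovered from the ids
```

In a setup like the one described above, the text-to-shape model would predict a sequence of such ids from a prompt, and de-tokenization would turn the ids back into geometry.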

Quick Start & Requirements

  • Install: git clone https://github.com/Roblox/cube.git && cd cube && pip install -e .[meshlab]
  • Prerequisites: Python, PyTorch with CUDA support (e.g., pip install torch --index-url https://download.pytorch.org/whl/cu124 --force-reinstall), Blender (>= 4.3) for GIF rendering.
  • Models: Download from Hugging Face (huggingface-cli download Roblox/cube3d-v0.1 --local-dir ./model_weights).
  • Inference: python -m cube3d.generate --gpt-ckpt-path model_weights/shape_gpt.safetensors --shape-ckpt-path model_weights/shape_tokenizer.safetensors --prompt "..."
  • Hardware: A GPU with 24GB of VRAM is recommended when using --fast-inference, 16GB otherwise. Tested on NVIDIA H100/A100/3080 and Apple Silicon M2.
  • Demo: Interactive demo on Hugging Face
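
The inference command above can also be assembled from a script. The sketch below is one possible wrapper, assuming the weights were downloaded to ./model_weights as in the Models step. The helper name build_generate_command and the sample prompt are hypothetical; only the module name and flags come from the commands listed here.

```python
import shlex
from pathlib import Path


def build_generate_command(prompt, weights_dir="model_weights",
                           fast_inference=False, render_gif=False):
    """Assemble the documented cube3d.generate invocation as an argument list."""
    weights = Path(weights_dir)
    cmd = [
        "python", "-m", "cube3d.generate",
        "--gpt-ckpt-path", (weights / "shape_gpt.safetensors").as_posix(),
        "--shape-ckpt-path", (weights / "shape_tokenizer.safetensors").as_posix(),
        "--prompt", prompt,
    ]
    if fast_inference:
        cmd.append("--fast-inference")  # CUDA only, per the project notes
    if render_gif:
        cmd.append("--render-gif")      # requires Blender
    return cmd


# Hypothetical prompt, used only for illustration.
cmd = build_generate_command("A wooden treasure chest", render_gif=True)
print(shlex.join(cmd))

# To actually run it (needs the install and model weights from the steps above):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes.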

Highlighted Details

  • Text-to-shape generation for 3D asset creation.
  • Shape tokenization and de-tokenization capabilities.
  • Optional --fast-inference flag for faster generation (CUDA only; needs more VRAM than the default path).
  • --render-gif flag for turntable animations (requires Blender).
  • API available for programmatic use.

Maintenance & Community

  • The project is from Roblox's Foundation AI Team.
  • The README acknowledges TRELLIS, CraftsMan3D, threestudio, Hunyuan3D-2, minGPT, dinov2, OptVQ, and 1d-tokenizer.
  • Technical report available for citation.

Licensing & Compatibility

  • License details are not explicitly stated in the README.

Limitations & Caveats

  • The --fast-inference flag is not available on macOS.
  • GIF rendering requires Blender to be in the system's PATH.
  • Future features like bounding box conditioning and scene generation are listed as "Coming Soon."
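
The first two caveats can be checked before launching a run. The following is a minimal sketch, assuming the Blender executable is named blender; the helper resolve_flags is hypothetical and not part of the project. It does not verify that CUDA is present or that Blender is version 4.3 or newer.

```python
import platform
import shutil


def resolve_flags(want_fast, want_gif, system=None, find_executable=shutil.which):
    """Return (flags, warnings), dropping flags the current machine cannot honor."""
    system = system or platform.system()
    flags, warnings = [], []
    if want_fast:
        if system == "Darwin":  # platform.system() reports macOS as "Darwin"
            warnings.append("--fast-inference is not available on macOS; skipping it")
        else:
            flags.append("--fast-inference")
    if want_gif:
        # Assumption: the Blender binary is called "blender" on this system.
        if find_executable("blender") is None:
            warnings.append("Blender not found on PATH; skipping --render-gif")
        else:
            flags.append("--render-gif")
    return flags, warnings


flags, warnings = resolve_flags(want_fast=True, want_gif=True)
print(flags)
print(warnings)
```

The system and find_executable parameters exist only so the check can be exercised without depending on the machine it runs on.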

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 127 stars in the last 90 days
