cube by Roblox

3D foundation model research paper for Roblox asset generation

Created 6 months ago
823 stars

Top 43.1% on SourcePulse

Project Summary

Cube is a generative AI system from Roblox focused on 3D intelligence, aiming to assist developers in creating 3D assets, scenes, and behaviors for virtual experiences. It targets creators, researchers, and businesses, offering a text-to-shape generation model to enhance 3D asset creation.

How It Works

Cube 3D combines a shape tokenizer with a text-to-shape generation model. The tokenizer encodes 3D shapes into discrete tokens, and a transformer-based model generates those tokens from a text prompt; de-tokenizing the result yields the 3D geometry. Representing shapes as token sequences keeps the geometry compact and lets the system produce complex assets from natural language descriptions.
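To make the idea of "discrete shape tokens" concrete, here is a minimal toy sketch of nearest-neighbour vector quantization. It is an illustration of the general concept only, not Cube's tokenizer: the summary above does not describe the tokenizer's internals, and the four-entry codebook, the `encode`/`decode` names, and the sample points are all hypothetical.

```python
# Toy illustration only: a nearest-neighbour codebook, NOT Cube's actual tokenizer.
from math import dist

# Hypothetical 4-entry codebook of 3D points; a real model would learn its codebook.
CODEBOOK = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]


def encode(points):
    """Map each 3D point to the index (token id) of its nearest codebook entry."""
    return [
        min(range(len(CODEBOOK)), key=lambda i: dist(p, CODEBOOK[i]))
        for p in points
    ]


def decode(tokens):
    """Map token ids back to codebook entries (a lossy reconstruction)."""
    return [CODEBOOK[t] for t in tokens]


points = [(0.9, 0.1, 0.0), (0.1, 0.8, 0.1), (0.0, 0.1, 0.2)]
tokens = encode(points)
print(tokens)          # token ids, one per input point
print(decode(tokens))  # quantized points recovered from the ids
```

Once a shape is a sequence of integer ids like this, a transformer can be trained to predict the sequence from text in the same way a language model predicts words.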

Quick Start & Requirements

  • Install: git clone https://github.com/Roblox/cube.git && cd cube && pip install -e .[meshlab]
  • Prerequisites: Python, PyTorch with CUDA support (e.g., pip install torch --index-url https://download.pytorch.org/whl/cu124 --force-reinstall), Blender (>= 4.3) for GIF rendering.
  • Models: Download from Hugging Face (huggingface-cli download Roblox/cube3d-v0.1 --local-dir ./model_weights).
  • Inference: python -m cube3d.generate --gpt-ckpt-path model_weights/shape_gpt.safetensors --shape-ckpt-path model_weights/shape_tokenizer.safetensors --prompt "..."
  • Hardware: A GPU with 24GB of VRAM is recommended when using --fast-inference; 16GB is enough without it. Tested on NVIDIA H100/A100/3080 and Apple Silicon M2.
  • Docs: Hugging Face Interactive Demo
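The inference command above can also be assembled from a script. The following is a sketch under stated assumptions: `build_generate_command` is a hypothetical helper (not part of cube3d), the checkpoint file names are those shown in the command above, `--fast-inference` and `--render-gif` are assumed to be bare on/off switches, and the example prompt is made up.

```python
import shlex
import sys


def build_generate_command(prompt, weights_dir="model_weights",
                           fast_inference=False, render_gif=False):
    """Build the argv list for `python -m cube3d.generate` (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "cube3d.generate",
        "--gpt-ckpt-path", f"{weights_dir}/shape_gpt.safetensors",
        "--shape-ckpt-path", f"{weights_dir}/shape_tokenizer.safetensors",
        "--prompt", prompt,
    ]
    # Assumption: both options are bare switches that take no value.
    if fast_inference:
        cmd.append("--fast-inference")
    if render_gif:
        cmd.append("--render-gif")
    return cmd


# Hypothetical prompt, for illustration only.
cmd = build_generate_command("a wooden chair", render_gif=True)
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes.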

Highlighted Details

  • Text-to-shape generation for 3D asset creation.
  • Shape tokenization and de-tokenization capabilities.
  • Optional --fast-inference flag for faster generation (CUDA only); note that it raises the recommended VRAM from 16GB to 24GB.
  • --render-gif flag for turntable animations (requires Blender).
  • API available for programmatic use.

Maintenance & Community

  • The project is from Roblox's Foundation AI Team.
  • The README acknowledges these open-source projects: TRELLIS, CraftsMan3D, threestudio, Hunyuan3D-2, minGPT, dinov2, OptVQ, and 1d-tokenizer.
  • Technical report available for citation.

Licensing & Compatibility

  • License details are not explicitly stated in the README.

Limitations & Caveats

  • The --fast-inference flag is not available on macOS.
  • GIF rendering requires Blender to be in the system's PATH.
  • Future features like bounding box conditioning and scene generation are listed as "Coming Soon."
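The first two caveats can be checked before launching a run. This is a minimal sketch, assuming the Blender executable is named `blender` on PATH; `check_options` is a hypothetical helper, and its `system` and `which` parameters exist only so the check can be exercised without the real platform or a Blender install.

```python
import platform
import shutil


def check_options(fast_inference, render_gif, system=None, which=shutil.which):
    """Return a list of problems; an empty list means the options look usable."""
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    problems = []
    if fast_inference and system == "Darwin":
        problems.append("--fast-inference is not available on macOS")
    # Assumption: the Blender executable is called "blender".
    if render_gif and which("blender") is None:
        problems.append("--render-gif requires Blender on PATH")
    return problems


for problem in check_options(fast_inference=True, render_gif=True):
    print("warning:", problem)
```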

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 3

Star History

  • 25 stars in the last 30 days
