cube by Roblox

Research code for a 3D foundation model for Roblox asset generation

Created 10 months ago

877 stars

Top 41.1% on SourcePulse


Project Summary

Cube is Roblox's generative AI system for 3D intelligence. Its goal is to help developers create 3D assets, scenes, and behaviors for virtual experiences. The repository offers a text-to-shape generation model aimed at creators, researchers, and businesses.

How It Works

Cube 3D has two components: a shape tokenizer and a text-to-shape generation model. The tokenizer encodes 3D shapes into discrete tokens, and a transformer-based model generates those tokens from a text prompt. The generated tokens are then de-tokenized back into 3D geometry, so complex assets can be created from natural language descriptions.
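To make "encoding into discrete tokens" concrete, here is a minimal sketch of nearest-neighbor vector quantization, a generic technique used for illustration only. It is not Cube's tokenizer: the 2-D `CODEBOOK`, the `tokenize` and `detokenize` helpers, and the sample latents are all hypothetical.

```python
import math

# Hypothetical 2-D codebook: each entry plays the role of one discrete token.
# A real shape tokenizer would learn a much larger, higher-dimensional codebook.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def tokenize(latents):
    """Map each latent vector to the index of its nearest codebook entry."""
    return [
        min(range(len(CODEBOOK)), key=lambda i: math.dist(vec, CODEBOOK[i]))
        for vec in latents
    ]


def detokenize(tokens):
    """Map token indices back to their codebook vectors (a lossy reconstruction)."""
    return [CODEBOOK[t] for t in tokens]


# Made-up continuous latents standing in for an encoded shape.
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]
tokens = tokenize(latents)
recon = detokenize(tokens)
print(tokens)  # [0, 3, 2]
print(recon)   # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

In a text-to-shape setting, a transformer would predict a token sequence like `tokens` from the prompt, and the de-tokenizer would turn it back into geometry.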

Quick Start & Requirements

Install: git clone https://github.com/Roblox/cube.git && cd cube && pip install -e .[meshlab]
Prerequisites: Python; PyTorch, with CUDA support on NVIDIA GPUs (e.g., pip install torch --index-url https://download.pytorch.org/whl/cu124 --force-reinstall); Blender (>= 4.3), needed only for GIF rendering.
Models: Download from Hugging Face (huggingface-cli download Roblox/cube3d-v0.1 --local-dir ./model_weights).
Inference: python -m cube3d.generate --gpt-ckpt-path model_weights/shape_gpt.safetensors --shape-ckpt-path model_weights/shape_tokenizer.safetensors --prompt "..."
Hardware: A GPU with at least 24GB of VRAM is recommended when using --fast-inference, and 16GB otherwise. Tested on NVIDIA H100/A100/3080 and Apple Silicon M2.
Demo: An interactive demo is available on Hugging Face.
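The inference command above can also be assembled programmatically. The sketch below only builds the argument list from the flags named on this page; the helper name `build_generate_command`, its parameters, and the example prompt are hypothetical, and nothing is executed.

```python
import shlex


def build_generate_command(prompt, weights_dir="model_weights",
                           fast_inference=False, render_gif=False):
    """Build the argv list for `python -m cube3d.generate` (hypothetical helper)."""
    cmd = [
        "python", "-m", "cube3d.generate",
        "--gpt-ckpt-path", f"{weights_dir}/shape_gpt.safetensors",
        "--shape-ckpt-path", f"{weights_dir}/shape_tokenizer.safetensors",
        "--prompt", prompt,
    ]
    if fast_inference:
        cmd.append("--fast-inference")  # CUDA only, per the notes on this page
    if render_gif:
        cmd.append("--render-gif")      # requires Blender
    return cmd


# "a wooden chair" is a placeholder prompt chosen for illustration.
cmd = build_generate_command("a wooden chair", render_gif=True)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run it without any shell quoting issues around the prompt.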

Highlighted Details

Text-to-shape generation for 3D asset creation.
Shape tokenization and de-tokenization capabilities.
Optional --fast-inference flag for faster generation (CUDA only; needs more VRAM, see Hardware).
--render-gif flag for turntable animations (requires Blender).
API available for programmatic use.

Maintenance & Community

The project is from Roblox's Foundation AI Team.
The README acknowledges TRELLIS, CraftsMan3D, threestudio, Hunyuan3D-2, minGPT, dinov2, OptVQ, and 1d-tokenizer.
Technical report available for citation.

Licensing & Compatibility

License details are not explicitly stated in the README.

Limitations & Caveats

The --fast-inference flag is not available on macOS.
GIF rendering requires Blender to be in the system's PATH.
Future features like bounding box conditioning and scene generation are listed as "Coming Soon."
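The first two caveats can be checked before launching a run. The following is a rough preflight sketch using only the Python standard library; the function name `optional_flags` and its parameters are hypothetical, and it assumes the Blender executable is named `blender`. It does not verify that a CUDA GPU is actually present.

```python
import platform
import shutil
from typing import List, Optional


def optional_flags(system: Optional[str] = None,
                   blender_on_path: Optional[bool] = None) -> List[str]:
    """Return the optional CLI flags the caveats above allow on this machine."""
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    if blender_on_path is None:
        # Assumption: the Blender binary is called "blender".
        blender_on_path = shutil.which("blender") is not None

    flags = []
    if system != "Darwin":
        # Not macOS; the flag still needs a CUDA GPU, which is not checked here.
        flags.append("--fast-inference")
    if blender_on_path:
        flags.append("--render-gif")
    return flags


print(optional_flags())  # result depends on the machine running it
```

The two parameters exist only so the decision logic can be exercised without depending on the host, for example `optional_flags(system="Darwin", blender_on_path=True)`.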

Health Check

Last Commit: 5 months ago
Responsiveness: 1 week
Pull Requests (30d): 0
Issues (30d): 2

Star History

11 stars in the last 30 days

