Diffusion4D by VITA-Group

Research code for fast, consistent 4D generation via video diffusion models

Created 1 year ago
324 stars

Top 83.8% on SourcePulse

Project Summary

Diffusion4D enables fast, spatially and temporally consistent 4D generation using video diffusion models. It targets researchers and developers working on 3D content creation, animation, and generative AI, offering a novel approach to generating dynamic 3D assets from image, text, or static 3D inputs.

How It Works

Diffusion4D leverages video diffusion models to generate dynamic 4D content. Its core innovation is maintaining spatial-temporal consistency across generated frames, which is crucial for realistic 4D representations. This is achieved by training diffusion models on curated datasets of 3D objects rendered into video sequences that capture both object appearance and motion.

Quick Start & Requirements

  • Install: Clone the repository and navigate to the rendering directory.
  • Prerequisites: Python, Blender (v3.2.2 recommended), and the objaverse Python library.
  • Data: Requires downloading 3D assets from Objaverse-1.0 or Objaverse-XL. Rendering the Objaverse-XL dataset can take approximately 30 days with 8 GPUs.
  • Links: Project Page, arXiv, Hugging Face Dataset
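For compute planning, the stated figure (roughly 30 days on 8 GPUs for Objaverse-XL) can be extrapolated to other GPU counts. The sketch below is back-of-envelope arithmetic, not a tool provided by the project, and it assumes near-linear scaling with GPU count, which real rendering workloads only approximate:

```python
# Back-of-envelope render-time estimate, assuming near-linear GPU scaling.
# Baseline from the README: ~30 days of wall-clock time on 8 GPUs.

BASELINE_DAYS = 30
BASELINE_GPUS = 8

def estimated_render_days(num_gpus: int) -> float:
    """Estimate wall-clock days to render Objaverse-XL with num_gpus GPUs."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return BASELINE_DAYS * BASELINE_GPUS / num_gpus

print(estimated_render_days(8))   # baseline: 30.0 days
print(estimated_render_days(32))  # ~7.5 days with 4x the GPUs
```

In practice I/O, scheduling overhead, and per-asset variance make this a lower bound rather than a prediction.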

Highlighted Details

  • Offers image-to-4D, text-to-4D, and 3D-to-4D generation capabilities.
  • Provides scripts for rendering custom 4D datasets using Blender.
  • Released curated datasets and metadata for Objaverse-1.0 and Objaverse-XL.
  • Rendering script is based on point-e and Objaverse rendering scripts.

Maintenance & Community

The project is associated with the VITA-Group. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. The project acknowledges contributions from various open-source projects with their respective licenses.

Limitations & Caveats

The rendering process for large datasets is computationally intensive and time-consuming. The README advises against generating excessively long frame sequences due to motion limitations in some assets.
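The sequence-length caveat can be enforced mechanically when batching render jobs. The helper below is a hypothetical illustration only: the 24-frame cap and the function name are assumptions for the sketch, not values or APIs from the Diffusion4D README.

```python
# Hypothetical helper: clamp requested frame counts before submitting render
# jobs, so no asset is rendered past a motion-quality cap.
# The 24-frame default is an illustrative assumption, not a project value.

MAX_FRAMES = 24  # assumed cap; tune to the motion quality of your assets

def clamp_frame_count(requested_frames: int, max_frames: int = MAX_FRAMES) -> int:
    """Return a frame count no larger than the cap (and at least 1)."""
    return max(1, min(requested_frames, max_frames))

print(clamp_frame_count(100))  # -> 24
print(clamp_frame_count(16))   # -> 16
```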

Health Check
  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 6 more.

threestudio by threestudio-project

  • Top 0.1% on SourcePulse · 7k stars
  • Framework for 3D content generation from text/images using 2D diffusion
  • Created 2 years ago, updated 10 months ago
  • Starred by Yaowei Zheng (author of LLaMA-Factory), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 13 more.

stable-dreamfusion by ashawkey

  • Top 0.1% on SourcePulse · 9k stars
  • Text-to-3D model using NeRF and diffusion
  • Created 3 years ago, updated 1 year ago