Diffusion4D by VITA-Group

Research code for fast, consistent 4D generation via video diffusion models

Created 1 year ago
324 stars

Top 83.8% on SourcePulse

Project Summary

Diffusion4D enables fast, spatially and temporally consistent 4D generation using video diffusion models. It targets researchers and developers working on 3D content creation, animation, and generative AI, offering a novel approach to generating dynamic 3D assets from image, text, or static 3D inputs.

How It Works

Diffusion4D leverages video diffusion models to generate dynamic 4D content. Its core innovation is maintaining spatial-temporal consistency across generated frames, which is crucial for realistic 4D representations. This is achieved by training diffusion models on curated datasets of 3D objects rendered into video sequences that capture both object appearance and motion.

Quick Start & Requirements

  • Install: Clone the repository and navigate to the rendering directory.
  • Prerequisites: Python, Blender (v3.2.2 recommended), and the objaverse Python library.
  • Data: Requires downloading 3D assets from Objaverse-1.0 or Objaverse-XL. Rendering the Objaverse-XL dataset can take approximately 30 days with 8 GPUs.
  • Links: Project Page, arXiv, Hugging Face Dataset
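For compute planning, the stated figure (roughly 30 days on 8 GPUs for Objaverse-XL) can be extrapolated to other GPU counts. The sketch below is back-of-envelope arithmetic, not a tool provided by the project, and it assumes near-linear scaling with GPU count, which real rendering workloads only approximate:

```python
# Back-of-envelope render-time estimate, assuming near-linear GPU scaling.
# Baseline from the README: ~30 days of wall-clock time on 8 GPUs.

BASELINE_DAYS = 30
BASELINE_GPUS = 8

def estimated_render_days(num_gpus: int) -> float:
    """Estimate wall-clock days to render Objaverse-XL with num_gpus GPUs."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return BASELINE_DAYS * BASELINE_GPUS / num_gpus

print(estimated_render_days(8))   # baseline: 30.0 days
print(estimated_render_days(32))  # ~7.5 days with 4x the GPUs
```

In practice I/O, scheduling overhead, and per-asset variance make this a lower bound rather than a prediction.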

Highlighted Details

  • Offers image-to-4D, text-to-4D, and 3D-to-4D generation capabilities.
  • Provides scripts for rendering custom 4D datasets using Blender.
  • Released curated datasets and metadata for Objaverse-1.0 and Objaverse-XL.
  • Rendering script is based on point-e and Objaverse rendering scripts.

Maintenance & Community

The project is associated with the VITA-Group. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. The project acknowledges contributions from various open-source projects with their respective licenses.

Limitations & Caveats

The rendering process for large datasets is computationally intensive and time-consuming. The README advises against generating excessively long frame sequences due to motion limitations in some assets.
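The sequence-length caveat can be enforced mechanically when batching render jobs. The helper below is a hypothetical illustration only: the 24-frame cap and the function name are assumptions for the sketch, not values or APIs from the Diffusion4D README.

```python
# Hypothetical helper: clamp requested frame counts before submitting render
# jobs, so no asset is rendered past a motion-quality cap.
# The 24-frame default is an illustrative assumption, not a project value.

MAX_FRAMES = 24  # assumed cap; tune to the motion quality of your assets

def clamp_frame_count(requested_frames: int, max_frames: int = MAX_FRAMES) -> int:
    """Return a frame count no larger than the cap (and at least 1)."""
    return max(1, min(requested_frames, max_frames))

print(clamp_frame_count(100))  # -> 24
print(clamp_frame_count(16))   # -> 16
```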

Health Check
  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 6 more.

threestudio by threestudio-project

  • Top 0.1% on SourcePulse · 7k stars
  • Framework for 3D content generation from text/images using 2D diffusion
  • Created 2 years ago, updated 10 months ago
  • Starred by Yaowei Zheng (author of LLaMA-Factory), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 13 more.

stable-dreamfusion by ashawkey

  • Top 0.1% on SourcePulse · 9k stars
  • Text-to-3D model using NeRF and diffusion
  • Created 3 years ago, updated 1 year ago