Feed-forward 4D generative modeling from single images
Summary
4DNeX offers a feed-forward framework for single-image 4D generative modeling, producing dynamic 3D scene representations. It bypasses the computationally intensive optimization and multi-frame inputs required by prior methods, delivering an efficient, end-to-end image-to-4D solution. Targeted at generative modeling and computer vision practitioners, it enables high-quality dynamic point cloud generation and novel-view video synthesis with robust generalizability.
How It Works
The framework fine-tunes a pretrained video diffusion model (Wan2.1 I2V 14B) using lightweight adaptation (the LoRA weights noted under Quick Start). A unified 6D video representation jointly models RGB and XYZ sequences, so appearance and geometry are learned in one structured space. Training is supported by the large-scale 4DNeX-10M dataset, curated to address the scarcity of 4D data. The resulting feed-forward, single-image-to-4D pipeline offers a scalable, efficient alternative to optimization-heavy methods.
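Conceptually, the 6D representation amounts to channel-wise concatenation of an RGB video with a per-pixel XYZ (pointmap) video. The sketch below is only an illustration of that idea, not the repository's actual code; the function name, tensor shapes, and min-max normalization are all assumptions.

```python
import torch

def pack_6d_video(rgb: torch.Tensor, xyz: torch.Tensor) -> torch.Tensor:
    """Concatenate RGB and XYZ sequences into a unified 6D video.

    rgb: (T, 3, H, W) frames, values in [0, 1]
    xyz: (T, 3, H, W) per-pixel world coordinates (a pointmap video)
    returns: (T, 6, H, W) tensor jointly encoding appearance and geometry
    """
    assert rgb.shape == xyz.shape, "RGB and XYZ sequences must be aligned"
    # Rescale XYZ into roughly the same numeric range as RGB so the
    # diffusion backbone sees comparable statistics on both halves.
    xyz_min = xyz.amin(dim=(0, 2, 3), keepdim=True)
    xyz_max = xyz.amax(dim=(0, 2, 3), keepdim=True)
    xyz_norm = (xyz - xyz_min) / (xyz_max - xyz_min + 1e-8)
    return torch.cat([rgb, xyz_norm], dim=1)

# Example: 16 frames at 480x832 -> a (16, 6, 480, 832) unified video.
video_6d = pack_6d_video(torch.rand(16, 3, 480, 832),
                         torch.rand(16, 3, 480, 832))
```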
Quick Start & Requirements
Environment setup requires Conda, Python 3.10, PyTorch with CUDA 12.1, git-lfs, and rerun-sdk. Users must download the pretrained Wan2.1 I2V 14B and 4DNeX LoRA weights from Hugging Face into a specified directory structure. Inference is executed via a Python script that references example image and prompt files. For training, the 4DNeX-10M dataset from Hugging Face is a prerequisite.
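The weight downloads can be scripted with huggingface_hub. This is a minimal sketch only: the repo IDs and target directories below are assumptions, so check the project README for the authoritative names and layout before running.

```python
from huggingface_hub import snapshot_download

# Hypothetical repo IDs and directory layout -- verify against the
# 4DNeX README before use.
snapshot_download(
    repo_id="Wan-AI/Wan2.1-I2V-14B-720P",   # pretrained backbone (assumed ID)
    local_dir="checkpoints/Wan2.1-I2V-14B",
)
snapshot_download(
    repo_id="3DTopia/4DNeX",                # 4DNeX LoRA weights (assumed ID)
    local_dir="checkpoints/4DNeX-LoRA",
)
```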
Highlighted Details
Maintenance & Community
The provided README lacks specific details on project maintainers, community channels, or a public roadmap.
Licensing & Compatibility
The README does not specify the software license or provide compatibility notes for commercial use or closed-source linking.
Limitations & Caveats
The project's TODO list indicates that data preprocessing and visualization scripts have not yet been released; a data preparation script for training is likewise absent.