4DNeX by 3DTopia

Feed-forward 4D generative modeling from single images

Created 1 month ago
536 stars

Top 59.4% on SourcePulse

Project Summary

4DNeX offers a feed-forward framework for single-image 4D generative modeling, producing dynamic 3D scene representations. It bypasses computationally intensive optimization and multi-frame input requirements of prior methods, delivering an efficient, end-to-end image-to-4D solution. Targeted at generative modeling and computer vision practitioners, it enables high-quality dynamic point cloud generation and novel-view video synthesis with robust generalizability.

How It Works

The framework adapts a pretrained video diffusion model (Wan2.1 I2V 14B) by fine-tuning rather than training from scratch. A unified 6D video representation, combining three RGB color channels with three XYZ coordinate channels, lets the model learn appearance and geometry jointly. Training is supported by the large-scale 4DNeX-10M dataset, curated to address the scarcity of 4D data. Because the single-image-to-4D pipeline is feed-forward, it offers a scalable and efficient alternative to optimization-heavy methods.
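To make the 6D representation concrete, the sketch below packs an RGB video and a per-pixel XYZ map into one array and reads a colored point cloud back out of a single frame. It is an illustration only: the channel-wise concatenation, the array layout `(T, H, W, C)`, and the helper names `pack_rgb_xyz` and `frame_to_point_cloud` are assumptions made here, and the project may fuse the two modalities differently.

```python
import numpy as np


def pack_rgb_xyz(rgb: np.ndarray, xyz: np.ndarray) -> np.ndarray:
    """Join RGB and XYZ videos into one (T, H, W, 6) array.

    Assumption: channel-wise concatenation. This is one possible way to
    form a 6D video, not necessarily the one 4DNeX uses.
    """
    if rgb.shape != xyz.shape or rgb.shape[-1] != 3:
        raise ValueError("rgb and xyz must both have shape (T, H, W, 3)")
    return np.concatenate([rgb, xyz], axis=-1)


def frame_to_point_cloud(video6d: np.ndarray, t: int):
    """Return (points, colors) for frame t, each of shape (H*W, 3)."""
    frame = video6d[t]
    colors = frame[..., :3].reshape(-1, 3)  # first three channels: RGB
    points = frame[..., 3:].reshape(-1, 3)  # last three channels: XYZ
    return points, colors


# Synthetic stand-in data: 4 frames of 8x8 pixels.
rng = np.random.default_rng(0)
rgb = rng.random((4, 8, 8, 3), dtype=np.float32)
xyz = rng.standard_normal((4, 8, 8, 3)).astype(np.float32)

video6d = pack_rgb_xyz(rgb, xyz)
points, colors = frame_to_point_cloud(video6d, t=0)
```

Stacking the per-frame point clouds over `t` gives a dynamic point cloud, which is the kind of output the summary describes.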

Quick Start & Requirements

Environment setup requires Conda, Python 3.10, PyTorch with CUDA 12.1, git-lfs, and rerun-sdk. Users must download the pretrained Wan2.1 I2V 14B and 4DNex-Lora weights from Hugging Face into the directory structure the README specifies. Inference runs through a Python script that takes example image and prompt files as input. Training additionally requires the 4DNeX-10M dataset, which is also hosted on Hugging Face.
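Before downloading the weights, it can help to confirm that the listed tools are present. The following is a minimal sketch, not part of the project: the function name `check_prerequisites` is hypothetical, and it only tests whether each tool or package can be found, not whether PyTorch was built against CUDA 12.1 or whether the weights are in place.

```python
import importlib.util
import shutil
import sys


def check_prerequisites() -> dict:
    """Report which of the listed requirements are detectable locally."""
    return {
        # The summary lists Python 3.10 specifically.
        "python_3_10": sys.version_info[:2] == (3, 10),
        # Command-line tools are looked up on PATH.
        "conda": shutil.which("conda") is not None,
        "git_lfs": shutil.which("git-lfs") is not None,
        # Python packages are detected without importing them.
        "torch": importlib.util.find_spec("torch") is not None,
        # The rerun-sdk package is imported under the module name "rerun".
        "rerun_sdk": importlib.util.find_spec("rerun") is not None,
    }


report = check_prerequisites()
for name, found in report.items():
    print(f"{name}: {'ok' if found else 'missing'}")
```

A `missing` entry only means the tool was not found in the current environment; activate the intended Conda environment before running the check.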

Highlighted Details

  • First feed-forward framework for single-image 4D scene generation.
  • Generates high-quality dynamic point clouds.
  • Enables novel-view video synthesis.
  • Demonstrates strong generalizability.
  • Introduces the 4DNeX-10M dataset.

Maintenance & Community

The provided README lacks specific details on project maintainers, community channels, or a public roadmap.

Licensing & Compatibility

The README does not specify the software license or provide compatibility notes for commercial use or closed-source linking.

Limitations & Caveats

According to the project's TODO list, the data preprocessing and visualization scripts have not yet been released, and the data preparation script for training is also absent.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 3

Star History

443 stars in the last 30 days
