vid2vid-zero by baaivision

Video editing research paper using image diffusion

Created 2 years ago
355 stars

Top 78.5% on SourcePulse

Project Summary

vid2vid-zero enables zero-shot video editing by leveraging pre-trained image diffusion models without requiring video-specific training. It targets researchers and practitioners in computer vision and generative AI who need to modify video content based on textual descriptions. The primary benefit is the ability to edit attributes, subjects, and scenes in real-world videos with high fidelity and temporal consistency.

How It Works

The method employs three core modules: null-text inversion for aligning the text prompt with the input video, cross-frame modeling for maintaining temporal consistency across frames, and spatial regularization for preserving fidelity to the original video. Rather than training on video data, it repurposes the attention layers of the pre-trained image diffusion model for bidirectional temporal modeling at inference time, so no explicit video training is required.
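The cross-frame modeling idea can be illustrated with a minimal sketch (the function name and tensor layout below are assumptions for illustration, not the repo's actual API): each frame's attention queries attend to keys and values gathered from every frame, giving bidirectional temporal interaction without any video-specific training.

```python
import torch

def cross_frame_attention(q, k, v):
    """Sketch of cross-frame attention inside a diffusion U-Net.

    q, k, v: per-frame projections of shape (frames, tokens, dim).
    Instead of each frame attending only to itself (plain self-attention),
    keys/values are pooled across all frames, so every frame attends to
    every other frame, in both temporal directions, at inference time.
    """
    f, t, d = k.shape
    # Flatten keys/values over the frame axis and share them with all frames.
    k_all = k.reshape(1, f * t, d).expand(f, -1, -1)  # (f, f*t, d)
    v_all = v.reshape(1, f * t, d).expand(f, -1, -1)  # (f, f*t, d)
    attn = torch.softmax(q @ k_all.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v_all  # (f, t, d), same shape as plain self-attention output
```

The only change relative to standard self-attention is the pooling of keys and values; queries stay per-frame, which is why the trick drops into a pre-trained image model unchanged.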

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python and PyTorch; xformers is strongly recommended for performance. Requires pre-trained Stable Diffusion weights (v1-4 by default).
  • Run: accelerate launch test_vid2vid_zero.py --config path/to/config
  • Demo: local Gradio demo via python app.py, or hosted online at Hugging Face Spaces.

Highlighted Details

  • Zero-shot video editing using off-the-shelf image diffusion models.
  • No video-specific training required.
  • Achieves promising results in editing attributes, subjects, and places.
  • Employs null-text inversion, cross-frame modeling, and spatial regularization.
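Of the modules listed above, null-text inversion is the least self-explanatory: it optimizes the unconditional ("null") text embedding so that classifier-free-guided denoising retraces the DDIM-inversion trajectory of the source frames. A minimal sketch follows; the `denoise` callable and its signature are stand-ins for the real U-Net plus scheduler step, not the repo's code.

```python
import torch

def null_text_inversion(denoise, latents_traj, text_emb, null_emb,
                        steps=10, lr=1e-2, guidance=7.5):
    """Sketch of null-text inversion.

    denoise(z, emb): stand-in for one U-Net prediction + scheduler step.
    latents_traj: latents recorded during DDIM inversion of the source
    video, ordered from most-noised to clean. At each timestep we tune
    null_emb so the guided step lands on the recorded latent.
    """
    null_emb = null_emb.clone().requires_grad_(True)
    opt = torch.optim.Adam([null_emb], lr=lr)
    z = latents_traj[0]
    for i in range(len(latents_traj) - 1):
        target = latents_traj[i + 1]
        for _ in range(steps):
            uncond, cond = denoise(z, null_emb), denoise(z, text_emb)
            z_next = uncond + guidance * (cond - uncond)  # classifier-free guidance
            loss = torch.nn.functional.mse_loss(z_next, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():  # advance to the next timestep with the tuned embedding
            uncond, cond = denoise(z, null_emb), denoise(z, text_emb)
            z = uncond + guidance * (cond - uncond)
    return null_emb.detach()
```

Because only the null embedding is optimized, the pre-trained diffusion weights stay frozen, which is what keeps the whole pipeline zero-shot.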

Maintenance & Community

The project is maintained by the BAAI Vision Team in collaboration with Zhejiang University (ZJU). Contact information for hiring and collaboration is provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not list explicit limitations or known bugs. Code was released in April 2023, and the repository has seen no activity since, so issues may go unaddressed.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Alex Yu (Research Scientist at OpenAI; former cofounder of Luma AI), Jiaming Song (Chief Scientist at Luma AI), and 1 more.

SkyReels-V2 by SkyworkAI

Film generation model for infinite-length videos using diffusion forcing

  • Top 3.3% on SourcePulse
  • 4k stars
  • Created 5 months ago; updated 1 month ago