Video-to-video translation framework for zero-shot text-guided video rendering
This project provides a zero-shot, text-guided video-to-video translation framework for researchers and artists. It addresses the challenge of maintaining temporal consistency in video generation by leveraging adapted diffusion models, enabling users to restyle videos based on text prompts without retraining.
How It Works
The framework consists of two main stages: key frame translation and full video translation. Key frames are generated using a diffusion model enhanced with hierarchical cross-frame constraints to ensure coherence in shape, texture, and color. Subsequent frames are then propagated from these key frames using temporal-aware patch matching and frame blending techniques. This approach allows for global style and local texture consistency with minimal computational cost.
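The control flow of the two stages can be illustrated with a short, self-contained Python sketch. The function names (translate_key_frame, propagate, rerender), the key-frame interval, and the cross-fade used for propagation are illustrative placeholders rather than the project's actual API; in the real pipeline the first stage runs the diffusion model under hierarchical cross-frame constraints and the second stage uses temporal-aware patch matching and blending.

```python
# Conceptual sketch of the two-stage pipeline; all names here are placeholders.
from typing import List, Optional
import numpy as np

def translate_key_frame(frame: np.ndarray, prompt: str,
                        prev_key: Optional[np.ndarray]) -> np.ndarray:
    # Placeholder: the real pipeline runs a diffusion model constrained by the
    # previously translated key frame to keep shape, texture, and color coherent.
    return frame

def propagate(key_a: np.ndarray, key_b: np.ndarray, n_between: int) -> List[np.ndarray]:
    # Placeholder: the real pipeline synthesizes in-between frames with
    # temporal-aware patch matching against both key frames and blends them;
    # here a simple cross-fade stands in to show the data flow.
    return [(1 - t) * key_a + t * key_b
            for t in np.linspace(0, 1, n_between + 2)[1:-1]]

def rerender(frames: List[np.ndarray], prompt: str, key_interval: int = 10) -> List[np.ndarray]:
    # Stage 1: translate every key_interval-th frame with the diffusion model.
    key_ids = list(range(0, len(frames), key_interval))
    keys: List[np.ndarray] = []
    prev: Optional[np.ndarray] = None
    for i in key_ids:
        prev = translate_key_frame(frames[i], prompt, prev)
        keys.append(prev)
    # Stage 2: fill the gaps between consecutive key frames by propagation.
    # (Frames after the last key frame are omitted in this sketch.)
    out: List[np.ndarray] = []
    for (ia, ka), (ib, kb) in zip(zip(key_ids, keys), zip(key_ids[1:], keys[1:])):
        out.append(ka)
        out.extend(propagate(ka, kb, ib - ia - 1))
    out.append(keys[-1])
    return out
```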
Quick Start & Requirements
Clone the repository with the --recursive flag so that its submodules are fetched, then install the dependencies with pip install -r requirements.txt or use the provided environment.yml. A sample translation can then be launched with python rerender.py --cfg config/real2sculpture.json.
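For batch runs, the command above can also be driven from a small Python script. Only the --cfg flag and the sample config path come from the quick start; any additional config names would be hypothetical.

```python
# Run rerender.py over a list of config files in sequence.
import subprocess

configs = ["config/real2sculpture.json"]  # extend with further configs as needed
for cfg in configs:
    subprocess.run(["python", "rerender.py", "--cfg", cfg], check=True)
```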
Highlighted Details
Maintenance & Community
The project was accepted to SIGGRAPH Asia 2023 and has been integrated into Hugging Face Diffusers. Recent updates include loose cross-frame attention and FreeU integration.
Licensing & Compatibility
The repository is released under the MIT License, permitting commercial use and linking with closed-source projects.
Limitations & Caveats
The main hardware requirement is 24 GB of VRAM, though memory-reduction techniques are suggested. Installation on Windows may require manual setup of CUDA, Git, and Visual Studio with the Windows SDK; pre-compiled binaries for ebsynth are provided as a fallback. Path names should contain only English letters or underscores to avoid FileNotFoundError.