SD-CN-Animation by volotat

Video stylization tool using StableDiffusion and ControlNet

created 2 years ago
821 stars

Top 44.1% on sourcepulse

Project Summary

This project provides an extension for the Automatic1111 Stable Diffusion web UI to automate video stylization and generation. It targets users looking to create stylized videos from existing footage (vid2vid) or generate entirely new videos from text prompts, offering control over resolution and length. The key benefit is enhanced stability and quality in video generation through optical flow estimation.

How It Works

In vid2vid mode, the extension uses RAFT optical flow estimation to keep the animation stable and to generate occlusion masks for frame-to-frame consistency. For text-to-video generation it relies on a custom "FloweR" method (still a work in progress) that predicts optical flow directly. ControlNet integration is important, especially in vid2vid mode, to avoid choppy results; the text-to-video mode can additionally use a video as ControlNet guidance for stronger stylization.
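To illustrate the occlusion-mask idea: pixels whose motion cannot be confirmed by a forward-backward consistency check are treated as occluded and excluded from frame-to-frame blending. This is a minimal NumPy sketch of that check, not the project's actual RAFT-based implementation; the function name and threshold are illustrative.

```python
import numpy as np

def occlusion_mask(fwd_flow, bwd_flow, thresh=1.0):
    """Forward-backward consistency check: a pixel is flagged as occluded
    when warping it forward and then backward does not return it near
    its origin. Flows are (H, W, 2) arrays of (dx, dy) displacements."""
    h, w = fwd_flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Destination of each pixel under the forward flow (nearest neighbour).
    xd = np.clip(np.round(xs + fwd_flow[..., 0]).astype(int), 0, w - 1)
    yd = np.clip(np.round(ys + fwd_flow[..., 1]).astype(int), 0, h - 1)
    # Backward flow sampled at each forward destination.
    bwd_at_dst = bwd_flow[yd, xd]
    # Round-trip displacement: ~0 wherever the motion is consistent.
    err = np.linalg.norm(fwd_flow + bwd_at_dst, axis=-1)
    return err > thresh  # True marks occluded / unreliable pixels

# Toy example: uniform motion of +2 px in x, consistent in both directions,
# so no pixel is flagged as occluded.
fwd = np.zeros((4, 4, 2)); fwd[..., 0] = 2.0
bwd = np.zeros((4, 4, 2)); bwd[..., 0] = -2.0
mask = occlusion_mask(fwd, bwd)
```

In the real extension the mask plays the same role: masked regions are regenerated rather than warped from the previous frame, which is what keeps the animation stable.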

Quick Start & Requirements

  • Install via Automatic1111 web UI: Extensions tab -> Install from URL, enter https://github.com/volotat/SD-CN-Animation.git.
  • Requires Automatic1111 web UI.
  • Not compatible with Macs.
  • Ensure 'Apply color correction to img2img results to match original colors.' is disabled in Stable Diffusion settings.
  • Update the web UI if you encounter the error 'Need to enable queue to use generators.'.
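The Extensions tab is the route the project documents; as a sketch, Automatic1111 extensions can also typically be installed by cloning into the web UI's `extensions` folder (the `stable-diffusion-webui` path below is an assumption — adjust it to your install):

```shell
# Assumes the web UI checkout lives at ./stable-diffusion-webui
cd stable-diffusion-webui/extensions
git clone https://github.com/volotat/SD-CN-Animation.git
# Restart the web UI so the extension is picked up
```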

Highlighted Details

  • Supports custom Stable Diffusion models.
  • Vid2vid mode allows fine-grained control via 'Extra params'.
  • Text-to-video mode automatically sets seed to -1 after the first frame to prevent blurring.
  • ControlNet can be used in text-to-video mode with video guidance for stronger stylization.

Maintenance & Community

  • Recent updates (v0.9) address multiple issues, improve vid2vid controls, and enhance ControlNet integration.
  • Primarily developed by volotat.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

  • Not compatible with Macs.
  • The 'Apply color correction to img2img results to match original colors.' setting must be disabled, and older web UI versions may raise 'Need to enable queue to use generators.' errors.
  • The "FloweR" method for text-to-video is still a work in progress.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 90 days

Explore Similar Projects

Tune-A-Video by showlab (4k stars)
Text-to-video generation via diffusion model fine-tuning
Created 2 years ago, updated 1 year ago
Starred by Chenlin Meng (Cofounder of Pika), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 1 more.