SEINE by Vchitect

Video diffusion model for generative transition and prediction (ICLR 2024 paper)

created 1 year ago
943 stars

Top 39.7% on sourcepulse

Project Summary

SEINE is a diffusion model for generating and predicting video content, designed to create smooth transitions between short clips and to extend them into longer sequences. It targets researchers and developers working on video generation, offering an approach aimed at temporal consistency and content extension.

How It Works

SEINE employs a diffusion model architecture built on Stable Diffusion v1.4. Its core contribution is short-to-long video generation: generative transitions between clips and temporal prediction of new frames, which allow existing footage to be extended or different video segments to be bridged smoothly.
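
The mechanics can be pictured as conditioning the denoiser on a subset of known frames and generating the rest. The sketch below only illustrates that masked-conditioning idea and is not SEINE's actual code; the tensor shapes, the build_mask helper, and the way the mask is applied are assumptions.

```python
# Illustrative sketch of masked-frame conditioning for video diffusion.
# This is NOT SEINE's implementation; shapes and helper names are assumptions.
import torch


def build_mask(num_frames: int, mode: str) -> torch.Tensor:
    """Per-frame mask: 1 = frame given as conditioning, 0 = frame to be generated."""
    mask = torch.zeros(num_frames)
    if mode == "i2v":            # image-to-video: only the first frame is known
        mask[0] = 1.0
    elif mode == "transition":   # transition: first and last frames are known
        mask[0] = 1.0
        mask[-1] = 1.0
    return mask


def prepare_model_input(latents: torch.Tensor, mask: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Keep the conditioning frames and replace the remaining frames with noise."""
    # latents: (batch, frames, channels, height, width)
    m = mask.view(1, -1, 1, 1, 1)
    return m * latents + (1.0 - m) * noise


# Toy example: 16 latent frames, 4 channels, 64x64 spatial size (SD v1.4 latent shape).
latents = torch.randn(1, 16, 4, 64, 64)
noise = torch.randn_like(latents)
model_input = prepare_model_input(latents, build_mask(16, "transition"), noise)
```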

Quick Start & Requirements

  • Install: conda create -n seine python==3.9.16, conda activate seine, pip install -r requirement.txt
  • Prerequisites: Python 3.9.16, Stable Diffusion v1.4 model weights (downloaded to ./pretrained/stable-diffusion-v1-4), SEINE model checkpoint (downloaded to ./pretrained); a download sketch follows this list.
  • Usage:
    • I2V: python sample_scripts/with_mask_sample.py --config configs/sample_i2v.yaml
    • Transition: python sample_scripts/with_mask_sample.py --config configs/sample_transition.yaml
  • Links: LaVie (related project)
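
One possible way to fetch the Stable Diffusion v1.4 weights listed in the prerequisites is the huggingface_hub client, sketched below. The repo id and target path are assumptions based on the paths above, the weights may be license-gated (requiring a prior huggingface-cli login), and the SEINE checkpoint itself still has to be downloaded as described in the repository.

```python
# Sketch: download Stable Diffusion v1.4 weights to the path expected above.
# Assumes the huggingface_hub package; a gated repo may need `huggingface-cli login` first.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CompVis/stable-diffusion-v1-4",          # SD v1.4 weights on the Hugging Face Hub
    local_dir="./pretrained/stable-diffusion-v1-4",   # matches the path in the prerequisites
)
```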

Highlighted Details

  • Official implementation of the ICLR 2024 paper "SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction".
  • Part of the Vchitect video generation system.
  • Supports Image-to-Video (I2V) and video transition generation.


Licensing & Compatibility

  • Code licensed under Apache-2.0.
  • Model weights are open for academic research and free commercial usage.
  • For commercial licensing inquiries, contact vchitect@pjlab.org.cn.

Limitations & Caveats

The model is not trained to produce realistic representations of people or events, and using it to generate pornographic, violent, or otherwise harmful content is prohibited. The authors disclaim responsibility for such misuse; users are solely liable for their own actions.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
