SEINE by Vchitect

Video diffusion model for generative transition and prediction (ICLR 2024 paper)

created 1 year ago
943 stars

Top 39.7% on sourcepulse

Project Summary

SEINE is a diffusion model for generating and predicting video content, designed to create smooth transitions between short clips and to extend them into longer sequences. It targets researchers and developers working on video generation, offering an approach aimed at temporal consistency and content extension.

How It Works

SEINE employs a diffusion model architecture built on Stable Diffusion v1.4. Its core contribution is short-to-long video generation: generative transitions between clips and temporal prediction of new frames, which allow existing footage to be extended or different video segments to be bridged smoothly.
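
The mechanics can be pictured as conditioning the denoiser on a subset of known frames and generating the rest. The sketch below only illustrates that masked-conditioning idea and is not SEINE's actual code; the tensor shapes, the build_mask helper, and the way the mask is applied are assumptions.

```python
# Illustrative sketch of masked-frame conditioning for video diffusion.
# This is NOT SEINE's implementation; shapes and helper names are assumptions.
import torch


def build_mask(num_frames: int, mode: str) -> torch.Tensor:
    """Per-frame mask: 1 = frame given as conditioning, 0 = frame to be generated."""
    mask = torch.zeros(num_frames)
    if mode == "i2v":            # image-to-video: only the first frame is known
        mask[0] = 1.0
    elif mode == "transition":   # transition: first and last frames are known
        mask[0] = 1.0
        mask[-1] = 1.0
    return mask


def prepare_model_input(latents: torch.Tensor, mask: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Keep the conditioning frames and replace the remaining frames with noise."""
    # latents: (batch, frames, channels, height, width)
    m = mask.view(1, -1, 1, 1, 1)
    return m * latents + (1.0 - m) * noise


# Toy example: 16 latent frames, 4 channels, 64x64 spatial size (SD v1.4 latent shape).
latents = torch.randn(1, 16, 4, 64, 64)
noise = torch.randn_like(latents)
model_input = prepare_model_input(latents, build_mask(16, "transition"), noise)
```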

Quick Start & Requirements

  • Install: conda create -n seine python==3.9.16, conda activate seine, pip install -r requirement.txt
  • Prerequisites: Python 3.9.16, Stable Diffusion v1.4 model weights (downloaded to ./pretrained/stable-diffusion-v1-4), SEINE model checkpoint (downloaded to ./pretrained); a download sketch follows this list.
  • Usage:
    • I2V: python sample_scripts/with_mask_sample.py --config configs/sample_i2v.yaml
    • Transition: python sample_scripts/with_mask_sample.py --config configs/sample_transition.yaml
  • Links: LaVie (related project)
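
One possible way to fetch the Stable Diffusion v1.4 weights listed in the prerequisites is the huggingface_hub client, sketched below. The repo id and target path are assumptions based on the paths above, the weights may be license-gated (requiring a prior huggingface-cli login), and the SEINE checkpoint itself still has to be downloaded as described in the repository.

```python
# Sketch: download Stable Diffusion v1.4 weights to the path expected above.
# Assumes the huggingface_hub package; a gated repo may need `huggingface-cli login` first.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CompVis/stable-diffusion-v1-4",          # SD v1.4 weights on the Hugging Face Hub
    local_dir="./pretrained/stable-diffusion-v1-4",   # matches the path in the prerequisites
)
```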

Highlighted Details

  • Official implementation of the ICLR 2024 paper "SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction".
  • Part of the Vchitect video generation system.
  • Supports Image-to-Video (I2V) and video transition generation.


Licensing & Compatibility

  • Code licensed under Apache-2.0.
  • Model weights are open for academic research and free commercial usage.
  • For commercial licensing inquiries, contact vchitect@pjlab.org.cn.

Limitations & Caveats

The model is not trained to produce realistic representations of people or events, and using it to generate pornographic, violent, or otherwise harmful content is prohibited. The authors disclaim responsibility for such misuse; users are solely liable for their own actions.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
