Video synthesis codebase for state-of-the-art generative models
VGen is a comprehensive open-source codebase for video generation, offering implementations of state-of-the-art diffusion models for various synthesis tasks. It caters to researchers and developers in AI video generation, providing tools for training, inference, and customization with a focus on high-quality output and controllability.
How It Works
VGen leverages cascaded diffusion models and hierarchical spatio-temporal decoupling to achieve high-quality video synthesis. It supports text-to-video, image-to-video, and controllable generation driven by motion conditions and subject customization. The ecosystem is designed to be extensible and includes components for managing experiments and integrating a variety of diffusion model architectures.
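In practice, generation is driven by configuration files passed to script entry points. The sketch below assumes that config-driven layout; the script name, config file, and any checkpoint paths are illustrative assumptions rather than verified commands:

# Hypothetical invocation: image-to-video inference from a YAML config
python inference.py --cfg configs/i2vgen_xl_infer.yaml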
Quick Start & Requirements
conda create -n vgen python=3.8
conda activate vgen
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt

System dependencies: ffmpeg, libsm6, libxext6
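On Debian/Ubuntu systems these system-level packages can usually be installed with apt (a sketch assuming root privileges; package names may vary on other distributions):

sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6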
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats