LAMP by RQ-Wu

LAMP: Few-shot video generation research paper (CVPR 2024)

created 1 year ago
279 stars

Top 94.1% on sourcepulse

Project Summary

LAMP is a few-shot text-to-video generation framework designed for researchers and practitioners in computer vision and generative AI. It enables users to learn custom motion patterns from a small set of videos (8-16) and then generate new videos based on these learned motions, offering a more efficient approach to specialized video synthesis compared to training large models from scratch.

How It Works

LAMP leverages a motion pattern learning approach, building upon a pre-trained text-to-image diffusion model (specifically Stable Diffusion v1.4). It fine-tunes the model to capture the temporal dynamics and motion characteristics present in a small dataset of videos. This allows the model to generate novel video sequences that adhere to a specific learned motion, while also supporting video editing tasks by modifying existing video content based on new prompts.

Quick Start & Requirements

  • Installation: Clone the repository, create a Conda environment (conda create -n LAMP python=3.8) and activate it, install PyTorch 1.12.1 with CUDA 11.3 support and xformers, then install the remaining dependencies with pip install -r requirements.txt.
  • Prerequisites: Ubuntu 18.04+, CUDA 11.3, Python 3.8, PyTorch 1.12.1, and git-lfs for downloading weights. A GPU with at least 15 GB VRAM is required for training.
  • Resources: Pre-trained checkpoints and training data are available via Baidu Disk and Google Drive links.
  • Links: arXiv paper, project website, Colab notebook (note: the Colab link may not be current).
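The installation steps above can be collected into a single setup script. This is a hedged sketch, not the repository's official instructions: the repository URL (github.com/RQ-Wu/LAMP) and the PyTorch/xformers install commands are assumptions based on the stated version requirements (PyTorch 1.12.1, CUDA 11.3); the README may pin different package builds.

```shell
# Hypothetical setup sketch for LAMP -- repo URL and wheel index are assumed.
git clone https://github.com/RQ-Wu/LAMP.git
cd LAMP

# Create and activate the Conda environment named in the README.
conda create -n LAMP python=3.8 -y
conda activate LAMP

# Install PyTorch 1.12.1 built against CUDA 11.3, then xformers.
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
pip install xformers

# Remaining project dependencies.
pip install -r requirements.txt

# git-lfs is needed to pull pre-trained weights.
git lfs install
```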

Highlighted Details

  • Accepted to CVPR 2024.
  • Supports few-shot learning for custom motion patterns with 8-16 training videos.
  • Offers functionality for both text-to-video generation and video editing.
  • Built upon the Tune-A-Video framework.

Maintenance & Community

The repository is maintained by Ruiqi Wu. The project is based on the Tune-A-Video codebase. Further community interaction details (e.g., Discord/Slack) are not explicitly mentioned in the README.

Licensing & Compatibility

Licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). This license strictly prohibits commercial use without formal permission.

Limitations & Caveats

The primary limitation is the non-commercial use restriction imposed by the CC BY-NC 4.0 license. Commercial applications would require explicit permission from the authors. The specific CUDA version requirement (11.3) might also pose a compatibility challenge for users with different CUDA setups.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days

