Desktop software for video generation via next-frame prediction
Top 3.3% on sourcepulse
FramePack offers a novel approach to video generation by treating it as a next-frame prediction task, enabling efficient, progressive video creation. It compresses input contexts to a fixed length, making the generation workload independent of video duration and allowing large batch sizes akin to image diffusion training. This makes it feasible to generate longer videos on limited hardware, including laptop GPUs.
How It Works
FramePack employs a next-frame prediction neural network architecture that generates videos sequentially. Its core innovation lies in compressing input contexts into a constant-length representation. This design choice decouples generation complexity from video length, allowing for efficient processing of extended sequences and enabling training with larger batch sizes, similar to image diffusion models.
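To make the design concrete, here is a minimal, hypothetical sketch of fixed-length context packing. The function name and the use of average pooling are illustrative stand-ins (FramePack's actual compression is learned end-to-end), but the shape of the idea is the same: older frames are compressed at progressively higher rates, so the packed context never exceeds a fixed token budget.

```python
import torch
import torch.nn.functional as F

def pack_frame_context(frames, token_budget=1536, patch=2):
    """Hypothetical sketch of fixed-length context packing.

    Walks the frame history from newest to oldest, compressing older
    frames with progressively larger pooling kernels so the total
    number of context tokens stays bounded by `token_budget`.
    """
    packed, used = [], 0
    for age, frame in enumerate(reversed(frames)):  # frame: (C, H, W) latent
        c, h, w = frame.shape
        k = min(patch * (2 ** age), h, w)           # older frame -> larger patches
        # Average pooling stands in for a learned patchify/compression step.
        pooled = F.avg_pool2d(frame.unsqueeze(0), kernel_size=k, ceil_mode=True)
        tokens = pooled.flatten(2).transpose(1, 2).squeeze(0)  # (n_tokens, C)
        if used + tokens.shape[0] > token_budget:
            break  # budget exhausted: frames older than this are dropped
        packed.append(tokens)
        used += tokens.shape[0]
    return torch.cat(packed, dim=0)  # at most token_budget tokens, any video length

# The packed context stays bounded as the history grows:
history = [torch.randn(16, 64, 64) for _ in range(30)]
print(pack_frame_context(history).shape)  # torch.Size([1389, 16])
```

Because the token count, and therefore the per-step attention cost, is capped, adding more frames to the history does not make each prediction step more expensive.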
Quick Start & Requirements
On Windows, run update.bat, then run.bat; models download automatically (30GB+). On Linux, install dependencies with pip install -r requirements.txt and launch the GUI with python demo_gradio.py.
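For convenience, the same steps as a single listing (the commands are exactly those named above; the batch scripts apply to the Windows build):

```
# Windows: run the bundled scripts; models (30GB+) download automatically
update.bat
run.bat

# Linux: install dependencies, then launch the GUI
pip install -r requirements.txt
python demo_gradio.py
```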
Highlighted Details
Maintenance & Community
The project is associated with researchers from Stanford University. The team is on leave from April 21-30, which may delay PR merging.
Licensing & Compatibility
The repository does not explicitly state a license. The paper is available on arXiv.
Limitations & Caveats
The project is described as functional desktop software with a minimal standalone sampling system. Results can be sensitive to hardware and software configuration, and speed optimizations such as teacache may affect output quality. The project also warns against the many unofficial websites that claim to offer FramePack services.