FlashPortrait by Francis-Rings
Faster infinite portrait animation
Top 71.3% on SourcePulse
Summary
FlashPortrait generates high-fidelity, identity-preserving portrait animations of unbounded length. It targets researchers and engineers in video synthesis and AI animation, and reports up to a 6x inference speedup over existing methods without compromising visual quality or identity consistency.
How It Works
This project utilizes an end-to-end video diffusion transformer architecture. It begins by extracting identity-agnostic facial expression features, which are then aligned with diffusion latents via a novel Normalized Facial Expression Block to enhance identity stability. For long video synthesis, a dynamic sliding-window scheme with weighted blending ensures smooth transitions. Crucially, FlashPortrait employs Adaptive Latent Prediction, leveraging higher-order latent derivatives to skip denoising steps, thereby achieving substantial inference acceleration.
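The two ideas in this section that lend themselves to a small illustration are weighted blending across overlapping windows and skipping steps by extrapolating from latent derivatives. The Python sketch below is not FlashPortrait's code: it uses plain floats in place of latent tensors, assumes triangular blending weights, and stands in for Adaptive Latent Prediction with a generic Newton backward-difference extrapolation. All function names are hypothetical.

```python
from typing import List, Sequence


def window_starts(total: int, window: int, overlap: int) -> List[int]:
    """Start indices of overlapping windows that together cover `total` frames."""
    if not 0 <= overlap < window <= total:
        raise ValueError("need 0 <= overlap < window <= total")
    stride = window - overlap
    starts = list(range(0, total - window + 1, stride))
    if starts[-1] + window < total:  # add a final window so the tail is covered
        starts.append(total - window)
    return starts


def blend_windows(windows: Sequence[Sequence[float]],
                  starts: Sequence[int], total: int) -> List[float]:
    """Weighted average of overlapping windows (assumed triangular weights)."""
    acc = [0.0] * total
    weight_sum = [0.0] * total
    for start, frames in zip(starts, windows):
        n = len(frames)
        for offset, value in enumerate(frames):
            w = min(offset + 1, n - offset)  # largest mid-window, smallest at edges
            acc[start + offset] += w * value
            weight_sum[start + offset] += w
    return [a / s for a, s in zip(acc, weight_sum)]


def extrapolate_next(history: Sequence[float], order: int) -> float:
    """Predict the next sample from the last `order + 1` uniformly spaced samples.

    Sums backward differences of order 0..`order` (Newton backward formula,
    one step ahead), which is exact for polynomials of degree <= `order`.
    """
    if order < 0 or len(history) < order + 1:
        raise ValueError("need at least order + 1 samples")
    diffs = list(history[-(order + 1):])
    prediction = 0.0
    for _ in range(order + 1):
        prediction += diffs[-1]  # current-order difference at the latest step
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return prediction


if __name__ == "__main__":
    starts = window_starts(total=6, window=4, overlap=2)
    print(starts)
    print(blend_windows([[0.0] * 4, [1.0] * 4], starts, total=6))
    print(extrapolate_next([1.0, 4.0, 9.0], order=2))
```

In the blending demo, the two windows disagree on the overlapping frames, and the weights ramp the output from the first window's values to the second's instead of switching abruptly. In the extrapolation demo, a predicted value replaces one that would otherwise require a full model evaluation, which is the general mechanism by which a denoising step can be skipped.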
Quick Start & Requirements
Installation requires PyTorch (v2.6.0 with CUDA 12.4 recommended) and the dependencies listed in requirements.txt. Installing flash_attn is optional and provides additional acceleration. Model weights must be downloaded manually from Hugging Face. Run inference with python infer.py or python fast_infer.py. Links to the project page, code, and technical report have been available since the December 15, 2025 release.
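Because flash_attn is optional, it can be useful to check whether it is present before launching inference. The snippet below is a minimal sketch using only the Python standard library. The helper names and the "default" label are hypothetical and are not part of the project's scripts.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def attention_backend() -> str:
    """Report which attention path is available ("default" is a placeholder label)."""
    return "flash_attn" if has_module("flash_attn") else "default"


if __name__ == "__main__":
    print("torch installed:", has_module("torch"))
    print("attention backend:", attention_backend())
```

This only detects whether the packages are installed. It does not verify that the installed PyTorch build matches the recommended version or CUDA release.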
Maintenance & Community
The project saw significant releases in December 2025, including code, checkpoints, and a ComfyUI implementation. Development is ongoing, with a "To-Do List" indicating planned features like multi-GPU inference. Key contributors are affiliated with major research institutions and tech companies. No direct community channels (like Discord/Slack) are listed.
Licensing & Compatibility
The README does not specify a software license. This absence requires clarification for any adoption decision, particularly concerning commercial use or derivative works.
Limitations & Caveats
Training FlashPortrait demands substantial VRAM (40-50GB), while inference VRAM can be reduced from ~60GB to ~10GB using CPU offloading techniques. The 3D VAE decoder can be memory-intensive for very long videos, though CPU decoding is an option. Training requires meticulously organized datasets with specific mask types and static backgrounds. Multi-GPU inference support is still under development.