Tencent-Hunyuan: Lightweight, high-quality video generation model
Top 15.6% on SourcePulse
Summary
HunyuanVideo-1.5 is a lightweight, high-performance video generation model that delivers state-of-the-art quality at a compact 8.3B parameters, small enough to run on consumer GPUs. It supports both text-to-video (T2V) and image-to-video (I2V) generation, lowering the barrier to entry for developers and creators.
How It Works
The model pairs an 8.3B-parameter Diffusion Transformer (DiT) with a 3D causal VAE. Its core innovation, Selective and Sliding Tile Attention (SSTA), prunes attention computation to accelerate inference. Training combines meticulous data curation, glyph-aware text encoding, and a multi-stage progressive strategy to improve motion coherence and visual quality.
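The summary does not spell out how SSTA prunes attention, but the name suggests each query tile attends to a sliding window of neighboring tiles plus a few selectively chosen tiles rather than the full sequence. The sketch below is a toy NumPy illustration of that general idea under those assumptions; the function name, the tile-scoring heuristic, and all parameters are hypothetical and not the actual SSTA algorithm.

```python
import numpy as np

def tile_pruned_attention(q, k, v, tile=4, window=1, keep=1):
    """Toy tile-sparse attention: each query tile attends only to a sliding
    window of neighboring tiles plus `keep` extra tiles selected by a cheap
    importance score. Illustrative only; not the real SSTA."""
    n, d = q.shape
    n_tiles = n // tile
    # Mean key per tile: a cheap proxy for ranking tile importance.
    k_tiles = k.reshape(n_tiles, tile, d).mean(axis=1)
    out = np.zeros_like(q)
    for t in range(n_tiles):
        # Sliding window: tiles adjacent to the current query tile.
        local = set(range(max(0, t - window), min(n_tiles, t + window + 1)))
        # Selective part: add the top-scoring tiles outside the window.
        scores = q[t * tile:(t + 1) * tile].mean(axis=0) @ k_tiles.T
        extra = [int(i) for i in np.argsort(-scores) if int(i) not in local]
        local.update(extra[:keep])
        idx = np.concatenate(
            [np.arange(s * tile, (s + 1) * tile) for s in sorted(local)])
        # Standard scaled dot-product attention over the kept tokens only.
        att = q[t * tile:(t + 1) * tile] @ k[idx].T / np.sqrt(d)
        w = np.exp(att - att.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        out[t * tile:(t + 1) * tile] = w @ v[idx]
    return out
```

With a large enough `window` every tile is kept and the result reduces to dense attention, which is a handy sanity check; shrinking the window trades accuracy for fewer attended tokens.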
Quick Start & Requirements
Install dependencies with `pip install -r requirements.txt` and `pip install tencentcloud-sdk-python`. Flash Attention, Flex-Block-Attention, and SageAttention are recommended for best performance. Run inference with `torchrun --nproc_per_node=<N> generate.py ...`. Pretrained models require a separate download. Repository: https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5
Maintenance & Community
Community contributions are encouraged (e.g., ComfyUI plugins). WeChat and Discord channels are available. Acknowledges open-source contributions from Transformers, Diffusers, HuggingFace, and Qwen-VL.
Licensing & Compatibility
The license type is not specified in the project's README.
Limitations & Caveats
Distillation and sparse-attention model variants are noted as "coming soon." Diffusers support is not yet implemented. The primary supported environment appears to be Linux.