thu-ml: Real-time, high-quality video generation framework
Causal Forcing addresses high-quality, real-time interactive video generation using an autoregressive diffusion distillation approach. It targets researchers and engineers working on video synthesis. The project reports significant improvements in visual quality and motion dynamics over prior methods such as Self Forcing, and it supports streaming generation on consumer hardware.
How It Works
The core innovation is "Causal Forcing," an autoregressive diffusion distillation technique that enhances visual fidelity and motion coherence. It offers both frame-wise and chunk-wise model variants, catering to different needs for expressiveness versus stability. A recent development introduces "causal consistency distillation" as a more data-efficient alternative to ODE distillation, simplifying the training pipeline by removing the need for ODE-paired data.
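The difference between the frame-wise and chunk-wise variants can be pictured as a difference in how many frames each autoregressive step emits. The following is a minimal sketch under stated assumptions, not the project's implementation: `chunk_schedule`, `rollout`, and `toy_generator` are hypothetical names, and the toy generator only numbers frames where a real model would denoise the next frames conditioned on the earlier ones.

```python
from typing import Callable, Iterator, List, Tuple


def chunk_schedule(num_frames: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split frame indices into consecutive [start, end) chunks."""
    if num_frames < 0 or chunk_size < 1:
        raise ValueError("num_frames must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size, num_frames))
        for start in range(0, num_frames, chunk_size)
    ]


def rollout(
    num_frames: int,
    chunk_size: int,
    generate_chunk: Callable[[List[int], int], List[int]],
) -> Iterator[List[int]]:
    """Yield chunks one at a time; each chunk sees only earlier frames."""
    history: List[int] = []
    for start, end in chunk_schedule(num_frames, chunk_size):
        # Pass a copy so the generator cannot mutate the running history.
        chunk = generate_chunk(list(history), end - start)
        if len(chunk) != end - start:
            raise ValueError("generator returned the wrong number of frames")
        history.extend(chunk)
        yield chunk  # a streaming consumer could display this immediately


def toy_generator(history: List[int], n: int) -> List[int]:
    """Hypothetical stand-in for a model: labels each new frame by index."""
    return [len(history) + i for i in range(n)]


# chunk_size=1 corresponds to frame-wise generation in this toy picture.
frame_wise = list(rollout(5, 1, toy_generator))
chunk_wise = list(rollout(5, 2, toy_generator))
print(frame_wise)  # [[0], [1], [2], [3], [4]]
print(chunk_wise)  # [[0, 1], [2, 3], [4]]
```

In this picture, a smaller chunk size means more autoregressive steps per clip, each conditioned on a longer history. The sketch says nothing about which variant is more expressive or more stable.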
Quick Start & Requirements
Installation involves creating a Python 3.10 Conda environment, installing the dependencies in requirements.txt, and installing additional packages such as CLIP and flash-attn. Inference requires downloading pre-trained checkpoints for the frame-wise or chunk-wise model via the Hugging Face CLI. Training proceeds in multiple stages, including autoregressive diffusion, ODE initialization (or the newer Causal CD), and DMD; it often requires a distributed training setup (torchrun) and substantial datasets (~300GB for the ODE data). Real-time inference is demonstrated on an RTX 4090.
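The training stages named above can be written down as a small plan, which makes the two initialization routes explicit. This is a rough sketch only: the stage labels (`ar_diffusion`, `ode_init`, `causal_cd`, `dmd`) and both function names are hypothetical and are not the repository's script or config names. The order simply follows the order in which the stages are listed here.

```python
from typing import List


def training_plan(use_causal_cd: bool) -> List[str]:
    """Return hypothetical stage labels in the order the summary lists them."""
    # The middle stage is either ODE initialization or the newer Causal CD.
    init_stage = "causal_cd" if use_causal_cd else "ode_init"
    return ["ar_diffusion", init_stage, "dmd"]


def needs_ode_paired_data(plan: List[str]) -> bool:
    """Only the ODE-initialization route needs ODE-paired data."""
    return "ode_init" in plan


ode_route = training_plan(use_causal_cd=False)
cd_route = training_plan(use_causal_cd=True)
print(ode_route, needs_ode_paired_data(ode_route))
print(cd_route, needs_ode_paired_data(cd_route))
```

Choosing the Causal CD route is what removes the dependence on the ODE-paired dataset (the roughly 300GB figure quoted above).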
Maintenance & Community
The project is associated with Tsinghua University and UT Austin, and the README lists its key contributors. A Chinese-language blog post and Q&A are available. No explicit community channels (Slack/Discord) or roadmap links are provided.
Licensing & Compatibility
The repository's license is not specified in the README, making its terms for commercial use or closed-source integration indeterminate.
Limitations & Caveats
The recently introduced "Causal CD" is described as an early-stage preview whose implementation may be suboptimal. Training requires significant computational resources and large datasets. The absence of a clear license is a primary adoption blocker for commercial applications.