Accelerating diffusion models with predictive feature caching
TaylorSeer accelerates Diffusion Transformer (DiT) models for image and video synthesis by predicting future timestep features using Taylor series expansion, enabling significant speedups without retraining. It targets researchers and developers working with high-fidelity generative models who need to reduce inference latency for real-time applications.
How It Works
TaylorSeer leverages the observation that diffusion model features evolve slowly and continuously across timesteps. It approximates higher-order derivatives of these features to predict future states via Taylor series expansion. This forecasting approach aims to overcome the quality degradation seen in traditional feature caching methods when timestep intervals are large, offering substantial acceleration with minimal impact on generation quality.
Quick Start & Requirements
git clone https://github.com/Shenyi-Z/TaylorSeer.git
Specific implementations for FLUX, HunyuanVideo, DiT, Wan2.1, and HiDream are available in subdirectories.
Maintenance & Community
The project is associated with ICCV 2025 and ICLR 2025 submissions. It acknowledges the upstream model implementations it builds on (DiT, FLUX, HiDream, etc.) and has community contributions such as ComfyUI-TaylorSeer. Contact email: shenyizou@outlook.com.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is presented as research code, with separate implementations for each supported model. While it claims "lossless" or "near-lossless" acceleration, the exact quality metrics and potential trade-offs at higher acceleration ratios may require further investigation. The primary focus is on DiT architectures and related models.