madebyollin/taehv: Faster, lighter video latent processing
TAEHV is a Tiny AutoEncoder designed for efficient latent space manipulation in video generation models like Hunyuan Video. It targets researchers and developers needing faster, memory-light video encoding/decoding for applications like real-time previews or interactive video, offering significant performance gains at a slight quality cost.
How It Works
This project implements a compact AutoEncoder architecture optimized for speed and low memory footprint. By processing video latents through a smaller model, TAEHV achieves decoding speeds orders of magnitude faster and requires substantially less VRAM than traditional, full-scale video VAEs, making it suitable for resource-constrained environments or interactive workflows.
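To see why operating on latents is so much cheaper than operating on pixels, consider the compression a causal 3D video VAE applies. The factors below (4x temporal, 8x spatial, 16 latent channels) are assumptions typical of Hunyuan-style VAEs, used here only for illustration, not figures from this project:

```python
# Rough size comparison between pixel space and latent space for one video.
# Assumed compression factors for a Hunyuan-style causal 3D VAE (illustrative):
T_DOWN, S_DOWN, LATENT_CH = 4, 8, 16

frames, height, width, rgb_ch = 125, 720, 1280, 3

# A causal video VAE keeps the first frame, then compresses groups of T_DOWN frames.
latent_frames = (frames - 1) // T_DOWN + 1
latent_h, latent_w = height // S_DOWN, width // S_DOWN

pixel_values = frames * height * width * rgb_ch
latent_values = latent_frames * latent_h * latent_w * LATENT_CH

print(pixel_values // latent_values)  # → 46
```

Even with a ~46x smaller latent tensor, the decoder is the step that maps latents back to full-resolution pixels, so a tiny decoder is where most of the wall-clock and VRAM savings come from.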
Quick Start & Requirements
Pretrained weights (.pth, .safetensors) are required for compatibility with base video models (e.g., Hunyuan Video 1.5, Wan 2.1/2.2, Qwen Image, CogVideoX, Hunyuan Video 1, Open-Sora 1.3). Example notebooks are available for usage and integration.
Highlighted Details
Maintenance & Community
The project benefits from contributions enabling integrations into popular UIs and tools like ComfyUI, stable-diffusion.cpp, and SDNext. Specific contributors are credited for these integrations. No direct community channels (e.g., Discord, Slack) or a public roadmap are detailed in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the provided README text.
Limitations & Caveats
TAEHV prioritizes speed and efficiency, resulting in slightly lower video quality than full-size VAEs. Models like Mochi 1 and SVD are not directly supported and require separate repositories. Integration with libraries like Diffusers necessitates careful handling of dimension order (NTCHW vs NCTHW) and value ranges ([0, 1] vs [-1, 1]).
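The dimension-order and value-range mismatch can be bridged with a small adapter. Below is a minimal NumPy sketch, assuming Diffusers-side tensors are NCTHW in [-1, 1] and TAEHV-side tensors are NTCHW in [0, 1]; in a real pipeline these would be torch tensors, but the axis and scaling logic is the same:

```python
import numpy as np

def diffusers_to_taehv(video):
    """NCTHW in [-1, 1] -> NTCHW in [0, 1] (assumed conventions)."""
    video = np.moveaxis(video, 1, 2)            # swap channel and time axes
    return np.clip(video * 0.5 + 0.5, 0.0, 1.0)

def taehv_to_diffusers(video):
    """NTCHW in [0, 1] -> NCTHW in [-1, 1] (assumed conventions)."""
    video = np.moveaxis(video, 1, 2)            # swap time and channel axes
    return video * 2.0 - 1.0

# Round-trip check on a random video batch: N=1, C=3, T=8, H=W=32.
x = np.random.uniform(-1.0, 1.0, size=(1, 3, 8, 32, 32)).astype(np.float32)
y = taehv_to_diffusers(diffusers_to_taehv(x))
print(np.abs(x - y).max())  # tiny; clipping only bites at exact endpoints
```

Getting either the axis order or the scaling wrong produces decodes that look scrambled or washed out rather than raising an error, so a round-trip check like the one above is a cheap sanity test.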